
Is there a way to analyze a text file against these criteria?

当年话下 2023-07-11 10:32:46
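I need to create a program that analyzes a passage of text in a file and then counts:

- how many words there are
- the average length of the words
- how many times each word occurs
- how many words start with each letter of the alphabet

So far I have managed the first two points (shown below):

fileName = open(input('Please enter the full name of the file: '), 'r')
w = [len(word) for line in fileName for word in line.rstrip().split(" ")]
total_w = len(w)
avg_w = sum(w) / total_w
print('The total number of words in this file is:', total_w)
print('The average length of the words in this file is:', avg_w)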

1 Answer

幕布斯6054654


collections.Counter makes this relatively simple. I used re.findall(r'[\w]+', data) to find the words (a "word" here is any run of letters, underscores, and digits). Adjust as needed.
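To make "adjust as needed" a bit more concrete, here is a small sketch of how the choice of tokenizer changes what counts as a word. The sample sentence and the letters-only pattern are my own assumptions for illustration; they are not part of the original question.

```python
import re

# Made-up sample text, only for illustration.
sample = "Hello, world! It's 2023 and snake_case counts too."

# [\w]+ matches runs of letters, digits, and underscores, so punctuation
# is dropped and "It's" is split at the apostrophe.
regex_words = re.findall(r'[\w]+', sample)

# str.split() splits on whitespace only, so punctuation stays attached.
split_words = sample.split()

# A letters-only pattern (an assumed alternative) also skips digits
# and breaks words at underscores.
letter_words = re.findall(r'[A-Za-z]+', sample)

print(regex_words)
print(split_words)
print(letter_words)
```

Which of these is right depends on what the assignment means by "word"; none of them handles apostrophes or hyphens specially.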

import re
from collections import Counter

fn = input('Please enter the full name of the file: ')
with open(fn, 'r') as f:
    words = Counter(re.findall(r'[\w]+', f.read()))
    # use words = Counter(f.read().split()) if everything is split by spaces
    # adjust the regular expression depending on whether you want or don't want
    # stuff like numbers to be counted as "words"

print('Total number of words:', sum(words.values()))
# this is weighted by word occurrence, not sure whether this is correct
print('Average length of words:',
      sum(len(w) * o for w, o in words.items()) / sum(words.values()))
print('Word occurrence:', words)
# this only shows letters that actually occur. If you need all letters of
# the alphabet, you have to add the rest
print('Start letter occurrence', Counter(w[0] for w in words.elements()))
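If you do need every letter of the alphabet, one possible way to "add the rest" is sketched below. The helper name `start_letter_counts` and the sample list are hypothetical. Folding to lowercase and ignoring words that start with a digit or underscore are assumptions on my part; the question does not say how those cases should be handled.

```python
import string
from collections import Counter


def start_letter_counts(words):
    """Count words by first letter for all of a-z (hypothetical helper).

    Assumption: case is ignored, so 'Apple' and 'avocado' both count
    under 'a'. Words starting with anything else are left out.
    """
    counts = Counter(w[0].lower() for w in words if w)
    # A Counter returns 0 for missing keys, so letters that never
    # occur still appear in the result with a count of 0.
    return {letter: counts[letter] for letter in string.ascii_lowercase}


# Made-up sample data, only for illustration.
sample_words = ['Apple', 'avocado', 'banana', 'Cherry', '42', '_private']
table = start_letter_counts(sample_words)
print(table['a'], table['b'], table['c'], table['z'])  # 2 1 1 0
```

With the Counter from the answer, this could be called as start_letter_counts(words.elements()) so that repeated words are counted once per occurrence.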

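On the "weighted by word occurrence" caveat: the two readings of "average word length" can be compared on a tiny made-up example. The asker's own code averages over every word in the file, repeats included, so the per-occurrence version is the one that matches it. The per-distinct-word version is shown only for contrast.

```python
from collections import Counter

# Made-up sample: 'the' occurs three times, 'elephant' once.
words = Counter(['the', 'the', 'the', 'elephant'])

# Average over every occurrence: (3*3 + 8*1) / 4
per_occurrence = sum(len(w) * n for w, n in words.items()) / sum(words.values())

# Average over distinct words only: (3 + 8) / 2
per_distinct = sum(len(w) for w in words) / len(words)

print(per_occurrence)  # 4.25
print(per_distinct)    # 5.5
```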

2023-07-11