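This post covers a few applications of Python: dates, files, and word-frequency (word cloud) statistics. All of them lean on object-oriented thinking and on dictionaries.
Exercise 1
Requirements: 1. Initialize two dates, start_day and end_day: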
from datetime import datetime
start_day=datetime(2019,4,1)
end_day=datetime(2019,4,30)
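All other date values must be generated programmatically with methods of the datetime or date classes.
2. The calendar module must not be used to generate the output.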
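As a small illustration of the first requirement, a value such as the last day of a month can be derived with datetime and timedelta alone, without the calendar module. This is only a sketch; the helper name last_day_of_month is hypothetical and not part of the assignment.

```python
from datetime import datetime, timedelta


def last_day_of_month(day):
    """Return the last day of the month containing `day` (hypothetical helper)."""
    # Day 28 exists in every month, and adding 4 days always lands in the next month.
    next_month = day.replace(day=28) + timedelta(days=4)
    # Stepping back by that date's day-of-month gives the last day of the original month.
    return next_month - timedelta(days=next_month.day)


start_day = datetime(2019, 4, 1)
end_day = last_day_of_month(start_day)
print(end_day.strftime("%Y/%m/%d"))
```

Deriving end_day this way means only start_day has to be edited to print a different month.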
The code is as follows:
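from datetime import datetime

start_day = datetime(2019, 4, 1)
end_day = datetime(2019, 4, 30)
total_days = (end_day - start_day).days + 1  # inclusive number of days

print(start_day.strftime('\t\t\t%Y/%m'))  # year and month header
print("Sun\tMon\tTue\tWed\tThu\tFri\tSat")

# weekday() returns Monday=0 ... Sunday=6, but the header starts on Sunday,
# so shift by one to get the number of blank cells before the first day.
leading_blanks = (start_day.weekday() + 1) % 7
count = 0  # cells printed so far, used to decide when to wrap

# Blank cells before the first day
for _ in range(leading_blanks):
    print("\t", end="")
    count += 1

that_day = 1  # the first day is the 1st
while that_day <= total_days:  # print every day
    print(that_day, end="\t")
    that_day += 1
    count += 1
    if count % 7 == 0:  # start a new line after every seven cells
        print()
print()  # finish the last row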
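The output is as follows:
This exercise is mostly about planning the layout: loops control what is printed and where, so that the output takes the desired form.
Changing the start and end dates in the code changes the output accordingly.
The exercise is not hard; what matters is keeping the logic clear and thinking through what each loop does.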
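The layout logic can also be separated from the printing. The following is a minimal sketch, not the assignment's required solution: the function name calendar_rows is hypothetical, and it assumes the same Sunday-first header as the listing above.

```python
from datetime import datetime, timedelta


def calendar_rows(start_day, end_day):
    """Group the dates from start_day to end_day (inclusive) into Sunday-first weeks."""
    # weekday() is Monday=0 ... Sunday=6; shift so that Sunday becomes column 0.
    first_column = (start_day.weekday() + 1) % 7
    cells = [""] * first_column  # blank cells before the first date
    current = start_day
    while current <= end_day:
        cells.append(str(current.day))
        current += timedelta(days=1)
    # Slice the flat list of cells into rows of seven.
    return [cells[i:i + 7] for i in range(0, len(cells), 7)]


rows = calendar_rows(datetime(2019, 4, 1), datetime(2019, 4, 30))
for row in rows:
    print("\t".join(row))
```

Because the rows are built as plain lists, the wrapping rule can be checked without reading tab-separated console output.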
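Exercise 2
Requirements: 1. Following the word-frequency program for "Romance of the Three Kingdoms" (三国演义), count how often each character appears in "Dream of the Red Chamber" (红楼梦).
2. Display the character frequency results as a word cloud.
The code is as follows: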
import jieba

# Frequent words that are not character names
excludes = {"什么","一个","我们","那里","你们","如今","说道","知道","起来","这里","出来","他们","众人","自己",
            "奶奶","一面","只见","怎么","姑娘","两个","没有","不是","不知","这个","听见","这样","进来","这是",
            "告诉","就是","咱们","东西","回来","只是","大家","老爷","只得","丫头","这些","不敢","出去","所以",
            "不过","的话","不好","姐姐"}
with open("红楼梦.txt", "r", encoding="utf8") as f:  # read the whole novel
    txt = f.read()

words = jieba.lcut(txt)  # segment the text into a list of words

counts = {}  # word -> frequency

for word in words:
    if len(word) == 1:
        continue
    elif word in ("宝玉", "宝玉道", "宝二爷", "混世魔王", "怡红公子", "绛洞花主",
                  "无事忙", "遮天大王", "富贵闲人", "贾宝玉"):
        rword = "贾宝玉"
    elif word in ("黛玉", "黛玉道", "林黛玉"):
        rword = "林黛玉"
    elif word in ("宝钗", "宝钗道", "薛宝钗"):
        rword = "薛宝钗"
    elif word in ("姨太太", "薛姨妈"):
        rword = "薛姨妈"
    elif word in ("老祖宗", "老太太", "史太君", "贾母"):
        rword = "贾母"
    elif word in ("太太", "二太太"):
        rword = "王夫人"
    elif word in ("熙凤", "熙凤道", "凤姐", "凤姐儿", "王熙凤"):
        rword = "王熙凤"
    elif word in ("平儿", "小平"):  # 袭人 is a different character, so it is no longer merged here
        rword = "平儿"
    elif word in ("探春", "探春道"):
        rword = "贾探春"
    elif word in ("晴雯", "勇晴雯", "芙蓉仙子", "病西施"):
        rword = "晴雯"
    else:
        rword = word
    counts[rword] = counts.get(rword, 0) + 1  # add the word to the dictionary

# Remove the excluded words; pop() with a default avoids a KeyError
# when an excluded word never occurred in the text
for word in excludes:
    counts.pop(word, None)

# Convert the dictionary to a list of (word, count) pairs
items = list(counts.items())

# The lambda is an anonymous function that picks the count as the sort key
items.sort(key=lambda x: x[1], reverse=True)

for word, count in items[:10]:  # the ten most frequent entries
    # {0:<10} left-aligns the name in width 10; {1:>7} right-aligns the count in width 7
    print("{0:<10}{1:>7}".format(word, count))
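The output is as follows:
This exercise is mostly about understanding how the code works and then applying word-frequency counting.
Starting from the original program, I modified it by adding conditions that restrict which words are counted, so that the results contain the information actually wanted.
After each run, the unwanted words that show up in the output are added to excludes so that they are filtered out next time.
Additional if/elif branches merge the different names of the same character, which makes the data more sensible and closer to expectations.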
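The same merging and filtering can be expressed with a dictionary lookup instead of a growing if/elif chain. The sketch below is one possible arrangement, not the program used above: ALIASES and EXCLUDES are abbreviated, the name count_names is hypothetical, collections.Counter replaces the plain dictionary, and the sample token list is made up for illustration. Unlike the listing above, excluded words are skipped before counting rather than deleted afterwards.

```python
from collections import Counter

# Abbreviated, hypothetical alias table: each variant maps to one canonical name.
ALIASES = {
    "宝玉": "贾宝玉", "宝二爷": "贾宝玉",
    "黛玉": "林黛玉",
    "凤姐": "王熙凤", "凤姐儿": "王熙凤", "熙凤": "王熙凤",
    "老太太": "贾母", "老祖宗": "贾母",
}
EXCLUDES = {"什么", "一个", "我们"}


def count_names(words, aliases=ALIASES, excludes=EXCLUDES):
    """Count multi-character tokens, merging aliases and skipping excluded words."""
    counts = Counter()
    for word in words:
        if len(word) == 1 or word in excludes:
            continue
        counts[aliases.get(word, word)] += 1
    return counts


# Made-up tokens standing in for the output of jieba.lcut().
sample = ["宝玉", "什么", "贾宝玉", "凤姐", "道", "王熙凤", "宝二爷"]
for name, freq in count_names(sample).most_common(10):
    print("{0:<10}{1:>7}".format(name, freq))
```

With this layout, refining the results after a run means adding an entry to ALIASES or EXCLUDES rather than writing another branch.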
The code itself is not hard; the work lies in understanding it, gathering information, and patiently revising the filter conditions.
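The second requirement asks for a word cloud, which the listing above does not render. A minimal sketch of that step follows, assuming the third-party wordcloud package is installed and that a font file with Chinese glyphs is available. The function names, the font path, the output file name, and the demo numbers are all hypothetical placeholders, not results from the novel.

```python
def top_frequencies(counts, n=50):
    """Keep the n most frequent entries of a {word: count} dictionary."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


def save_word_cloud(frequencies, font_path, out_path):
    """Render the frequencies to an image file; requires the wordcloud package."""
    # Imported here so that top_frequencies() works even without the package.
    from wordcloud import WordCloud
    cloud = WordCloud(font_path=font_path, width=800, height=600,
                      background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)


# Made-up numbers for illustration only.
demo_counts = {"贾宝玉": 30, "王熙凤": 20, "贾母": 10, "林黛玉": 25}
print(top_frequencies(demo_counts, 3))
# Placeholder paths; uncomment once a suitable font file is available:
# save_word_cloud(top_frequencies(demo_counts), "path/to/chinese_font.ttf", "cloud.png")
```

Passing the counts dictionary directly through generate_from_frequencies keeps the alias merging from the counting step, whereas feeding raw text to the word cloud would split the names again.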
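That concludes this set of exercises.
Often the problem is not that you cannot do something, only that it needs more thought and more care.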