Assignment 4 (Working with Dates and the jieba Library)

Design Problem 1:

Design a calendar for the current month, with output in the following format:

[Image: expected calendar output]

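Requirements:
1. Initialize two dates, start_day and end_day:
from datetime import datetime
start_day = datetime(2019, 4, 1)
end_day = datetime(2019, 4, 30)
All other date values must be computed programmatically with the datetime or date classes of the datetime module.

2. Do not use the calendar module to generate the calendar.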
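
One way to meet these constraints is to derive every value from the dates themselves. The sketch below is only an illustration and not part of the assignment text: the helper name `last_day_of_month` is hypothetical, and it assumes the end of the month should be computed rather than typed in.

```python
from datetime import datetime, timedelta


def last_day_of_month(first_day):
    """Return the last day of first_day's month without the calendar module."""
    # Day 28 exists in every month, so adding 4 days always lands in the next month.
    in_next_month = first_day.replace(day=28) + timedelta(days=4)
    # Stepping back by that date's day-of-month gives the last day of the original month.
    return in_next_month - timedelta(days=in_next_month.day)


start_day = datetime(2019, 4, 1)
end_day = last_day_of_month(start_day)

days_in_month = (end_day - start_day).days + 1  # inclusive of both end points
first_weekday = start_day.weekday()             # Monday = 0 ... Sunday = 6

print(end_day.date(), days_in_month, first_weekday)
```

Subtracting two datetime objects yields a timedelta, whose `days` attribute gives the span between them; adding 1 counts both the first and last day.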

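'''Requirements:
1. Initialize two dates, start_day and end_day:
from datetime import datetime
start_day = datetime(2019, 4, 1)
end_day = datetime(2019, 4, 30)
All other date values must be computed programmatically with the datetime or date classes of the datetime module.
2. Do not use the calendar module to generate the calendar.'''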
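from datetime import datetime

start_day = datetime(2019, 4, 1)
end_day = datetime(2019, 4, 30)
day = end_day - start_day

year = start_day.year  # year to print
month = start_day.month  # month to print
week = start_day.weekday()  # weekday of the first day of the month (Monday = 0 ... Sunday = 6)
day_1 = day.days + 1  # number of days in the month

print("\t\t {}年{}月".format(year, month))
print("日\t一\t二\t三\t四\t五\t六")  # header row, Sunday through Saturday

# Leading blank cells in the first week. The header starts on Sunday but
# weekday() counts from Monday, so shift by one and wrap around.
count = (week + 1) % 7  # blank cells also count toward the 7-cell row width
print("\t" * count, end="")

d = 1
while d <= day_1:  # print each day of the month
    print(d, end="\t")
    d += 1
    count += 1
    if count % 7 == 0:
        print()  # start a new row after every seventh cell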
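
The same layout can also be produced by building the rows as data first and printing them afterwards, which makes the Sunday-first offset easier to check. This is a sketch under assumptions: the function name `month_rows` is hypothetical, and empty cells are represented as empty strings.

```python
from datetime import datetime, timedelta


def month_rows(start_day, end_day):
    """Return the month as a list of 7-cell rows, with Sunday in the first column."""
    # weekday() is Monday = 0 ... Sunday = 6; shift so that Sunday maps to column 0.
    cells = [""] * ((start_day.weekday() + 1) % 7)
    current = start_day
    while current <= end_day:
        cells.append(str(current.day))
        current += timedelta(days=1)
    # Pad the last week so every row has exactly seven cells.
    cells += [""] * (-len(cells) % 7)
    return [cells[i:i + 7] for i in range(0, len(cells), 7)]


rows = month_rows(datetime(2019, 4, 1), datetime(2019, 4, 30))
for row in rows:
    print("\t".join(row))
```

Keeping the layout in a list of rows separates the date arithmetic from the printing, so the first and last weeks can be inspected directly.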

 

Output:

[Image: screenshot of the program output]

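Design Problem 2:

1. Following the word-frequency program for "Romance of the Three Kingdoms", count how often each character appears in "Dream of the Red Chamber".
2. (Optional)
Display the character frequency results as a word cloud.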
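
The optional word-cloud step could be sketched roughly as follows. This is an assumption-laden illustration: it relies on the third-party `wordcloud` package being installed, the helper names `top_frequencies` and `save_word_cloud` are hypothetical, the font path is a placeholder, and the sample counts are made up rather than taken from the novel.

```python
def top_frequencies(counts, n=50):
    """Keep the n most frequent entries of a {word: count} dict."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


def save_word_cloud(counts, font_path, out_path="wordcloud.png"):
    """Render the frequencies as an image (needs the third-party wordcloud package)."""
    # Imported here so the counting code still runs when wordcloud is not installed.
    from wordcloud import WordCloud

    # A font with CJK glyphs is needed, otherwise Chinese names render as boxes.
    cloud = WordCloud(font_path=font_path, width=800, height=600,
                      background_color="white")
    cloud.generate_from_frequencies(top_frequencies(counts))
    cloud.to_file(out_path)


# Placeholder numbers for illustration only, not real statistics from the novel.
sample_counts = {"宝玉": 30, "黛玉": 20, "宝钗": 10}
print(top_frequencies(sample_counts, n=2))

# Hypothetical font path; point it at any CJK-capable font on your system.
# save_word_cloud(sample_counts, font_path="C:/Windows/Fonts/simhei.ttf")
```

Passing the frequency dictionary to `generate_from_frequencies` reuses the counts directly instead of re-tokenizing the text.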

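'''Design Problem 2:

1. Following the word-frequency program for "Romance of the Three Kingdoms", count how often each character appears in "Dream of the Red Chamber".
2. (Optional)
Display the character frequency results as a word cloud.'''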
import jieba
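# Words to drop from the ranking: common words and forms of address.
# Note that the list also contains several character names
# (e.g. 鸳鸯, 王夫人, 平儿, 袭人, 薛姨妈, 贾政, 湘云), which removes them from the ranking.
excludes = {"什么", "一个", "我们", "你们", "如今", "说道", "知道", "出来", "那里", "起来", "姑娘", "这里",
            "他们", "众人", "自己", "一面", "太太", "老太太", "只见", "怎么", "两个", "过来", "心里", "二爷",
            "没有", "不是", "不知", "这个", "这样", "听见", "进来", "咱们", "告诉", "就是", "如此", "今日",
            "东西", "奶奶", "回来", "只是", "老爷", "大家", "不好", "姐姐", "一时", "不能", "鸳鸯", "银子", "几个",
            "只得", "丫头", "这些", "不敢", "出去", "所以", "王夫人", "平儿", "袭人", "薛姨妈", "不过", "的话",
            "答应", "二人", "还有", "贾政", "只管", "这么", "说话", "一回", "那边", "湘云", "这话", "外头", "打发", "自然",
            "今儿", "罢了", "屋里", "那些", "听说"}
# Raw string so the backslashes in the Windows path are not treated as escape sequences
with open(r"D:\work\红楼梦.txt", "r", encoding="utf8") as f:
    txt = f.read()  # read the whole novel into one string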

words = jieba.lcut(txt)

counts = {}  # maps each word to its number of occurrences

# Variant spellings that jieba may produce, mapped to one canonical name
aliases = {
    "贾宝玉": "宝玉", "宝玉道": "宝玉",
    "林黛玉": "黛玉", "黛玉道": "黛玉",
    "薛宝钗": "宝钗",
    "贾元春": "元春",
    "贾探春": "探春",
    "贾惜春": "惜春",
    "王熙凤": "熙凤", "熙凤道": "熙凤", "凤姐道": "熙凤", "凤姐儿": "熙凤", "凤姐": "熙凤",
    "秦可卿": "可卿",
    "刘姥姥道": "刘姥姥",
    "晴雯道": "晴雯",
}

for word in words:
    if len(word) == 1:  # skip single-character tokens
        continue
    rword = aliases.get(word, word)  # fall back to the word itself if it has no alias
    counts[rword] = counts.get(rword, 0) + 1  # add the word to the dictionary

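# Remove the excluded words from the dictionary
for word in excludes:
    counts.pop(word, None)  # pop with a default avoids a KeyError if a word never occurred

# Convert the dictionary to a list of (word, count) pairs
items = list(counts.items())

# The lambda is an anonymous function that picks the count (x[1]) as the sort key
items.sort(key=lambda x: x[1], reverse=True)

for word, count in items[:10]:  # the ten most frequent entries
    # {0:<10} left-aligns the word in a 10-character field; {1:>7} right-aligns the count in 7
    print("{0:<10}{1:>7}".format(word, count))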
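
The counting, alias merging, and filtering steps can also be expressed with `collections.Counter` from the standard library. The following is a minimal sketch, not the author's method: the function name `count_names` is hypothetical, and the token list is made up to stand in for the output of `jieba.lcut(txt)`.

```python
from collections import Counter


def count_names(words, aliases, excludes):
    """Count multi-character tokens, merging aliases and skipping excluded words."""
    counts = Counter()
    for word in words:
        if len(word) == 1:
            continue
        name = aliases.get(word, word)  # map variants such as "贾宝玉" to "宝玉"
        if name not in excludes:
            counts[name] += 1
    return counts


# Made-up tokens standing in for the output of jieba.lcut(txt).
tokens = ["贾宝玉", "宝玉", "说道", "黛玉", "宝玉道", "的", "林黛玉", "什么"]
aliases = {"贾宝玉": "宝玉", "宝玉道": "宝玉", "林黛玉": "黛玉"}
excludes = {"说道", "什么"}

counts = count_names(tokens, aliases, excludes)
for name, count in counts.most_common(10):
    # "<10" left-aligns in a 10-character field, ">7" right-aligns in 7.
    print("{0:<10}{1:>7}".format(name, count))
```

`most_common(10)` returns at most ten entries, so it also works when fewer than ten distinct words remain after filtering.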

 

Result:

[Image: screenshot of the frequency output]

Gitee repository: https://gitee.com/yeshenshi/Python.git

 
