Some common Chinese corpus resources:
Chinese corpus roundup (link)
Straight to the code: the goal is to import an external corpus into ChatterBot.
#!/usr/bin/python3
# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

# Create a chatbot named Charlie
chatbot = ChatBot('Charlie')
trainer = ListTrainer(chatbot)

# Train on a small sample conversation
trainer.train([
    "我也爱你",
    "你爱我什么呢",
    "对不起,你是个好人"
])

# Feed every corpus line to the bot. Unless it is created with
# read_only=True, ChatterBot learns from each input it receives.
with open('corpus.txt', 'r', encoding='UTF-8') as f:
    for line in f:
        question = line.strip()  # drop the trailing newline
        if not question:
            continue  # skip blank lines
        response = chatbot.get_response(question)
        # print(question)
        # print(response)

# Interactive chat loop, disabled by wrapping it in a string literal
'''
while True:
    try:
        strr = input("请输入:")
        response = chatbot.get_response(strr)
        print(response)
    except (KeyboardInterrupt, EOFError, SystemExit):
        break
'''
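Because the corpus is large, training can take a long time (anywhere from a few hours to two days is normal), so the efficiency of later training runs has to be taken into account. More updates to come: theory plus practice.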
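One possible direction for the efficiency problem, as a rough sketch rather than a measured result: instead of calling get_response once per line (which also runs the response search every time), read the corpus in batches and hand each batch to the trainer's train() method. The helper names clean_lines, batched and train_from_file, and the default batch size of 1000, are made up for this illustration and are not part of ChatterBot. The sketch only assumes the trainer object has a train(list_of_strings) method, as ListTrainer does.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def clean_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line with surrounding whitespace removed, skipping blanks."""
    for line in lines:
        text = line.strip()
        if text:
            yield text


def batched(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group lines into lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def train_from_file(trainer, path: str, batch_size: int = 1000) -> int:
    """Pass the corpus to trainer.train() batch by batch; return lines used.

    Assumption: `trainer` exposes train(list_of_strings), like ListTrainer.
    """
    total = 0
    with open(path, "r", encoding="utf-8") as f:
        for batch in batched(clean_lines(f), batch_size):
            trainer.train(batch)
            total += len(batch)
    return total
```

With the script above this would be called as train_from_file(trainer, 'corpus.txt'). One trade-off to keep in mind: ListTrainer treats each list as a single conversation, so the last line of one batch is not linked to the first line of the next. Whether this is actually faster on a given corpus and storage backend needs to be measured.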