Comparing content differences with Python's difflib

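I had been looking for a library that compares differences between pieces of content for a while, and it turns out the Python standard library already ships with one: difflib.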

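This is quite handy for data scraping: diffing the parameters of two requests shows exactly which values changed, which makes it much easier to pinpoint what to look at.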

import difflib


def diff_headers():
	"""Diff two sets of request headers and write a side-by-side HTML report."""
	text1 ='''Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
		Accept-Encoding: gzip, deflate
		Accept-Language: q=0.9,en;q=0.8,en-US;q=0.7;zh-CN,zh;
		Cache-Control: no-cache
		Connection: keep-alive
		Cookie: UM_distinctid=17c5f7e8e37f8b-030342123ea219-513c1743-15f900-17c2f7e8e38463; CNZZDATA1586682=cnzz_eid%3D1569740215-1636510718-null%26ntime%3D1642568049; PHPSESSID=l5otho4quql6jpf7majg5795fs; _stat_uid=05967439303530977045856681345587735
		Host: www.chem365.net
		Pragma: no-cache
		Referer: http://www.chem365.net/
		Upgrade-Insecure-Requests: 1
		User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36 Edg/98.0.1108.50'''.splitlines(keepends=True)

	text2 = '''Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
		Accept-Encoding: gzip, deflate
		Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7
		Cache-Control: no-cache
		Connection: keep-alive
		Cookie: UM_distinctid=17c2f7e8e37f8b-030342123ea219-513c1743-15f900-17c2f7e8e38463; CNZZDATA1586682=cnzz_eid%3D1569740215-1636510718-null%26ntime%3D1642568049; PHPSESSID=l5otho4quql6jpf7majg5795fs; _stat_uid=05967439303530977045856681345587735
		Host: www.chem365.net
		Pragma: no-cache
		Referer: http://www.chem365.net/web/index/information/classid/142.html
		Upgrade-Insecure-Requests: 1
		User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36 Edg/98.0.1108.50'''.splitlines(keepends=True)

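	# HtmlDiff renders a side-by-side table with the changed lines highlighted.
	d = difflib.HtmlDiff()
	html_content = d.make_file(text1, text2)
	# make_file declares a UTF-8 charset by default, so write the file as UTF-8 too.
	with open('diff_header.html', 'w', encoding='utf-8') as f:
		f.write(html_content)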

if __name__ == '__main__':
	diff_headers()

  


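Opening the generated HTML file in a browser shows the changes clearly: different colors mark different operations (added, changed, deleted), so the differences are easy to spot at a glance.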
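When a browser is not convenient, a plain-text diff may be enough. The following is a minimal sketch using `difflib.unified_diff`; the shortened header lists and the `request1`/`request2` labels are made up for illustration and are not part of the script above.

```python
import difflib

# Hypothetical sample data: two short header sets, one header per line.
old_headers = [
    "Accept-Encoding: gzip, deflate",
    "Host: www.chem365.net",
    "Referer: http://www.chem365.net/",
]
new_headers = [
    "Accept-Encoding: gzip, deflate",
    "Host: www.chem365.net",
    "Referer: http://www.chem365.net/web/index/information/classid/142.html",
]


def text_diff(old_lines, new_lines):
    """Return a unified diff of two lists of lines as a list of strings."""
    # lineterm="" because the input lines carry no trailing newlines.
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="request1",  # illustrative label only
            tofile="request2",    # illustrative label only
            lineterm="",
        )
    )


diff_lines = text_diff(old_headers, new_headers)
print("\n".join(diff_lines))
```

Lines starting with `-` come from the first request and lines starting with `+` from the second, so the output can be read directly in a terminal or written to a log.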

For more details, see: https://blog.csdn.net/weixin_45775963/article/details/104122753
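As one further illustration, long values such as cookies can differ by a single character that is hard to find by eye. The sketch below uses `difflib.SequenceMatcher` to list only the fragments that differ between two strings; the helper name `changed_spans` and the shortened cookie values are hypothetical.

```python
import difflib


def changed_spans(old, new):
    """Return (tag, old_fragment, new_fragment) for every non-equal opcode."""
    # autojunk=False switches off the heuristic that ignores very frequent
    # elements; it only applies to sequences of 200 or more items anyway.
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Shortened, hypothetical cookie values that differ in one character.
old_cookie = "UM_distinctid=17c5f7e8"
new_cookie = "UM_distinctid=17c2f7e8"

spans = changed_spans(old_cookie, new_cookie)
print(spans)
```

Each tuple carries the opcode tag (`replace`, `delete`, or `insert`) together with the differing text from each side, which is often enough to tell which part of a parameter is regenerated per request.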
