I have a log file that contains lines like this:
"1","2546857-23541","f_last","user","4:19 P.M.","11/02/2009","START","27","27","3","c2546857-23541",""
Each line in the log has 12 double-quoted sections, and the 7th double-quoted section comes from what the user typed into the chat window:
"22","2546857-23541","f_last","john","4:38 P.M.","11/02/2009","
What's up","245","47","1","c2546857-23541",""
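This string also shows the problem I'm having: in some places in the chat log, the text the user typed sits on a new line in the log file instead of on the same line, as in the first example.
So basically, I want the lines in the second example to look like the first example.
I tried using Find/Replace in N, and I was able to find each "orphaned" line, but I couldn't get it to join the line above it.
Then I thought of writing a Python script to automate this for me, but I'm a bit stuck on how to actually write the code.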
Running unutbu's code raised a Python error on this line:
"1760","4746880-00129","bwhiteside","tom","11:47 A.M.","12/10/2009","I do not see ^"refresh your knowledge
^" on the screen","422","0","0","c4746871-00128",""
Solution:
The csv module is smart enough to recognize when a quoted item is unfinished (and therefore must contain a newline).
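import csv

# newline='' (Python 3) leaves line-break handling to the csv module.
with open('data.log', 'r', newline='') as fin:
    with open('data2.log', 'w', newline='') as fout:
        # '^' is the character the log puts in front of embedded quotes.
        reader = csv.reader(fin, delimiter=',', quotechar='"', escapechar='^')
        # With doublequote=False the writer needs an escapechar as well,
        # otherwise it raises an error on fields that contain a quote.
        writer = csv.writer(fout, delimiter=',', escapechar='^',
                            doublequote=False, quoting=csv.QUOTE_ALL)
        for row in reader:
            # The 7th field holds the chat text; turn embedded newlines into spaces.
            row[6] = row[6].replace('\r\n', ' ').replace('\n', ' ')
            writer.writerow(row)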
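
To see how the reader treats the two problem records without touching any files, here is a minimal sketch that parses them from an in-memory string. The `sample` string and the use of `io.StringIO` are assumptions made for illustration; the records themselves are copied from the question.

```python
import csv
import io

# The two multi-line records from the question, as one in-memory "file".
sample = (
    '"22","2546857-23541","f_last","john","4:38 P.M.","11/02/2009","\n'
    'What\'s up","245","47","1","c2546857-23541",""\n'
    '"1760","4746880-00129","bwhiteside","tom","11:47 A.M.","12/10/2009",'
    '"I do not see ^"refresh your knowledge\n'
    '^" on the screen","422","0","0","c4746871-00128",""\n'
)

# escapechar='^' makes the reader treat ^" as a literal quote inside a field.
reader = csv.reader(io.StringIO(sample, newline=''),
                    delimiter=',', quotechar='"', escapechar='^')
rows = list(reader)

for row in rows:
    # Each physical pair of lines comes back as one 12-field record.
    print(len(row), repr(row[6]))
```

Each record should come back as a single row of 12 fields, with the line break preserved inside the 7th field, which is what lets the loop above replace it with a space.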