Parsing a log file in Python

I have a log file that contains lines like the following:

"1","2546857-23541","f_last","user","4:19 P.M.","11/02/2009","START","27","27","3","c2546857-23541",""

Each line in the log has 12 double-quoted fields, and the 7th double-quoted field holds whatever the user typed into the chat window:

"22","2546857-23541","f_last","john","4:38 P.M.","11/02/2009","
What's up","245","47","1","c2546857-23541",""

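This record also shows the problem I am running into: in some places in the chat log, the text the user typed starts on a new line in the log file instead of staying on the same line, as it does in the first example.
So, basically, I want the lines in the second example to look like the first example.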

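I tried using "Find/Replace" in N and was able to find each "orphaned" line, but I could not get it to join the line above it.
Then I thought of writing a Python script to automate this for me, but I am a bit stuck on how to actually write the code.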

Running unutbu's code raised a Python error on this line:

"1760","4746880-00129","bwhiteside","tom","11:47 A.M.","12/10/2009","I do not see ^"refresh your knowledge
^" on the screen","422","0","0","c4746871-00128",""

Solution:

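The csv module is smart enough to recognize when a quoted item has not been closed by the end of a line (and must therefore contain a line break).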
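
This behaviour can be checked in isolation. The following is only a minimal sketch: it feeds the second example from the question to csv.reader through an in-memory io.StringIO (an assumption made for the demo, standing in for the real log file) and shows that the two physical lines come back as a single 12-field record.

```python
import csv
import io

# Two physical lines that form one logical record: the 7th field
# contains a line break inside its quotes (sample taken from the question).
sample = (
    '"22","2546857-23541","f_last","john","4:38 P.M.","11/02/2009","\n'
    'What\'s up","245","47","1","c2546857-23541",""\n'
)

# io.StringIO stands in for the log file; newline="" leaves line endings
# untouched so the csv module can interpret them itself.
rows = list(csv.reader(io.StringIO(sample, newline="")))

print(len(rows))         # number of logical records
print(len(rows[0]))      # number of fields in the record
print(repr(rows[0][6]))  # the chat text, line break still embedded
```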

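# Rewrite data.log as data2.log with every split record joined onto one line
import csv

# newline='' lets the csv module handle line breaks inside quoted fields itself
with open('data.log', 'r', newline='') as fin, open('data2.log', 'w', newline='') as fout:
    # '^' is the escape character used in the log (e.g. ^" inside a field)
    reader = csv.reader(fin, delimiter=',', quotechar='"', escapechar='^')
    # The writer needs the same escapechar: with doublequote=False, a quote
    # inside a field cannot be written unless an escape character is set
    writer = csv.writer(fout, delimiter=',', quotechar='"', escapechar='^',
                        doublequote=False, quoting=csv.QUOTE_ALL)
    for row in reader:
        # Join the text the user typed (7th field) back onto a single line
        row[6] = row[6].replace('\r\n', ' ').replace('\n', ' ')
        writer.writerow(row)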
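
To try the approach without touching any files, here is a hedged in-memory sketch. The helper name join_split_records is hypothetical, and unlike the script above it cleans every field rather than only the 7th; both choices are assumptions made for illustration. It is run on the line from the question that contains ^" escapes.

```python
import csv
import io


def join_split_records(text, escapechar="^"):
    """Return `text` with line breaks inside quoted fields replaced by spaces.

    Hypothetical helper for illustration; it works on strings, not files.
    """
    out = io.StringIO()
    reader = csv.reader(io.StringIO(text, newline=""),
                        quotechar='"', escapechar=escapechar)
    # The writer uses the same escape character so that quotes inside a
    # field are written back as ^" instead of raising csv.Error.
    writer = csv.writer(out, quotechar='"', escapechar=escapechar,
                        doublequote=False, quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    for row in reader:
        # Assumption: clean every field, not only the chat text.
        writer.writerow(
            [field.replace("\r\n", " ").replace("\n", " ") for field in row]
        )
    return out.getvalue()


# The record from the question that contains escaped quotes and a line break.
escaped_sample = (
    '"1760","4746880-00129","bwhiteside","tom","11:47 A.M.","12/10/2009",'
    '"I do not see ^"refresh your knowledge\n'
    '^" on the screen","422","0","0","c4746871-00128",""\n'
)

joined = join_split_records(escaped_sample)
print(joined, end="")
```

Passing lineterminator="\n" is a choice made for this demo; the csv writer's default line terminator is "\r\n".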