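Python was reportedly one of the hottest languages of 2014, which has a lot to do with how well it handles data.
So how do we use Python to process data?
First we use Python to open the file, then read the data, then process it, and finally output the results.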
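As a rough sketch of those four steps (not the book's code), the snippet below writes a tiny sample file into a temporary folder so that it is self-contained, then opens it, reads it, processes each line, and prints the result. The file name sample.txt, its contents, and the helper name shout_lines are all made up for illustration.

```python
import os
import tempfile


def shout_lines(path):
    """Open a text file, read it line by line, and return each line upper-cased."""
    results = []
    # Step 1: open the file (the with statement closes it automatically).
    with open(path) as data:
        # Step 2: read the data one line at a time.
        for each_line in data:
            # Step 3: process the data (here: drop the newline, upper-case the text).
            results.append(each_line.rstrip('\n').upper())
    return results


# Hypothetical sample data, written to a temporary folder for this demo only.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, 'sample.txt')
    with open(sample_path, 'w') as out:
        out.write('first line\nsecond line\n')
    processed = shout_lines(sample_path)

# Step 4: output the data.
for line in processed:
    print(line)
```

The upper-casing step is only a placeholder for whatever processing the data actually needs.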
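The walkthrough below uses an example from the book Head First Python, and doubles as my study notes.
First, import the os module and change the current working directory to the folder that contains the data file.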
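>>> import os  # import the os module
>>> os.getcwd()  # get the current working directory
'C:\\Python33'
>>> os.chdir(r'D:\Python\HeadFirstPython\Chapter3')  # change the current working directory (the raw string keeps the backslashes literal)
>>> os.getcwd()
'D:\\Python\\HeadFirstPython\\Chapter3'
Next, open the data file, read its first two lines, and display them on screen.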
>>> data = open('sketch.txt')  # open the named file and assign it to a file object called "data"
>>> print(data.readline(), end='')  # readline() gets one line from the file; the print() BIF displays it on screen
Man: Is this the right room for an argument?
>>> print(data.readline(), end='')
Other Man: I've told you once.
Now "rewind" to the start of the file, then use a for statement to process every line in it.
>>> data.seek(0)
0
>>> for each_line in data:
...     print(each_line, end='')
...
Finally, close the file.
>>> data.close()
The session above prints every line of data (text) in the file:
Man: Is this the right room for an argument?
Other Man: I've told you once.
Man: No you haven't!
Other Man: Yes I have.
Man: When?
Other Man: Just now.
Man: No you didn't!
Other Man: Yes I did!
Man: You didn't!
Other Man: I'm telling you, I did!
Man: You did not!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man: Just the five minutes. Thank you.
Other Man: Anyway, I did.
Man: You most certainly did not!
Other Man: Now let's get one thing quite clear: I most definitely told you!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh look, this isn't an argument!
(pause)
Other Man: Yes it is!
Man: No it isn't!
(pause)
Man: It's just contradiction!
Other Man: No it isn't!
Man: It IS!
Other Man: It is NOT!
Man: You just contradicted me!
Other Man: No I didn't!
Man: You DID!
Other Man: No no no!
Man: You did just then!
Other Man: Nonsense!
Man: (exasperated) Oh, this is futile!!
(pause)
Other Man: No it isn't!
Man: Yes it is!

Looking at the data, we can see that it follows a specific format: the actor's role, then a colon, then the line the actor speaks.
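One quick way to check this observation is to count the colons in each line. The sketch below is an illustration under stated assumptions, not the book's code: it uses io.StringIO with three lines copied from the sketch as a stand-in for the open file object, and the names sample and colon_counts are hypothetical.

```python
import io

# Stand-in for the open file: three lines copied from the sketch output.
sample = io.StringIO(
    "Man: Is this the right room for an argument?\n"
    "Other Man: I've told you once.\n"
    "Other Man: Now let's get one thing quite clear: I most definitely told you!\n"
)

colon_counts = []
for each_line in sample:
    # str.count() reports how many colons appear in the line.
    colon_counts.append(each_line.count(':'))

print(colon_counts)  # prints [1, 1, 2]
```

Each sampled line has a colon after the role, but a count above one is a reminder that the spoken line itself may also contain colons.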
In the next section, we will try to extract the individual parts of each data line.