4 Persistent storage: saving data to files and reading it back
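Saving data to files: while working through this chapter I kept hitting the same error: SyntaxError: invalid syntax.
This is a syntax error, and a search showed the cause: Python 2.7 and Python 3.5 are not compatible. Until then I had been typing
these small examples directly into the Ubuntu terminal, which became awkward once the programs grew, so I installed IDLE with: sudo apt-get install idle.
After installing and starting it, I loaded a Python file and ran it, only to find that this IDLE used Python 2.7; running the code line by line
produced errors, even though the same code had worked before. I then installed idle3 with: sudo apt-get install idle3,
started it with: idle3, and the example code ran without problems.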
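A quick way to confirm which interpreter is actually running the code is to inspect sys.version_info. The sketch below is only an illustration; the helper name describe_interpreter is made up for these notes and is not part of the book's code.

```python
import sys

def describe_interpreter():
    """Return a short string such as 'Python 3.5' for the running interpreter."""
    major, minor = sys.version_info[0], sys.version_info[1]
    return "Python {}.{}".format(major, minor)

# print() with the end= keyword is Python 3 syntax. Under Python 2.7 this
# line is a SyntaxError unless print_function is imported from __future__.
print(describe_interpreter(), end="\n")
```

Running this first in a new IDLE window shows immediately whether it is the Python 2 or the Python 3 flavour.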
Example 1:
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.getcwd()
'/home/user'
>>> os.chdir('/home/user/project/python_model/HeadFirstPython/chapter3')
>>> os.getcwd()
'/home/user/project/python_model/HeadFirstPython/chapter3'
>>> man=[]
>>> other=[]
>>> try:
    data=open('sketch.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            line_spoken=line_spoken.strip()
            if role=='Man':
                man.append(line_spoken)
            elif role=='Other Man':
                other.append(line_spoken)
        except ValueError:
            pass
    data.close()
except IOError:
    print('The datafile is missing!')
>>> print(man)
['Is this the right room for an argument?', "No you haven't!", 'When?', "No you didn't!", "You didn't!", 'You did not!', 'Ah! (taking out his wallet and paying) Just the five minutes.', 'You most certainly did not!', "Oh no you didn't!", "Oh no you didn't!", "Oh look, this isn't an argument!", "No it isn't!", "It's just contradiction!", 'It IS!', 'You just contradicted me!', 'You DID!', 'You did just then!', '(exasperated) Oh, this is futile!!', 'Yes it is!']
>>> print(other)
["I've told you once.", 'Yes I have.', 'Just now.', 'Yes I did!', "I'm telling you, I did!", "Oh I'm sorry, is this a five minute argument, or the full half hour?", 'Just the five minutes. Thank you.', 'Anyway, I did.', "Now let's get one thing quite clear: I most definitely told you!", 'Oh yes I did!', 'Oh yes I did!', 'Yes it is!', "No it isn't!", 'It is NOT!', "No I didn't!", 'No no no!', 'Nonsense!', "No it isn't!"]
>>>
Opening a file in write mode
When you open a disk file with the open() BIF, you can specify the access mode. The help text for open() is:
help(open)
Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Open file and return a stream. Raise IOError upon failure.

file is either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened or an integer file descriptor of the file to be
wrapped. (If a file descriptor is given, it is closed when the
returned I/O object is closed, unless closefd is set to False.)

mode is an optional string that specifies the mode in which the file
is opened. It defaults to 'r' which means open for reading in text
mode. Other common values are 'w' for writing (truncating the file if
it already exists), 'x' for creating and writing to a new file, and
'a' for appending (which on some Unix systems, means that all writes
append to the end of the file regardless of the current seek position).
In text mode, if encoding is not specified the encoding used is platform
dependent: locale.getpreferredencoding(False) is called to get the
current locale encoding. (For reading and writing raw bytes use binary
mode and leave encoding unspecified.) The available modes are:

========= ===============================================================
Character Meaning
--------- ---------------------------------------------------------------
'r'       open for reading (default)
'w'       open for writing, truncating the file first
'x'       create a new file and open it for writing
'a'       open for writing, appending to the end of the file if it exists
'b'       binary mode
't'       text mode (default)
'+'       open a disk file for updating (reading and writing)
'U'       universal newline mode (deprecated)
========= ===============================================================

The default mode is 'rt' (open for reading text). For binary random
access, the mode 'w+b' opens and truncates the file to 0 bytes, while
'r+b' opens the file without truncation. The 'x' mode implies 'w' and
raises an `FileExistsError` if the file already exists.

Python distinguishes between files opened in binary and text modes,
even when the underlying operating system doesn't. Files opened in
binary mode (appending 'b' to the mode argument) return contents as
bytes objects without any decoding. In text mode (the default, or when
't' is appended to the mode argument), the contents of the file are
returned as strings, the bytes having been first decoded using a
platform-dependent encoding or using the specified encoding if given.

'U' mode is deprecated and will raise an exception in future versions
of Python. It has no effect in Python 3. Use newline to control
universal newlines mode.

buffering is an optional integer used to set the buffering policy.
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
line buffering (only usable in text mode), and an integer > 1 to indicate
the size of a fixed-size chunk buffer. When no buffering argument is
given, the default buffering policy works as follows:

* Binary files are buffered in fixed-size chunks; the size of the buffer
  is chosen using a heuristic trying to determine the underlying device's
  "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`.
  On many systems, the buffer will typically be 4096 or 8192 bytes long.

* "Interactive" text files (files for which isatty() returns True)
  use line buffering. Other text files use the policy described above
  for binary files.

encoding is the name of the encoding used to decode or encode the
file. This should only be used in text mode. The default encoding is
platform dependent, but any encoding supported by Python can be
passed. See the codecs module for the list of supported encodings.

errors is an optional string that specifies how encoding errors are to
be handled---this argument should not be used in binary mode. Pass
'strict' to raise a ValueError exception if there is an encoding error
(the default of None has the same effect), or pass 'ignore' to ignore
errors. (Note that ignoring encoding errors can lead to data loss.)
See the documentation for codecs.register or run 'help(codecs.Codec)'
for a list of the permitted encoding error strings.

newline controls how universal newlines works (it only applies to text
mode). It can be None, '', '\n', '\r', and '\r\n'. It works as
follows:

* On input, if newline is None, universal newlines mode is
  enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
  these are translated into '\n' before being returned to the
  caller. If it is '', universal newline mode is enabled, but line
  endings are returned to the caller untranslated. If it has any of
  the other legal values, input lines are only terminated by the given
  string, and the line ending is returned to the caller untranslated.

* On output, if newline is None, any '\n' characters written are
  translated to the system default line separator, os.linesep. If
  newline is '' or '\n', no translation takes place. If newline is any
  of the other legal values, any '\n' characters written are translated
  to the given string.

If closefd is False, the underlying file descriptor will be kept open
when the file is closed. This does not work when a file name is given
and must be True in that case.

A custom opener can be used by passing a callable as *opener*. The
underlying file descriptor for the file object is then obtained by
calling *opener* with (*file*, *flags*). *opener* must return an open
file descriptor (passing os.open as *opener* results in functionality
similar to passing None).

open() returns a file object whose type depends on the mode, and
through which the standard file operations such as reading and writing
are performed. When open() is used to open a file in a text mode ('w',
'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to open
a file in a binary mode, the returned class varies: in read binary
mode, it returns a BufferedReader; in write binary and append binary
modes, it returns a BufferedWriter, and in read/write mode, it returns
a BufferedRandom.

It is also possible to use a string or bytearray as a file for both
reading and writing. For strings StringIO can be used like a file
opened in a text mode, and for bytes a BytesIO can be used like a file
opened in a binary mode.
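To make the mode table in the help text more concrete, here is a small sketch that exercises 'w', 'a', the default 'r', and 'x' on one file. The file name demo.txt and the temporary directory are assumptions made for illustration, so that no real project files are touched.

```python
import os
import tempfile

# Work in a throwaway directory (illustrative; not part of the chapter code).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")  # hypothetical file name

with open(path, "w") as f:      # 'w' creates the file or truncates it
    print("first", file=f)

with open(path, "w") as f:      # opening with 'w' again discards "first"
    print("second", file=f)

with open(path, "a") as f:      # 'a' keeps existing content and appends
    print("third", file=f)

with open(path) as f:           # default mode is 'r' (read text)
    lines = f.read().splitlines()

try:
    open(path, "x")             # 'x' refuses to overwrite an existing file
    x_failed = False
except FileExistsError:
    x_failed = True

print(lines, x_failed)
```

The point to notice is that 'w' silently throws away earlier content, which is why 'a' or 'x' can be the safer choice when the file may already hold data.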
Example 2:
import os
os.getcwd()
os.chdir('/home/user/project/python_model/HeadFirstPython/chapter3')
man = []
other = []
try:
    data = open('sketch.txt')
    for each_line in data:
        try:
            (role,line_spoken) = each_line.split(':',1)
            line_spoken = line_spoken.strip()
            if role == 'Man':
                man.append(line_spoken)
            elif role == 'Other Man':
                other.append(line_spoken)
        except ValueError:
            pass
    data.close()
except IOError:
    print('The datafile is missing!')
try:
    man_file = open('man_data.txt','w')      # open man_data.txt in write mode ('w');
    other_file = open('other_data.txt','w')  # a file that does not exist yet is created
    print(man,file=man_file)                 # write the man list to man_data.txt
    print(other,file=other_file)             # write the other list to other_data.txt
    man_file.close()                         # close man_file
    other_file.close()                       # close other_file
except IOError:
    print('File error')
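Note: if an exception is raised before the close() calls are reached, the files stay open.
To make sure the files are closed even when an exception occurs, use finally.
Extending try with finally
Append the following to the end of Example 2: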
finally:
    man_file.close()
    other_file.close()
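One wrinkle with a finally block like this is that if open() itself fails, the file variable was never bound, so calling close() on it raises a NameError. The sketch below is one possible way around that, assuming it is acceptable to initialise the variable to None first; the file name and the simulated failure are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "finally_demo.txt")  # hypothetical name
events = []

out = None                       # defined up front so finally can test it
try:
    out = open(path, "w")
    events.append("opened")
    raise IOError("simulated write failure")   # stand-in for a real I/O error
except IOError as err:
    events.append("handled: " + str(err))
finally:
    if out is not None:          # only close what was actually opened
        out.close()
    events.append("finally ran")

print(events, out.closed)
```

The finally suite runs after the except handler, so the file ends up closed even though the write never completed.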
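In Python, strings are immutable, because you can never know which other variables refer to a particular string;
although you can assign a value to a Python variable, the variable does not actually contain the data, it only refers to the data object;
tuples cannot be changed either: a tuple is an immutable list;
all numeric types are immutable as well.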
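A short sketch of what immutability means in practice, contrasted with a mutable list. All variable names here are made up for illustration.

```python
greeting = "hello"
alias = greeting                 # both names refer to the same string object

greeting = greeting.upper()      # upper() builds a NEW string; the old one is untouched
print(greeting, alias)

try:
    alias[0] = "J"               # strings do not support item assignment
    string_mutated = True
except TypeError:
    string_mutated = False

pair = (1, 2)
try:
    pair[0] = 99                 # tuples do not support it either
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

numbers = [1, 2]
same_list = numbers
same_list.append(3)              # lists ARE mutable: both names see the change
print(string_mutated, tuple_mutated, numbers)
```

Because alias still refers to the original string, rebinding greeting cannot affect it, whereas the list change is visible through both names.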
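Knowing the type of error is not enough
To find out the specific cause of an error, the exception handler has to capture the exception object, as shown next.
Suppose we try to open a file, missing.txt, that does not exist: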
try:
    data=open('missing.txt')
    print(data.readline(),end='')
except IOError:
    print('File error')
finally:
    if 'data' in locals():
        data.close()
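Improving it further:
except IOError as err:                # give the exception object a name
    print('File error: ' + str(err))  # then use it as part of the error message
Running this prints: File error: [Errno 2] No such file or directory: 'missing.txt'
With more code, though, this try/except/finally logic becomes cumbersome, which is where with comes in.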
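Putting the pieces together, a complete version of the handler might look like the sketch below. It assumes a freshly created temporary directory so that missing.txt is guaranteed not to exist; the message variable is added only so the result can be inspected.

```python
import os
import tempfile

# A path that cannot exist yet: a fresh, empty temporary directory.
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")

message = None
try:
    data = open(missing)
    print(data.readline(), end='')
except IOError as err:           # bind the exception object to the name err
    message = 'File error: ' + str(err)
    print(message)
finally:
    if 'data' in locals():       # data exists only if open() succeeded
        data.close()
```

Note that err is only available inside the except suite, which is why the text is copied into message there.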
Working with files using with
The following code can replace the try/except/finally code above:
try:
    with open('its.txt',"w") as data:
        print("It's...",file=data)
except IOError as err:
    print('File error:' + str(err))
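Note: with with, there is no need to worry about closing the open file; the Python interpreter takes care of it automatically.
Under the hood, the with statement relies on a Python technique called the context management protocol.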
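To give a feel for the context management protocol, here is a toy class that implements its two methods, __enter__ and __exit__. The class name Tracked and its events list are invented for this sketch; file objects implement the same two methods, with __exit__ closing the file.

```python
class Tracked:
    """A toy context manager that records when it is entered and exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self              # this is what "as name" receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False             # False means: do not swallow the exception

tracker = Tracked()
try:
    with tracker as t:
        t.events.append("body")
        raise ValueError("boom")
except ValueError:
    tracker.events.append("handled")

print(tracker.events)
```

The "exit" entry is recorded before "handled", which shows that __exit__ runs as the with block is left, even when the block raises.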
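Next, modify the print_lol() function from the chapter 2 notes
In Python, standard output is sys.stdout, which is available from the standard library's sys module.
Example 3
Modifying the print_lol function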
def print_lol(the_list,indent=False,level=0,fh=sys.stdout ):
    for each_item in the_list:
        if isinstance(each_item,list):
            print_lol(each_item,indent,level+1,fh)
        else:
            if indent:                          # indent only when asked to
                for tab_stop in range(level):
                    print("\t",end='',file=fh)  # one tab per nesting level
            print(each_item,file=fh)
For reasons I did not understand at the time, after adding the fourth parameter fh=sys.stdout to print_lol, running import sys followed by import nester raised an error:
Traceback (most recent call last):
File "<pyshell#9>", line 1, in <module>
import nester
File "/home/user/project/python_model/nester/nester.py", line 1, in <module>
def print_lol(the_list,indent=False,level=0,fh=sys.stdout ):
NameError: name 'sys' is not defined
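I searched online without finding an answer, which was frustrating, and after being stuck for two days I moved on.
Looking back, the traceback points at the cause: import sys in the shell does not affect nester.py, which has its own namespace, so nester.py itself needs import sys as its first line.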
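A minimal sketch of a nester.py that imports sys itself, so that the default value fh=sys.stdout can be resolved when the def statement runs. io.StringIO stands in for an open text file to show the fh parameter at work; the nested sample list is made up for illustration.

```python
import io
import sys   # must be imported by the module that defines print_lol

def print_lol(the_list, indent=False, level=0, fh=sys.stdout):
    """Print each item of a (possibly nested) list on its own line to fh."""
    for each_item in the_list:
        if isinstance(each_item, list):
            print_lol(each_item, indent, level + 1, fh)
        else:
            if indent:
                print("\t" * level, end='', file=fh)
            print(each_item, file=fh)

buffer = io.StringIO()           # in-memory stand-in for an open text file
print_lol(['a', ['b', ['c']]], indent=True, fh=buffer)
captured = buffer.getvalue()
print(repr(captured))
```

Passing a real file object opened with open(..., 'w') as fh works the same way as the StringIO buffer used here.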
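Custom code review
Pickle your data
Python ships with a standard library module called pickle, which can save and load almost any Python data object, including lists.
Pickled data can be stored on disk, put in a database, or sent over a network to another computer.
Save with dump, restore with load
Using pickle is straightforward: import the module with import pickle;
save data with dump();
restore data with load().
Note: the only requirement when working with pickled data is that the files must be opened in binary access mode.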
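A small round-trip sketch of dump() and load(). The file name lines.pickle, the temporary directory, and the sample list are assumptions made for illustration.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.pickle")  # hypothetical name
original = ["Is this the right room for an argument?", "When?", ["nested", 1, 2.5]]

with open(path, "wb") as out_file:       # binary write mode is required
    pickle.dump(original, out_file)

with open(path, "rb") as in_file:        # binary read mode is required
    restored = pickle.load(in_file)

print(restored == original, restored is original)
```

The restored list is equal to the original but is a separate object, which is what you would expect from data that has been written out and read back.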
What if something goes wrong?
If something goes wrong while pickling or unpickling data, the pickle module raises an exception of type PickleError.
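As an illustration of catching PickleError, the sketch below feeds deliberately invalid bytes to pickle.loads(). The byte string and the outcome variable are made up for this example; pickle.UnpicklingError, which is what this case raises, is a subclass of PickleError.

```python
import pickle

outcome = None
try:
    pickle.loads(b"not a pickle")           # deliberately invalid input
    outcome = "loaded"
except pickle.PickleError as perr:          # UnpicklingError is a subclass
    outcome = "Pickling error: " + type(perr).__name__

print(outcome)
```

Some kinds of corrupt input can raise other exceptions (EOFError on truncated data, for example), so real code may want to catch more than PickleError.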
Example 4: pickling and restoring file data
>>> print(man)
['Is this the right room for an argument?', "No you haven't!", 'When?', "No you didn't!", "You didn't!", 'You did not!', 'Ah! (taking out his wallet and paying) Just the five minutes.', 'You most certainly did not!', "Oh no you didn't!", "Oh no you didn't!", "Oh look, this isn't an argument!", "No it isn't!", "It's just contradiction!", 'It IS!', 'You just contradicted me!', 'You DID!', 'You did just then!', '(exasperated) Oh, this is futile!!', 'Yes it is!']
>>> try:
    man_file=open('man_data.txt','w')
    other_file=open('other_data.txt','w')
    print(man,file=man_file)
    print(other,file=other_file)
    man_file.close()
    other_file.close()
except IOError:
    print('File error')
>>> import pickle
>>> try:
    with open('man_data1.txt','wb') as man_file:
        pickle.dump(man,man_file)
except IOError as err:
    print('File error:'+str(err))
except pickle.PickleError as perr:
    print('Pickling error:'+str(perr))
>>> new_man=[]
>>> try:
    with open('man_data1.txt','rb') as man_file:
        new_man = pickle.load(man_file)
except IOError as err:
    print('File error:'+str(err))
except pickle.PickleError as perr:
    print('Pickling error:'+str(perr))
>>> import nester01
>>> nester01.print_lol(new_man)
Is this the right room for an argument?
No you haven't!
When?
No you didn't!
You didn't!
You did not!
Ah! (taking out his wallet and paying) Just the five minutes.
You most certainly did not!
Oh no you didn't!
Oh no you didn't!
Oh look, this isn't an argument!
No it isn't!
It's just contradiction!
It IS!
You just contradicted me!
You DID!
You did just then!
(exasperated) Oh, this is futile!!
Yes it is!
Finally, display the first and last lines of the data:
>>> print(new_man[0]) # show the first line
Is this the right room for an argument?
>>> print(new_man[-1]) # show the last line
Yes it is!
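Summary
Generic file I/O with pickle is the way to go! Hehe~
Let Python take care of the file I/O details, so that you can focus on what the code actually does;
there is now a workable, reliable mechanism for processing, saving and restoring list data with Python.
The main methods used in this chapter:
strip(): removes unwanted leading and trailing whitespace from a string;
print(): a BIF whose file argument controls where the data is sent or saved;
finally: a suite that is always executed, whether or not an exception occurred;
except: receives the exception object, which can be assigned to an identifier with as;
str(): a BIF that gives the string representation of any data object;
locals(): returns the collection of variables in the current scope;
in: an operator that tests for membership;
+: concatenates two strings or adds two numbers;
with: automatically closes any open files, even if an exception occurs;
sys.stdout: standard output in Python; requires importing the sys module;
the pickle module: efficiently saves Python data objects to disk (in binary) and restores them from disk, with dump() to save and load() to restore.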
-------------------------------------------The End of Fourth Chapter-------------------------------------------