The fileinput module lets you loop over the contents of one or more text files.

Using the fileinput module to loop over a text file
    import fileinput
    import sys

    for line in fileinput.input("samples/sample.txt"):
        sys.stdout.write("-> ")
        sys.stdout.write(line)

    -> We will perhaps eventually be writing only small
    -> modules which are identified by name as they are
    -> used to build larger ones, so that devices like
    -> indentation, rather than delimiters, might become
    -> feasible for expressing local structure in the
    -> source language.
    -> -- Donald E. Knuth, December 1974
You can also use the fileinput module to get meta information about the current line, including isfirstline, filename, and lineno.

Using the fileinput module to process multiple text files
    import fileinput
    import glob
    import string, sys

    for line in fileinput.input(glob.glob("samples/*.txt")):
        if fileinput.isfirstline(): # first in a file?
            sys.stderr.write("-- reading %s --\n" % fileinput.filename())
        sys.stdout.write(str(fileinput.lineno()) + " " + string.upper(line))

    -- reading samples\sample.txt --
    1 WE WILL PERHAPS EVENTUALLY BE WRITING ONLY SMALL
    2 MODULES WHICH ARE IDENTIFIED BY NAME AS THEY ARE
    3 USED TO BUILD LARGER ONES, SO THAT DEVICES LIKE
    4 INDENTATION, RATHER THAN DELIMITERS, MIGHT BECOME
    5 FEASIBLE FOR EXPRESSING LOCAL STRUCTURE IN THE
    6 SOURCE LANGUAGE.
    7 -- DONALD E. KNUTH, DECEMBER 1974
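The per-line metadata can also be collected from the object that fileinput.input returns. What follows is a minimal sketch written for Python 3 (the listings in this section use Python 2). The temporary directory, the file names a.txt and b.txt, and the helper number_lines are made up for illustration; only input, filename, lineno, and isfirstline come from the module.

```python
import fileinput
import os
import shutil
import tempfile


def number_lines(paths):
    """Collect (file name, cumulative line number, first-line flag, text) per line."""
    records = []
    # Used as a context manager, the FileInput object is closed afterwards.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            records.append((
                os.path.basename(stream.filename()),
                stream.lineno(),        # keeps counting across files
                stream.isfirstline(),   # True on the first line of each file
                line.rstrip("\n"),
            ))
    return records


# Hypothetical sample files, created only for this demonstration.
workdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
    path = os.path.join(workdir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

records = number_lines(paths)
for record in records:
    print(record)

shutil.rmtree(workdir)
```

Note that lineno keeps counting across files, which matches the numbering in the sample output above.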
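Replacing text in files in place is easy. Just pass the inplace keyword argument, set to 1, to the input function, and the module takes care of the rest. While the loop runs, anything written to sys.stdout goes into the file being processed.

Using the fileinput module to convert CRLF to LF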
    import fileinput, sys

    for line in fileinput.input(inplace=1):
        # convert Windows/DOS text files to Unix files
        if line[-2:] == "\r\n":
            line = line[:-2] + "\n"
        sys.stdout.write(line)
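To see in-place rewriting on a file that can be inspected afterwards, here is a small sketch written for Python 3. The scratch file notes.txt and the ".bak" backup suffix are assumptions chosen for illustration. The sketch upper-cases each line instead of converting line endings, because Python 3 text mode already translates line endings while reading.

```python
import fileinput
import os
import shutil
import tempfile

# Hypothetical scratch file, created only for this demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "notes.txt")
with open(path, "w") as handle:
    handle.write("alpha\nbeta\n")

# While the loop runs, standard output is redirected into the file,
# so whatever is printed replaces the line that was just read.
# A non-empty backup suffix keeps a copy of the original file.
with fileinput.input(files=[path], inplace=True, backup=".bak") as stream:
    for line in stream:
        print(line.upper(), end="")

with open(path) as handle:
    rewritten = handle.read()
with open(path + ".bak") as handle:
    original = handle.read()

shutil.rmtree(workdir)
```

Once the loop is finished and the input is closed, sys.stdout points at the real standard output again.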
The shutil utility module contains a number of functions for copying files and directories.

Using shutil to copy files
    import shutil
    import os

    for file in os.listdir("."):
        if os.path.splitext(file)[1] == ".py":
            print file
            shutil.copy(file, os.path.join("backup", file))

    aifc-example-1.py
    anydbm-example-1.py
    array-example-1.py
    ...
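The copytree function copies an entire directory tree (like cp -r), and the rmtree function removes an entire directory tree (like rm -r).

Using the shutil module to copy and remove directory trees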
    import shutil
    import os

    SOURCE = "samples"
    BACKUP = "samples-bak"

    # create a backup directory
    shutil.copytree(SOURCE, BACKUP)

    print os.listdir(BACKUP)

    # remove it
    shutil.rmtree(BACKUP)

    print os.listdir(BACKUP)

    ['sample.wav', 'sample.jpg', 'sample.au', 'sample.msg', 'sample.tgz',
    ...
    Traceback (most recent call last):
      File "shutil-example-2.py", line 17, in ?
        print os.listdir(BACKUP)
    os.error: No such file or directory
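The same copy-and-remove cycle can be tried without touching real data. The sketch below is written for Python 3 and builds a throwaway tree under a temporary directory. The directory names mirror the listing above, while the nested folder and its sample.txt are invented for illustration.

```python
import os
import shutil
import tempfile

# Hypothetical directory layout, created only for this demonstration.
root = tempfile.mkdtemp()
source = os.path.join(root, "samples")
backup = os.path.join(root, "samples-bak")
os.makedirs(os.path.join(source, "nested"))
with open(os.path.join(source, "nested", "sample.txt"), "w") as handle:
    handle.write("hello\n")

# copytree creates the destination directory itself and copies
# everything below the source, including subdirectories.
shutil.copytree(source, backup)
copied = sorted(os.listdir(backup))
nested_copied = os.path.exists(os.path.join(backup, "nested", "sample.txt"))

# rmtree removes the directory and everything below it.
shutil.rmtree(backup)
exists_after_removal = os.path.exists(backup)

shutil.rmtree(root)
```

Checking os.path.exists after rmtree avoids the traceback that os.listdir produces on a missing directory.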
The tempfile module lets you quickly create uniquely named temporary files.

Using the tempfile module to create temporary files
    import tempfile
    import os

    tempfile = tempfile.mktemp()

    print "tempfile", "=>", tempfile

    file = open(tempfile, "w+b")
    file.write("*" * 1000)
    file.seek(0)
    print len(file.read()), "bytes"
    file.close()

    try:
        # must remove file when done
        os.remove(tempfile)
    except OSError:
        pass

    tempfile => C:\TEMP\~160-1
    1000 bytes
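mktemp only returns a name, so another process could create a file with that name before the script opens it. As a hedged alternative, the sketch below uses tempfile.mkstemp, which creates and opens the file in one step. It is written for Python 3 and is not part of the original listing.

```python
import os
import tempfile

# mkstemp returns an open OS-level file descriptor and the file's path.
fd, path = tempfile.mkstemp()
try:
    # Wrap the descriptor in a regular file object for reading and writing.
    with os.fdopen(fd, "w+b") as handle:
        handle.write(b"*" * 1000)
        handle.seek(0)
        size = len(handle.read())
finally:
    # As with mktemp, the caller must remove the file when done.
    os.remove(path)

still_exists = os.path.exists(path)
print(size, "bytes")
```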
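The TemporaryFile function picks a suitable file name automatically, opens the file, and makes sure the file is deleted when it is closed. (On Unix, you can delete a file that is still open, and it is removed for good once it is closed. On other platforms, this is done through a special wrapper class.)

Using the tempfile module to open a temporary file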
    import tempfile

    file = tempfile.TemporaryFile()

    for i in range(100):
        file.write("*" * 100)

    file.close() # removes the file!
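In Python 3 the same idea is usually written with a with statement, so the file is closed (and therefore removed) even if an error occurs. A minimal sketch, assuming binary mode, which is the default for TemporaryFile:

```python
import tempfile

with tempfile.TemporaryFile() as handle:
    for _ in range(100):
        handle.write(b"*" * 100)
    # Rewind and read back what was written before the file disappears.
    handle.seek(0)
    total = len(handle.read())

# Leaving the with block closes the file, which also removes it.
closed = handle.closed
print(total, "bytes written")
```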
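The StringIO module implements a file object that lives in memory (a memory file). It can be used in place of a standard file object in most places where one is expected.

Using the StringIO module to read from a memory file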
    import StringIO

    MESSAGE = "That man is depriving a village somewhere of a computer scientist."

    file = StringIO.StringIO(MESSAGE)

    print file.read()

    That man is depriving a village somewhere of a computer scientist.
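The StringIO class implements all the methods of the built-in file object, plus a getvalue method that returns its internal string value.

Using the StringIO module to write to a memory file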
    import StringIO

    file = StringIO.StringIO()
    file.write("This man is no ordinary man. ")
    file.write("This is Mr. F. G. Superman.")

    print file.getvalue()

    This man is no ordinary man. This is Mr. F. G. Superman.
Using the StringIO module to capture output

    import StringIO
    import string, sys

    stdout = sys.stdout

    sys.stdout = file = StringIO.StringIO()

    print """
    According to Gbaya folktales, trickery and guile
    are the best ways to defeat the python, king of
    snakes, which was hatched from a dragon at the
    world's start. -- National Geographic, May 1997
    """

    sys.stdout = stdout

    print string.upper(file.getvalue())

    ACCORDING TO GBAYA FOLKTALES, TRICKERY AND GUILE
    ARE THE BEST WAYS TO DEFEAT THE PYTHON, KING OF
    SNAKES, WHICH WAS HATCHED FROM A DRAGON AT THE
    WORLD'S START. -- NATIONAL GEOGRAPHIC, MAY 1997
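In Python 3 the memory file class lives in the io module, and contextlib.redirect_stdout restores sys.stdout automatically. The sketch below shows the same capture technique under those assumptions. The shortened message text is only for illustration.

```python
import contextlib
import io

buffer = io.StringIO()

# Everything printed inside the with block ends up in the memory file;
# sys.stdout is restored when the block exits, even after an exception.
with contextlib.redirect_stdout(buffer):
    print("According to Gbaya folktales, trickery and guile")
    print("are the best ways to defeat the python.")

captured = buffer.getvalue()
shouted = captured.upper()
print(shouted, end="")
```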
cStringIO is an optional module that provides a faster implementation of StringIO. It works much like StringIO, but it cannot be subclassed.

Using the cStringIO module
    import cStringIO

    MESSAGE = "That man is depriving a village somewhere of a computer scientist."

    file = cStringIO.StringIO(MESSAGE)

    print file.read()

    That man is depriving a village somewhere of a computer scientist.
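To make your code as fast as possible while staying compatible with older versions of Python, you can use a small trick to fall back on the StringIO module when cStringIO is not available.

Falling back on StringIO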
    try:
        import cStringIO
        StringIO = cStringIO
    except ImportError:
        import StringIO

    print StringIO

    <module 'cStringIO' (built-in)>
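The same fallback pattern can be stretched to cover Python 3, where neither module exists and the class is found in io instead. This is a sketch of one possible import chain, not something taken from the original text:

```python
try:
    from cStringIO import StringIO      # Python 2, C implementation
except ImportError:
    try:
        from StringIO import StringIO   # Python 2, pure Python
    except ImportError:
        from io import StringIO         # Python 3

# Whichever import succeeded, the class is used the same way.
buffer = StringIO()
buffer.write("fallback works")
value = buffer.getvalue()
print(value)
```

Importing the class rather than the module keeps the rest of the code identical no matter which branch was taken.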
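The mmap module provides an interface to the operating system's memory-mapping functions. A mapped region behaves much like a string object, but the data is read directly from the file.

Using the mmap module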
    import mmap
    import os

    filename = "samples/sample.txt"

    file = open(filename, "r+")
    size = os.path.getsize(filename)

    data = mmap.mmap(file.fileno(), size)

    # basics
    print data
    print len(data), size

    # use slicing to read from the file
    print repr(data[:10]), repr(data[:10])

    # or use the standard file interface
    print repr(data.read(10)), repr(data.read(10))

    <mmap object at 008A2A10>
    302 302
    'We will pe' 'We will pe'
    'We will pe' 'rhaps even'
On Windows, the file must be opened for both reading and writing (`r+`, `w+`, or `a+`), or the mmap call will fail.

Using string methods and regular expressions on a mapped region
    import mmap
    import os, string, re

    def mapfile(filename):
        file = open(filename, "r+")
        size = os.path.getsize(filename)
        return mmap.mmap(file.fileno(), size)

    data = mapfile("samples/sample.txt")

    # search
    index = data.find("small")
    print index, repr(data[index-5:index+15])

    # regular expressions work too!
    m = re.search("small", data)
    print m.start(), m.group()

    43 'only small\015\012modules '
    43 small
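In Python 3 a mapped region behaves like a bytes-like object rather than a string, so search arguments and patterns must be bytes. The sketch below repeats the slicing, reading, and searching steps under that assumption. The temporary sample file and its contents are created only for the demonstration.

```python
import mmap
import os
import re
import shutil
import tempfile

# Hypothetical sample file, created only for this demonstration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sample.txt")
with open(path, "wb") as handle:
    handle.write(b"We will perhaps eventually be writing only small\nmodules\n")
file_size = os.path.getsize(path)

with open(path, "r+b") as handle:
    # A length of 0 maps the whole file.
    with mmap.mmap(handle.fileno(), 0) as data:
        size = len(data)
        head = data[:10]                    # slicing returns bytes
        index = data.find(b"small")         # string-style search
        match = re.search(b"small", data)   # regular expressions work too
        match_start = match.start()
        first_read = data.read(10)          # file-style reads advance
        second_read = data.read(10)         # an internal position

shutil.rmtree(workdir)
print(size, index, match_start, first_read, second_read)
```

Slicing leaves the read position untouched, which is why the first read call still starts at the beginning of the file.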
==============================================================================