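Translated from: https://docs.python.org/2/library/index.html. I translate in my spare time, whenever I have the time, mood, ideas, and motivation. Some passages are translated freely or replaced with wording that is easier to understand. My skill is quite limited, so this is for my own reference only.
The formatting will be tidied up when time permits.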
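7. String Services
- 7.1. string: Common string operations
- 7.2. re: Regular expression operations
- 7.3. struct: Interpret strings as packed binary data
- 7.4. difflib: Helpers for computing differences between sequences
- 7.5. StringIO: Read and write strings as files
- 7.6. cStringIO: A faster version of StringIO
- 7.7. textwrap: Text wrapping and filling
- 7.8. codecs: Codec registry and base classes
- 7.9. unicodedata: Unicode database
- 7.10. stringprep: Internet string preparation
- 7.11. fpformat: Floating point conversions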
7.1. string: Common string operations
Source code: Lib/string.py
7.1.1. String constants
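- string.ascii_letters
-
The lowercase and uppercase ASCII letters (ascii_lowercase followed by ascii_uppercase). This value is not locale-dependent.
- string.ascii_lowercase
-
The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not locale-dependent and will not change.
- string.ascii_uppercase
-
The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not locale-dependent and will not change.
- string.digits
-
The string '0123456789'.
- string.hexdigits
-
The string '0123456789abcdefABCDEF'.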
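- string.letters
-
The lowercase and uppercase letters combined. The value is locale-dependent and is updated when locale.setlocale() is called.
- string.lowercase
-
A string containing all the characters that are considered lowercase letters. On most systems this is the string 'abcdefghijklmnopqrstuvwxyz'. The value is locale-dependent and is updated when locale.setlocale() is called.
- string.octdigits
-
The string '01234567'.
- string.punctuation
-
The string of ASCII characters that are considered punctuation characters in the C locale.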
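- string.printable
-
The string of characters that are considered printable. It is a combination of digits, letters, punctuation, and whitespace.
- string.uppercase
-
A string containing all the characters that are considered uppercase letters. On most systems this is the string 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. The value is locale-dependent and is updated when locale.setlocale() is called.
- string.whitespace
-
A string containing all characters that are considered whitespace. On most systems this includes the space, tab, linefeed, return, formfeed, and vertical tab characters.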
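The constants above are mostly used for membership tests and simple filtering. The following is a minimal sketch of that use; the helper names is_hex_string and strip_punctuation are made up for illustration. It sticks to the constants that are not locale-dependent, so it should run unchanged on Python 2.6+ and on Python 3 (where string.letters, string.lowercase and string.uppercase no longer exist).

```python
import string


def is_hex_string(text):
    """Return True if text is non-empty and consists only of hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)


def strip_punctuation(text):
    """Drop every character that appears in string.punctuation."""
    return "".join(ch for ch in text if ch not in string.punctuation)


print(string.ascii_lowercase)               # abcdefghijklmnopqrstuvwxyz
print(is_hex_string("1aF9"))                # True
print(is_hex_string("xyz"))                 # False
print(strip_punctuation("Hello, world!"))   # Hello world
```

For large inputs a set built once from the constant (for example set(string.punctuation)) would make the membership test cheaper; the plain string is kept here for brevity.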
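7.1.2. String formatting
New in version 2.6.
The built-in str and unicode classes provide the ability to do complex variable substitutions and value formatting via the str.format() method described in PEP 3101. The Formatter class in the string module lets you create and customize your own string formatting behavior, using the same implementation as the built-in format() method.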
- class string.Formatter
-
Public methods:
- vformat(format_string, args, kwargs)
-
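This function does the actual work of formatting. It is exposed as a separate function for cases where you want to pass in a predefined dictionary of arguments, rather than unpacking and repacking the dictionary as individual arguments using the *args and **kwargs syntax. vformat() breaks the format string up into literal character data and replacement fields, and substitutes the fields. It calls the various methods described below.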
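As a small illustration of the point about predefined arguments, the sketch below passes an existing tuple and dictionary straight to vformat(), and compares the result with format(), the Formatter method that takes the same arguments unpacked. The variable names and sample data are invented for the example.

```python
from string import Formatter

formatter = Formatter()
positional = ("Alice",)      # already-built positional arguments
named = {"count": 3}         # already-built keyword arguments

# vformat() takes the tuple and the dict as they are.
via_vformat = formatter.vformat("{0} has {count} messages", positional, named)

# format() expects them unpacked and repacks them internally.
via_format = formatter.format("{0} has {count} messages", *positional, **named)

print(via_vformat)   # Alice has 3 messages
```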
In addition, the Formatter class defines a number of methods that are intended to be replaced (reimplemented) by subclasses:
- parse(format_string)
-
Loop over the format_string and return an iterable of tuples (literal_text, field_name, format_spec, conversion). This is used by vformat() to break the string into either literal text, or replacement fields.
The values in the tuple conceptually represent a span of literal text followed by a single replacement field. If there is no literal text (which can happen if two replacement fields occur consecutively), then literal_text will be a zero-length string. If there is no replacement field, then the values of field_name, format_spec and conversion will be None.
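To make the tuple layout concrete, here is a short sketch that lists what parse() yields for two sample format strings (the strings themselves are arbitrary). The field name is returned as text and is not resolved at this stage.

```python
from string import Formatter

formatter = Formatter()

# One replacement field with a conversion and a format spec, then trailing text.
pieces = list(formatter.parse("Hello {0.name!r:>10}, bye"))
for literal_text, field_name, format_spec, conversion in pieces:
    print((literal_text, field_name, format_spec, conversion))
# ('Hello ', '0.name', '>10', 'r')
# (', bye', None, None, None)

# Two consecutive fields: the literal text in front of each one is empty.
adjacent = list(formatter.parse("{a}{b}"))
print(adjacent)
```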
- get_field(field_name, args, kwargs)
-
Given field_name as returned by parse() (see above), convert it to an object to be formatted. Returns a tuple (obj, used_key). The default version takes strings of the form defined in PEP 3101, such as “0[name]” or “label.title”. args and kwargs are as passed in to vformat(). The return value used_key has the same meaning as the key parameter to get_value().
- get_value(key, args, kwargs)
-
Retrieve a given field value. The key argument will be either an integer or a string. If it is an integer, it represents the index of the positional argument in args; if it is a string, then it represents a named argument in kwargs.
The args parameter is set to the list of positional arguments to vformat(), and the kwargs parameter is set to the dictionary of keyword arguments.
For compound field names, these functions are only called for the first component of the field name; subsequent components are handled through normal attribute and indexing operations.
So, for example, the field expression '0.name' would cause get_value() to be called with a key argument of 0. The name attribute is looked up after get_value() returns, by calling the built-in getattr() function.
If the index or keyword refers to an item that does not exist, then an IndexError or KeyError should be raised.
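One possible use of overriding get_value() is to supply a fallback for missing named fields instead of raising KeyError. The sketch below assumes a hypothetical subclass, DefaultingFormatter, with an invented missing parameter; positional lookups are left to the default implementation and still raise IndexError.

```python
from string import Formatter


class DefaultingFormatter(Formatter):
    """Hypothetical subclass: unknown named fields render as a placeholder."""

    def __init__(self, missing="<n/a>"):
        Formatter.__init__(self)
        self.missing = missing

    def get_value(self, key, args, kwargs):
        # String keys name keyword arguments; integer keys index into args.
        if isinstance(key, str) and key not in kwargs:
            return self.missing
        return Formatter.get_value(self, key, args, kwargs)


fmt = DefaultingFormatter()
result = fmt.format("{user} / {role}", user="bob")
print(result)   # bob / <n/a>
```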
- check_unused_args(used_args, args, kwargs)
-
Implement checking for unused arguments if desired. The arguments to this function are the set of all argument keys that were actually referred to in the format string (integers for positional arguments, and strings for named arguments), and a reference to the args and kwargs that were passed to vformat(). The set of unused args can be calculated from these parameters. check_unused_args() is assumed to raise an exception if the check fails.
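The calculation of unused arguments described above can be sketched as follows. StrictFormatter is a hypothetical subclass, and raising ValueError is a choice made for this example rather than something the library requires.

```python
from string import Formatter


class StrictFormatter(Formatter):
    """Hypothetical subclass that rejects arguments the format string never uses."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds ints for positional fields and strings for named ones.
        unused_positions = [i for i in range(len(args)) if i not in used_args]
        unused_names = sorted(name for name in kwargs if name not in used_args)
        if unused_positions or unused_names:
            raise ValueError(
                "unused arguments: positions %r, names %r"
                % (unused_positions, unused_names)
            )


strict = StrictFormatter()
print(strict.format("{0} {name}", "hi", name="there"))   # hi there

try:
    strict.format("{0}", "hi", "extra", name="there")
except ValueError as exc:
    print("rejected: %s" % exc)
```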
- format_field(value, format_spec)
-
format_field() simply calls the global format() built-in. The method is provided so that subclasses can override it.
- convert_field(value, conversion)
-
Converts the value (returned by get_field()) given a conversion type (as in the tuple returned by the parse() method). The default version understands the 's' (str) and 'r' (repr) conversion types; the 'a' (ascii) conversion type is additionally understood in Python 3 only.
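The two hooks above can be overridden together. In the sketch below, ShoutingFormatter is a hypothetical subclass: the '!u' conversion is invented for the example and is not a standard conversion type, and rendering None as a dash is likewise just an illustrative choice.

```python
from string import Formatter


class ShoutingFormatter(Formatter):
    """Hypothetical subclass with a custom '!u' conversion and a None placeholder."""

    def convert_field(self, value, conversion):
        # 'u' is an invented conversion for this sketch: upper-case the text.
        if conversion == "u":
            return str(value).upper()
        return Formatter.convert_field(self, value, conversion)

    def format_field(self, value, format_spec):
        # Render None as a dash, then let format() apply the format spec.
        if value is None:
            value = "-"
        return Formatter.format_field(self, value, format_spec)


fmt = ShoutingFormatter()
line = fmt.format("{0!u}|{1:>3}|{2!r}", "abc", None, "x")
print(line)   # ABC|  -|'x'
```

The conversion step runs before the formatting step, so format_field() receives the value already processed by convert_field().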