[Python: From Beginner to Giving Up] 6. Files and Exceptions (Part 2)

2022-06-14 09:25:25

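The test.txt file used in this chapter can be found in ([Python: From Beginner to Giving Up] 6. Files and Exceptions (Part 1)); create it yourself from there.

Now we have a requirement: modify the text in test.txt:

(1) Replace the ':' after each speaker's name with ' said:'

(2) Write the modified content back to the same file, overwriting it

1. Step-by-step analysis:

In ([Python: From Beginner to Giving Up] 6. Files and Exceptions (Part 1)) we mentioned that

when a file has to be processed line by line, an iterator can be used.

The content of test.txt is: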

Man:Is this the right room for an argument?
Other Man:I've told once.
Man:No you haven't!
Other Man:Yes I have.
Man:When?
Other Man:Just now.
Man:No you didn't
......
Other Man:Now let's get one thing quite clear:I most definitely told you!
Man:Oh no you did't!
Other Man:Oh yes I did!

Almost every line contains a ':' that needs to be processed,

so iterating over the file line by line fits the task.

Here is a first attempt:

ver 1:

the_file=open('f://test.txt',encoding='utf-8')

for line in the_file:

    (role,spoken)=line.split(':')
    print(role,' said:',spoken)

Result:
Man  said: Is this the right room for an argument?

Other Man  said: I've told once.

Man  said: No you haven't!

Other Man  said: Yes I have.

Man  said: When?

Other Man  said: Just now.

Man  said: No you didn't

Traceback (most recent call last):    # an exception is raised
  File "C:/Users/L/PycharmProjects/untitled3/1.py", line 6, in <module>
    (role,spoken)=line.split(':')
ValueError: not enough values to unpack (expected 2, got 1)

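The program raised an exception while running, which means something went wrong.

Let's analyze why.

First, compare the processed output with the original text. The last line printed was: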

Man  said: No you didn't

'''
1. "Man  said: No you didn't" and every line before it were processed successfully
2. The next line in the file is '......', so the error occurred while processing that line
'''

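In the code:

(role,spoken)=line.split(':') splits the string line, using the ':' passed to split() as the delimiter. split() finds every colon in line and cuts the string at each one. For example:

Take this line from the original file: Man:No you haven't!
When line holds this text,
split(':') cuts the string Man:No you haven't! at the ':' into two pieces.
The first piece is: Man
The second piece is: No you haven't!
So
(role,spoken)=line.split(':')
means:
assign the first piece to role
and the second piece to spoken.
The loop carries on like this, line by line.
When it reaches the
'......' line in the file,
there is no ':' in it, so split() has nowhere to cut and returns the whole line as a single piece.
One piece cannot be unpacked into two variables, so a ValueError is raised.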
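
To make the failure concrete, here is a minimal sketch that reproduces it without the file. The string '......' stands in for the line read from test.txt; the names parts and message are only for illustration.

```python
# A line with no colon: split() does not fail, it returns a one-element list.
parts = '......'.split(':')
print(parts)

# The error comes from unpacking one value into two names.
try:
    (role, spoken) = parts
except ValueError as err:
    message = str(err)
    print(message)
```

Note that split() itself succeeds here; it is the two-name assignment that raises the ValueError.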

If a line has no ':', nobody is speaking, so there is no need to add ' said:' or to call split(), and the error is avoided.

ver 2:

the_file=open('f://test.txt',encoding='utf-8')

for line in the_file:
    if not line.find(':')==-1:
        (role,spoken)=line.split(':')
        print(role,' said:',spoken)
    else:
        print(line)

Result:
Man  said: Is this the right room for an argument?

Other Man  said: I've told once.

Man  said: No you haven't!

Other Man  said: Yes I have.

Man  said: When?

Other Man  said: Just now.

Man  said: No you didn't

......

Traceback (most recent call last):
  File "C:/Users/L/PycharmProjects/untitled3/1.py", line 5, in <module>
    (role,spoken)=line.split(':')
ValueError: too many values to unpack (expected 2)

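'''
1. '......' is finally printed correctly, which shows that split() is not called for lines without a colon
2. find() is used as the condition: the code first calls find() on line to look for a colon; if there is none, the split is skipped, and if there is one, the line is processed as before
3. find() searches a string for a substring and returns its index, or -1 if it is not found
4. This avoids the exception caused by a missing colon, but a new exception seems to have appeared
'''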
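
As a quick illustration of the find() behaviour described in the notes, the sketch below calls it on one line with a colon and one without. The two sample strings are taken from test.txt; the variable names are made up for this example.

```python
line_with_colon = "Man:When?\n"
line_without_colon = "......\n"

# find() returns the index of the first match...
idx_found = line_with_colon.find(':')
# ...and -1 when the substring does not occur at all.
idx_missing = line_without_colon.find(':')

print(idx_found, idx_missing)
```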

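The new exception: ValueError: too many values to unpack (expected 2)

Let's look at the data in the original file:

Remaining data in the file:
......
Other Man:Now let's get one thing quite clear:I most definitely told you!
Man:Oh no you did't!
Other Man:Oh yes I did!

The '......' line is now handled correctly,
but the lines after it have not been processed yet.
Look at the next line:
' Other Man:Now let's get one thing quite clear:I most definitely told you! '
This line contains two colons,
so split(':') cuts it into three pieces,
while our assignment
(role,spoken)=line.split(':')
has only two variables to receive them.
Three values cannot be unpacked into two variables, so an exception is raised.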
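
The same thing can be checked in isolation. This is a small sketch using the two-colon line from test.txt; parts and message are illustrative names.

```python
line = "Other Man:Now let's get one thing quite clear:I most definitely told you!\n"

# Two colons produce three pieces.
parts = line.split(':')
print(len(parts))

# Three values do not fit into two names.
try:
    (role, spoken) = parts
except ValueError as err:
    message = str(err)
    print(message)
```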

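At this point we can see that real text does not always arrive in the ideal layout we expect.

In this article's code, the colon is the marker that separates the speaker from what was said,

but that does not mean every line contains exactly one colon.

A line may have 0, 2, 3 or more.

In ver 1 we ran into a line with no colon.

In ver 2 we worked around it by calling find() first, so that lines without a colon skip the split() statement.

Now we have hit a line with more than one colon, which raises an exception.

The fix:

Have split() cut only at the first colon, producing two pieces, and ignore any later colons however many there are:

 (role,spoken)=line.split(':',1) # the extra argument controls how split() breaks up the string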
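
A short sketch of what the second argument does, using the same two-colon line from test.txt. The second argument of str.split is the maximum number of splits to perform.

```python
line = "Other Man:Now let's get one thing quite clear:I most definitely told you!\n"

# At most one split: everything after the first colon stays together.
(role, spoken) = line.split(':', 1)
print(role)
print(spoken)

# A line with no colon still yields a single piece, so the find() check is still needed.
no_colon = '......\n'.split(':', 1)
print(no_colon)
```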

ver 3:

the_file=open('f://test.txt',encoding='utf-8')

for line in the_file:
    if not line.find(':')==-1:
        (role,spoken)=line.split(':',1)
        print(role,' said:',spoken)
    else:
        print(line)

Result:    # OK

Man  said: Is this the right room for an argument?

Other Man  said: I've told once.

Man  said: No you haven't!

Other Man  said: Yes I have.

Man  said: When?

Other Man  said: Just now.

Man  said: No you didn't

......

Other Man  said: Now let's get one thing quite clear:I most definitely told you!

Man  said: Oh no you did't!

Other Man  said: Oh yes I did!

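In this example, quirks in the data caused one exception after another.

Each time we patched the code for the specific case, but that cannot guarantee that some other quirk in the data will not cause another exception later.

So we need a sound exception-handling mechanism to guard against and handle exceptions that can occur at any time.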
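
One possible shape of such a mechanism, sketched here with Python's try/except rather than taken from this series' own code: attempt the split and fall back to the unchanged line if unpacking fails. The helper name add_said and the in-memory list lines are assumptions made for this example; they stand in for the file loop.

```python
# Sample lines standing in for the contents of test.txt.
lines = [
    "Man:When?\n",
    "......\n",
    "Other Man:Now let's get one thing quite clear:I most definitely told you!\n",
]

def add_said(line):
    """Hypothetical helper: insert ' said' before the first colon, if there is one."""
    try:
        (role, spoken) = line.split(':', 1)
        return role + ' said:' + spoken
    except ValueError:
        # No colon in this line, so unpacking failed: leave the line as it is.
        return line

result = [add_said(line) for line in lines]
for item in result:
    print(item, end='')
```

The design choice here is to try the operation and handle the failure, instead of testing for every special case in advance.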

To meet this chapter's requirement of writing the modified text back to the file, the code is revised once more. The complete code:

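the_file=open('f://test.txt','r',encoding='utf-8')

text=""
for line in the_file:
    if not line.find(':')==-1:
        (role,spoken)=line.split(':',1)
        # named new_line rather than str, to avoid shadowing the built-in str
        new_line=role+' said:'+spoken
    else:
        new_line=line
    # each line read from the file already ends with '\n', so none is added here
    text+=new_line

the_file.close()

# reopen in 'w' mode to overwrite the file with the modified text
the_file=open('f://test.txt','w',encoding='utf-8')
the_file.write(text)
the_file.close()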

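Some remaining concerns:

If the format of the data file changes, this code will break, and the condition will have to change with it.

The condition in the if statement is a little hard to read and understand.

The code is somewhat 'fragile': if another unexpected case turns up, it will break again.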
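
On the readability point, one option (a suggestion, not something the code above uses) is Python's in operator, which gives the same answer as the find() comparison. The sketch below checks both forms against a few sample strings; samples and checks are illustrative names.

```python
samples = ["Man:When?\n", "......\n", ":starts with a colon\n"]

# Pair up the original condition with the more readable one for each sample.
checks = [(not line.find(':') == -1, ':' in line) for line in samples]

for line, (original_test, readable_test) in zip(samples, checks):
    print(repr(line), original_test, readable_test)
```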
