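Fixing the error raised by from keras.preprocessing import sequence
Problem description
While studying natural language processing, I hit an error when testing the code that normalizes text length: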
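from keras.preprocessing import sequence

# cutlen: the shortest length that covers about 90% of the corpus,
# chosen from the sentence-length distribution found during data analysis.
# Here we assume cutlen is 10.
cutlen = 10

def padding(x_train):
    """
    description: normalize the length of the input text tensors
    :param x_train: tensor representation of the texts, e.g. [[1, 32, 32, 61], [2, 54, 21, 7, 19]]
    :return: the text tensors after truncation and padding
    """
    # sequence.pad_sequences does all of this
    return sequences(x_train, cutlen)

# Assume x_train holds two texts, one longer than 10 and one shorter than 10
x_train = [[1, 23, 5, 32, 55, 63, 2, 21, 78, 32, 23, 1],
           [2, 32, 1, 23, 1]]
res = padding(x_train)
print(res)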
2021-10-03 19:35:29.266282: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-10-03 19:35:29.266423: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "D:/Python/PycharmProjects/PyTorch/textPreprocessing/10.文本长度规范.py", line 2, in <module>
from keras.preprocessing import sequence
File "D:\Anaconda3\envs\PyTorch\lib\site-packages\keras\__init__.py", line 20, in <module>
from . import initializers
File "D:\Anaconda3\envs\PyTorch\lib\site-packages\keras\initializers\__init__.py", line 124, in <module>
populate_deserializable_objects()
File "D:\Anaconda3\envs\PyTorch\lib\site-packages\keras\initializers\__init__.py", line 49, in populate_deserializable_objects
LOCAL.GENERATED_WITH_V2 = tf.__internal__.tf2.enabled()
AttributeError: module 'tensorflow.compat.v2' has no attribute '__internal__'
After searching, this looked like a mismatch between the TensorFlow and Keras versions, but the error persisted after reinstalling both:
pip install tensorflow==2.1
pip install keras==2.3.1
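The AttributeError above comes from keras calling tf.__internal__, which the installed TensorFlow does not provide, so it is worth confirming which versions are actually present in the active environment after a reinstall. A minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and not part of either package:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print what the current interpreter can actually see.
for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```

Run it with the same interpreter that PyCharm uses for the project, since a different conda environment may hold different versions.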
After more searching on Baidu, I tried again.
This time I changed the import:
from tensorflow.keras.preprocessing import sequence
This raised a different error:
2021-10-03 19:37:39.192743: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-10-03 19:37:39.192896: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "D:/Python/PycharmProjects/PyTorch/textPreprocessing/10.文本长度规范.py", line 25, in <module>
res = padding(x_train)
File "D:/Python/PycharmProjects/PyTorch/textPreprocessing/10.文本长度规范.py", line 18, in padding
return sequences(x_train, cutlen)
NameError: name 'sequences' is not defined
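The import now succeeds, but the name sequences does not exist: the code imports the module sequence and then calls an undefined sequences, while the function it actually needs is sequence.pad_sequences.
Solution
After some reading, I found that pad_sequences can be imported directly: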
from tensorflow.keras.preprocessing.sequence import pad_sequences
The revised code:
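import os

# Suppress TensorFlow's INFO and WARNING log messages.
# This must be set before TensorFlow is imported.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

from tensorflow.keras.preprocessing.sequence import pad_sequences

# cutlen: the shortest length that covers about 90% of the corpus,
# chosen from the sentence-length distribution found during data analysis.
# Here we assume cutlen is 10.
cutlen = 10

def padding(x_train):
    """
    description: normalize the length of the input text tensors
    :param x_train: tensor representation of the texts, e.g. [[1, 32, 32, 61], [2, 54, 21, 7, 19]]
    :return: the text tensors after truncation and padding
    """
    # pad_sequences truncates longer sequences and pads shorter ones to maxlen
    return pad_sequences(x_train, maxlen=cutlen)

# Assume x_train holds two texts, one longer than 10 and one shorter than 10
x_train = [[1, 23, 5, 32, 55, 63, 2, 21, 78, 32, 23, 1],
           [2, 32, 1, 23, 1]]
res = padding(x_train)
print(res)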
[[ 5 32 55 63  2 21 78 32 23  1]
 [ 0  0  0  0  0  2 32  1 23  1]]
That solves the problem.
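The printed result reflects the defaults of pad_sequences: sequences longer than maxlen lose items from the front, and shorter ones are padded with zeros at the front. The pure-Python sketch below imitates only that default behavior so it can be checked without TensorFlow; pad_pre is a hypothetical helper written for illustration, not the library's implementation, and it returns plain lists rather than a NumPy array.

```python
def pad_pre(sequences, maxlen, value=0):
    """Truncate and pad each sequence at the front so that all have length maxlen."""
    if maxlen < 1:
        raise ValueError("maxlen must be at least 1")
    result = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]                    # keep only the last maxlen items
        padding = [value] * (maxlen - len(kept))      # zeros needed to reach maxlen
        result.append(padding + kept)
    return result


x_train = [[1, 23, 5, 32, 55, 63, 2, 21, 78, 32, 23, 1],
           [2, 32, 1, 23, 1]]
print(pad_pre(x_train, 10))
```

If padding or truncation at the end is preferred, pad_sequences accepts padding='post' and truncating='post'.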
Reference: https://www.cnblogs.com/qizhou/p/13179099.html
Keep going!
Thank you!
Keep working hard!