Project Overview
In Transfer Learning with tf.keras.applications in TensorFlow 2.0 we showed how to combine a transfer-learning backbone with custom classification layers. There is a problem, however: when you print the names of each layer of the combined model, you get the following:
import tensorflow as tf

mobile = tf.keras.applications.MobileNet(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
mobile.trainable = False

# Attach the custom classification layers
model = tf.keras.Sequential([
    mobile,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(3, activation='softmax')
])

for layer in model.layers:
    print(layer.name)
mobilenet_1.00_224
global_average_pooling2d
dropout
dense
Printing the layer names of mobile itself, however, gives:
for layer in mobile.layers:
    print(layer.name)
input_1
conv1_pad
conv1
conv1_bn
conv1_relu
conv_dw_1
conv_dw_1_bn
conv_dw_1_relu
conv_pw_1
conv_pw_1_bn
conv_pw_1_relu
conv_pad_2
conv_dw_2
conv_dw_2_bn
conv_dw_2_relu
conv_pw_2
conv_pw_2_bn
conv_pw_2_relu
conv_dw_3
conv_dw_3_bn
conv_dw_3_relu
conv_pw_3
conv_pw_3_bn
conv_pw_3_relu
conv_pad_4
conv_dw_4
conv_dw_4_bn
conv_dw_4_relu
conv_pw_4
conv_pw_4_bn
conv_pw_4_relu
conv_dw_5
conv_dw_5_bn
conv_dw_5_relu
conv_pw_5
conv_pw_5_bn
conv_pw_5_relu
conv_pad_6
conv_dw_6
conv_dw_6_bn
conv_dw_6_relu
conv_pw_6
conv_pw_6_bn
conv_pw_6_relu
conv_dw_7
conv_dw_7_bn
conv_dw_7_relu
conv_pw_7
conv_pw_7_bn
conv_pw_7_relu
conv_dw_8
conv_dw_8_bn
conv_dw_8_relu
conv_pw_8
conv_pw_8_bn
conv_pw_8_relu
conv_dw_9
conv_dw_9_bn
conv_dw_9_relu
conv_pw_9
conv_pw_9_bn
conv_pw_9_relu
conv_dw_10
conv_dw_10_bn
conv_dw_10_relu
conv_pw_10
conv_pw_10_bn
conv_pw_10_relu
conv_dw_11
conv_dw_11_bn
conv_dw_11_relu
conv_pw_11
conv_pw_11_bn
conv_pw_11_relu
conv_pad_12
conv_dw_12
conv_dw_12_bn
conv_dw_12_relu
conv_pw_12
conv_pw_12_bn
conv_pw_12_relu
conv_dw_13
conv_dw_13_bn
conv_dw_13_relu
conv_pw_13
conv_pw_13_bn
conv_pw_13_relu
Evidently, once tf.keras.Sequential combines the transfer-learning backbone with the classification layers, the whole backbone is treated as a single layer by default. That makes extracting individual layers from the combined model quite difficult, because what we would like to see when printing all the layer names of the combined model is:
(all of the MobileNet layer names listed above, input_1 through conv_pw_13_relu)
global_average_pooling2d
dropout
dense
Solution
We want to solve this by directly modifying the source file of the pretrained model in TensorFlow 2.0. The difficulties of this approach are:
- 1. How to substitute our custom classification layers for the ones in the source file;
- 2. If we keep the classification layers in tf.keras.applications (include_top=True), the imagenet weights cannot be loaded, because our custom classification layers have a different number of weights than those in the source file;
- 3. How to load the imagenet weights into the non-classification layers only.
We solve each of these in turn below.
1. Replacing the classification layers
This one is straightforward: we simply replace the classification layers in the source file with our own.
First, locate the source file on disk. With Anaconda, it is typically found under Anaconda3\envs\tensorflow2\Lib\site-packages\keras_applications.
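If the path is hard to find by hand, Python can report where the package is installed. A small stdlib-only sketch (the package name keras_applications matches the path above, but it may of course be absent in some environments):

```python
import importlib.util

# Look up the install location of the keras_applications package.
spec = importlib.util.find_spec('keras_applications')
if spec is not None:
    print(spec.origin)  # path to keras_applications/__init__.py
else:
    print('keras_applications is not installed in this environment')
```

The source file to edit, mobilenet.py, sits in the same directory as that __init__.py.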
Next, find where the classification layers are defined in the source file, as shown in the figure below, at line 243.
Then replace them with our custom classification layers:
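As a sketch (the exact original code differs between versions), the replacement inside the source file's if include_top: block could look like the fragment below. This is a fragment of mobilenet.py, not a standalone script: layers, classes, and the feature tensor x all come from the surrounding source file.

```python
# Fragment of MobileNet() in keras_applications/mobilenet.py: replace the
# original classification block (global pooling, reshape, 1x1 convolution
# and softmax) with the custom head used above. The layer names match the
# ones we want printed later.
if include_top:
    x = layers.GlobalAveragePooling2D(name='global_average_pooling2d')(x)
    x = layers.Dropout(0.5, name='dropout')(x)
    x = layers.Dense(classes, activation='softmax', name='dense')(x)
```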
2. Partially loading the weights
In the source file, we find a conditional statement at line 161:
It means that keeping the classification layers while loading the imagenet weights raises an error, so we comment this check out.
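The logic of that check can be sketched as a runnable snippet (function name and error message are paraphrased, not copied from the source file). With our configuration, include_top=True, weights='imagenet' and 3 classes, it always raises, which is why it has to be disabled:

```python
# Sketch of the validation logic found around line 161 of
# keras_applications/mobilenet.py (wording paraphrased).
def check_classes(weights, include_top, classes):
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top` '
                         'as true, `classes` should be 1000')

try:
    check_classes('imagenet', True, 3)   # our configuration: raises
except ValueError as err:
    print('raises:', err)

check_classes('imagenet', True, 1000)    # only classes == 1000 passes
check_classes('imagenet', False, 3)      # include_top=False also passes
```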
Next, lines 270 to 300 show how the model is built and how the weights are loaded:
Although we chose to keep the classification layers, the weight file we need is the one without classification-layer weights; otherwise the weight counts will not match and loading fails. We therefore change this section to:
Here, model_wi_weights consists of the non-classification layers, i.e. the layers into which the imagenet weights are loaded, while the model defined last is the transfer model we ultimately want.
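A sketch of what the modified section might look like (again a fragment of mobilenet.py, not a standalone script; backbone and weight_path = ... are placeholders for details shown only in the original post's screenshots, and model_wi_weights is the name used there):

```python
# Fragment of mobilenet.py after the edit. `backbone` stands for the last
# feature tensor (the output of conv_pw_13_relu); `x` is the output of the
# custom classification layers added in step 1.
if weights == 'imagenet':
    # model_wi_weights covers only the non-classification layers; loading
    # the *notop* weight file into it also sets the weights of the full
    # model below, because the two models share the same layer objects.
    model_wi_weights = models.Model(img_input, backbone)
    weight_path = ...  # the no-top imagenet weight file for this alpha/size
    model_wi_weights.load_weights(weight_path)

# Defined last: the transfer model that is actually returned.
model = models.Model(img_input, x, name='mobilenet_%0.2f_%s' % (alpha, rows))
```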
After that, save the file and build the transfer model again through tf.keras.applications, this time with include_top=True, weights='imagenet'. Printing all the layer names now gives:
mobile = tf.keras.applications.MobileNet(include_top=True, weights='imagenet', input_shape=(224, 224, 3))
for layer in mobile.layers:
    print(layer.name)
(all of the MobileNet layer names listed earlier, input_1 through conv_pw_13_relu)
global_average_pooling2d
dropout
dense