I. A Few Knowledge Points
1. The difference between convolutional neural networks and artificial neural networks
A traditional artificial neural network has only an input layer, hidden layers, and an output layer, with the number of hidden layers chosen as needed. It is built as: features -> values, where the features are picked by hand. A convolutional neural network adds a feature-learning stage on top of the original multi-layer network: partially connected convolutional layers and pooling layers are inserted in front of the original fully connected layers. It is built as: signal -> features -> values, where the features are selected by the network itself.
2. Basic components of a convolutional neural network
Convolutional layer: a convolution is the inner product of a local patch of the image with a convolution kernel; its purpose is to extract local features of the image. A convolution kernel, also called a filter, is essentially a group of neurons with fixed weights used to extract a particular feature; the extracted features are generally called a feature map. A convolutional layer is formed by stacking multiple filters.
Each filter has one set of weights. When a filter slides to a position, it computes the convolution there (an element-wise product over the patch), sums the result, and finally adds a bias, giving that filter's output at that position.
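As a minimal sketch of that computation at a single position (the 3x3 patch, kernel values, and bias below are made-up numbers, purely for illustration):
#conv_position_sketch.py - illustration only, not the Keras implementation
import numpy as np
patch = np.array([[0, 1, 2],
                  [2, 1, 0],
                  [1, 1, 1]], dtype = np.float32)   #local region of the image
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype = np.float32) #the filter's fixed weights
bias = 0.5
#element-wise multiply, sum, then add the bias
value = np.sum(patch * kernel) + bias
print(value)   #0.5 -- one entry of the feature map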
Activation function: already covered in the previous notes. A convolutional layer is followed by an activation function, and what comes out of the activation function is that layer's output.
Pooling layer: a pooling layer effectively shrinks the size of the parameter matrices, thereby reducing the number of parameters in the final fully connected layers. Common pooling methods are: average pooling, which takes the mean of an image region as that region's pooled value, and max pooling, which takes the maximum of an image region as that region's pooled value. A pooling layer is also followed by an activation function. A minimal sketch of max pooling is given below.
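Here is 2x2 max pooling with stride 2, again with made-up numbers:
#max_pooling_sketch.py - illustration only
import numpy as np
x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [3, 1, 4, 2]], dtype = np.float32)
#take the maximum over each non-overlapping 2x2 block
pooled = x.reshape(2, 2, 2, 2).max(axis = (1, 3))
print(pooled)   #[[6. 4.] [7. 9.]]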
The number of convolutional and pooling layers is chosen as needed.
Fully connected layer: the fully connected layers sit at the tail of the convolutional neural network, are connected in the same way as in an artificial neural network, and act as the "classifier".
3. Stochastic Gradient Descent (SGD) with a momentum parameter
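This is the optimizer used in the code below (SGD(lr = 0.01, momentum = 0.9)). As a minimal sketch of what the momentum parameter does, assuming the standard formulation (which is what Keras's SGD uses when nesterov is off): each parameter keeps a "velocity" that accumulates past gradients, which damps oscillation and speeds up progress along consistent gradient directions.
#sgd_momentum_sketch.py - standard update rule, illustration only
def sgd_momentum_step(w, grad, velocity, lr = 0.01, momentum = 0.9):
    velocity = momentum * velocity - lr * grad   #decaying sum of past gradients
    w = w + velocity                             #step along the velocity, not the raw gradient
    return w, velocity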
II. Building a Convolutional Neural Network Framework with TensorFlow and Keras
The code is as follows:
#Cnn_HandWritten.py
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD
#load train & test dataset
def load_dataset():
    (X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
    #reshape dataset to have a single channel
    X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
    X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
    #one hot representation
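    #e.g. label 3 becomes the 10-dim vector [0,0,0,1,0,0,0,0,0,0]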
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)
    return X_train, y_train, X_test, y_test
#scale pixels
def pre_pixels(train, test):
    #convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    #normalize inputs from 0-255 to 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    return train_norm, test_norm
#define a model
def define_model():
    model = Sequential()
    #add a convolutional layer
    model.add(Conv2D(8, (3,3), activation = 'relu', kernel_initializer = 'he_uniform', input_shape = (28,28,1)))
    #add a pooling layer
    model.add(MaxPooling2D((2,2)))
    #the output size of each layer = (input - kernel + 2 * padding) / stride + 1
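    #here: (28 - 3 + 0) / 1 + 1 = 26 after the conv layer, then 2x2 pooling
    #halves that to 13, so Flatten below sees 13 * 13 * 8 = 1352 values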
    model.add(Flatten())
    #add a hidden layer
    model.add(Dense(120, activation = 'relu', kernel_initializer = 'he_uniform'))
    #add an output layer
    model.add(Dense(10, activation = 'softmax'))
    #compile the model
    opt = SGD(lr = 0.01, momentum = 0.9)
    model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    print(model.summary())
    return model
#evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds = 5):
    scores, histories = list(), list()
    #prepare cross validation
    kfold = KFold(n_folds, shuffle = True, random_state = 1)
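    #with 60,000 MNIST training images and 5 folds, each split is 48,000 train / 12,000 validation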
    for train_ix, test_ix in kfold.split(dataX):
        model = define_model()
        train_x, train_y, test_x, test_y = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        history = model.fit(train_x, train_y, epochs = 10, batch_size = 60, validation_data = (test_x, test_y), verbose = 0)
        print(history.history.keys())
        #evaluate model
        _, acc = model.evaluate(test_x, test_y, verbose = 0)
        print('> acc: %.3f' % (acc * 100.0))
        #store scores
        scores.append(acc)
        histories.append(history)
    print('scores', scores)
    print('histories,len', len(histories))
    return scores, histories
'''
#plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        #plot loss
        pyplot.subplot(2, 1, 1)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color = 'blue', label = 'train')
        pyplot.plot(histories[i].history['val_loss'], color = 'orange', label = 'test')
        pyplot.ylabel('loss')
        pyplot.xlabel('epoch')
        pyplot.legend(['train', 'test'], loc = 'upper right')
        #plot accuracy
        pyplot.subplot(2, 1, 2)
        pyplot.title('Classification Accuracy')
        pyplot.plot(histories[i].history['accuracy'], color = 'blue', label = 'train')
        pyplot.plot(histories[i].history['val_accuracy'], color = 'orange', label = 'test')
        pyplot.ylabel('accuracy')
        pyplot.xlabel('epoch')
        pyplot.legend(['train', 'test'], loc = 'upper right')
    pyplot.show()
'''
#summarize performance of the model
def summarize_performance(scores):
    #print summary
    print('Accuracy: mean = %.3f std = %.3f, n=%d' % (mean(scores) * 100, std(scores) * 100, len(scores)))
#run the test harness for evaluating a model
def run_mymodel_test():
    #load dataset
    X_train, y_train, X_test, y_test = load_dataset()
    #scale pixels
    X_train, X_test = pre_pixels(X_train, X_test)
    #evaluate a model
    scores, histories = evaluate_model(X_train, y_train)
    #plot diagnostic learning curves
    #summarize_diagnostics(histories)
    #summarize performance of the model
    summarize_performance(scores)
#main entry point
run_mymodel_test()
Run results: (screenshot of the output omitted)
III. Problems Encountered and Solutions
The first time I ran this code, it threw the following error:
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
The fix offered online was to add the following code at the top of the program:
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
After I added it, the error changed to:
ValueError: Memory growth cannot differ between GPU devices
The fix offered online for that was to also add the following at the top of the program:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
After running again, the error above disappeared, and it turned back into the original error... I have no idea why I went around in a circle and ended up right back where I started.
I suddenly realized this approach wouldn't work, since everyone's versions and configurations can differ, so I read the error output carefully, and sure enough I found this line:
Loaded runtime CuDNN library: 7.1.4 but source was compiled with: 7.6.0. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Could the cuDNN versions be incompatible? I was about to look up how to upgrade cuDNN; I found instructions but didn't use them, because I found this post: [cudnn报错解决]Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.2.1. https://blog.csdn.net/jy1023408440/article/details/82887479 Luckily I didn't rush into upgrading cuDNN first; if that hadn't fixed it either, my whole day would have been gone. Since the main approach there is to downgrade TensorFlow, I simply switched to a virtual environment that already had TensorFlow 1.12.0, left cuDNN untouched, and the problem was completely solved.