Convolutional Neural Network Study Notes (Handwritten Digit Recognition as an Example)

I. A Few Key Concepts

1. How a convolutional neural network differs from a traditional artificial neural network

A traditional artificial neural network has only an input layer, hidden layers, and an output layer, with the number of hidden layers chosen as needed. It is built as a mapping from features to values, where the features are picked by hand. A convolutional neural network adds a feature-learning stage on top of this multi-layer network: partially connected convolutional and pooling layers are inserted in front of the original fully connected layers, so the pipeline becomes signal -> features -> values, with the features chosen by the network itself.
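To make the contrast concrete, here is a minimal sketch of the two kinds of network in Keras (layer sizes made up for illustration). The traditional network is fully connected from input to output and works on hand-flattened pixel vectors, while the CNN puts partially connected convolution and pooling layers in front:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D

#traditional artificial neural network: fully connected throughout,
#the features (flattened pixels) are prepared by hand
mlp = Sequential([
    Dense(128, activation = 'relu', input_shape = (784,)),
    Dense(10, activation = 'softmax'),
])

#convolutional neural network: convolution + pooling learn the features,
#then fully connected layers map the features to values
cnn = Sequential([
    Conv2D(8, (3,3), activation = 'relu', input_shape = (28,28,1)),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(128, activation = 'relu'),
    Dense(10, activation = 'softmax'),
])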

2. Basic building blocks of a convolutional neural network

Convolutional layer: a convolution is the inner product between a local patch of the image and a convolution kernel; its purpose is to extract local features of the image. A kernel, also called a filter, is essentially a group of neurons with a fixed set of weights that extracts one specific kind of feature; the extracted output is usually called a feature map. A convolutional layer is formed by stacking multiple filters.

Each filter has one set of weights. When a filter slides to a position, it multiplies its weights with the data under it, sums the products, and finally adds a bias, giving the filter's result at that position.
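A tiny worked example of that computation (all numbers made up):

import numpy as np

#3x3 patch of the image currently under the filter
patch = np.array([[0., 1., 2.],
                  [3., 4., 5.],
                  [6., 7., 8.]])
#the filter's fixed weights
weights = np.array([[1., 0., -1.],
                    [1., 0., -1.],
                    [1., 0., -1.]])
bias = 0.5

#element-wise product of patch and weights, summed, plus bias
value = np.sum(patch * weights) + bias
print(value)   #(0+3+6) - (2+5+8) + 0.5 = -5.5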

Activation function: as covered in the previous set of notes, a convolutional layer is followed by an activation function, and the result after the activation function is that layer's output.

Pooling layer: pooling effectively shrinks the parameter matrices, which reduces the number of parameters in the final fully connected layers. Common pooling methods are average pooling, which takes the mean of an image region as that region's pooled value, and max pooling, which takes the maximum of the region. A pooling layer may also be followed by an activation function.
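For example (made-up numbers), pooling a single 2x2 region both ways:

import numpy as np

region = np.array([[1., 3.],
                   [2., 8.]])   #one 2x2 region of a feature map

print(region.max())    #max pooling     -> 8.0
print(region.mean())   #average pooling -> 3.5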

The number of convolutional and pooling layers is chosen as needed.

Fully connected layer: the fully connected layers sit at the tail end of a convolutional neural network and are connected in the same way as in a traditional artificial neural network; they act as the "classifier".

3. The stochastic gradient descent (SGD) optimizer with momentum
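Plain SGD updates the weights using only the current gradient; adding momentum keeps a running "velocity" that accumulates past gradients, which damps oscillation and speeds things up along directions where the gradient is consistent. A sketch of one update step, following the formulation Keras documents for its SGD:

#v <- momentum * v - learning_rate * gradient
#w <- w + v
def sgd_momentum_step(w, v, grad, learning_rate = 0.01, momentum = 0.9):
    v = momentum * v - learning_rate * grad
    w = w + v
    return w, v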

II. Building a Convolutional Neural Network Framework with TensorFlow and Keras

The code is as follows:

#Cnn_HandWritten.py

from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD

#load train & test dataset
def load_dataset():
    (X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
    #reshape dataset to have a single channel
    X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
    X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))

    #one hot representation
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)
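    #resulting shapes: X_train (60000, 28, 28, 1), y_train (60000, 10)
    #                  X_test  (10000, 28, 28, 1), y_test  (10000, 10)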

    return X_train, y_train, X_test, y_test

#scale pixels
def pre_pixels(train, test):
    #convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')

    # normalize inputs from 0-255 to 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0

    return train_norm, test_norm

#define a model
def define_model():
    model = Sequential()
    #add a convolutional layer
    model.add(Conv2D(8, (3,3), activation = 'relu', kernel_initializer = 'he_uniform', input_shape = (28,28,1)))
    #add a pooling layer
    model.add(MaxPooling2D((2,2)))
    #the number of output for each layer = (input - kernel + 2 * padding) / stride + 1
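    #worked example for the layers above (padding = 0, stride = 1):
    #  Conv2D:     (28 - 3 + 2*0) / 1 + 1 = 26  -> 26 x 26 x 8
    #  MaxPooling: 26 / 2 = 13                  -> 13 x 13 x 8
    #  Flatten:    13 * 13 * 8 = 1352 values into the Dense layer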
    model.add(Flatten())
    #add a hidden layer
    model.add(Dense(120, activation = 'relu', kernel_initializer = 'he_uniform'))
    #add a output layer
    model.add(Dense(10, activation = 'softmax'))

    #compile the model
    opt = SGD(lr = 0.01, momentum = 0.9)   #'lr' is accepted in TF 1.x; newer versions use 'learning_rate'
    model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    model.summary()   #summary() prints itself; wrapping it in print() would also print 'None'
    return model

#evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds = 5):
    scores, histories = list(), list()
    #prepare cross validation
    kfold = KFold(n_folds, shuffle = True, random_state = 1)
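    #with n_folds = 5 on the 60000 training images, each fold trains on
    #48000 samples and validates on the held-out 12000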

    for train_ix, test_ix in kfold.split(dataX):
        model = define_model()
        train_x, train_y, test_x, test_y = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        history = model.fit(train_x, train_y, epochs = 10, batch_size = 60, validation_data = (test_x, test_y), verbose = 0)
        print(history.history.keys())
        #evaluate model
        _, acc = model.evaluate(test_x, test_y, verbose = 0)
        print('> acc: %.3f' % (acc * 100.0))
        #store scores
        scores.append(acc)
        histories.append(history)
    print('scores', scores)
    print('histories,len', len(histories))
    return scores, histories
'''
#plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        #plot loss
        pyplot.subplot(2,1,1)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color = 'blue', label = 'train')
        pyplot.plot(histories[i].history['val_loss'], color = 'orange', label = 'test')
        pyplot.ylabel('loss')
        pyplot.xlabel('epoch')
        pyplot.legend(['train', 'test'], loc = 'upper right')

        #plot accuracy
        pyplot.subplot(2, 1, 2)
        pyplot.title('Classification Accuracy')
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
        pyplot.ylabel('accuracy')
        pyplot.xlabel('epoch')
        pyplot.legend(['train', 'test'], loc='upper right')
    pyplot.show()
'''
#summarize performance of the model
def summarize_performance(scores):
    #print summary
    print('Accuracy: mean = %.3f std = %.3f, n=%d' % (mean(scores) * 100, std(scores) * 100, len(scores)))

#run the test harness for evaluating a model
def run_mymodel_test():
    #load dataset
    X_train, y_train, X_test, y_test = load_dataset()
    #scale pixels
    X_train, X_test = pre_pixels(X_train, X_test)
    #evaluate a model
    scores, histories = evaluate_model(X_train, y_train)
    # plot diagnostic learning curves
    #summarize_diagnostics(histories)
    # summarize performance of the model
    summarize_performance(scores)

#main entry point
run_mymodel_test()

Run results:

[Screenshots of the run output]

III. Problems Encountered and Solutions

When I first ran this code, it threw the following error:

tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

The fix suggested online is to add the following code at the top of the program:

physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)

After I added it, the error changed to:

ValueError: Memory growth cannot differ between GPU devices

The fix suggested online for this one is to additionally add the following at the top of the program:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

On running it again, the error above disappeared, but it turned back into the very first error... somehow I had gone around in a full circle and ended up right back where I started.

I realized this approach wasn't going to work, since everyone's versions and configuration may differ, so I went back and read the error output carefully. Sure enough, I found this line:

Loaded runtime CuDNN library: 7.1.4 but source was compiled with: 7.6.0.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

Could it be a cuDNN version mismatch? I went looking for how to upgrade cuDNN and found instructions, but I never used them, because I came across this post: [cudnn报错解决]Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.2.1 (https://blog.csdn.net/jy1023408440/article/details/82887479). Luckily I hadn't rushed into upgrading cuDNN first; if that hadn't fixed it either, my whole day would have been gone. Since the main remedy there is to downgrade TensorFlow, I simply switched to a virtual environment that already had TensorFlow 1.12.0.


cuDNN stayed as it was, and the problem was solved for good.
