I've recently been working through the UFLDL Tutorial, a set of exercises on unsupervised feature learning, and Andrew Ng really put a lot of care into it. Below I post my code so others can study and debug it; all of the code has been tested and runs in MATLAB.
Convolution and Pooling
This chapter uses a convolutional neural network for image classification. There are four classes: airplane, car, cat, and dog (see Figure 1). Each image is 64*64*3 (color). There are 2000 training images and 3200 test images. The convolutional network reaches an accuracy of about 80%, which is a pretty good result.
Figure 1
Writing the Code
1. cnnExercise.m — the main script. It trains the features and evaluates the results.
Step 0. Initialize parameters. Nothing to write.
Step 1. Train the sparse autoencoder. Code:
% Load the weights trained in the Linear Decoders with Autoencoders chapter
load STL10Features  % contains optTheta, ZCAWhite, meanPatch

Step 2a. Compute the convolved images. See cnnConvolve.m later in this post.
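The point of loading ZCAWhite and meanPatch here is that the preprocessing can be folded into the autoencoder weights before convolving: sigmoid(W*ZCAWhite*(x - meanPatch) + b) equals sigmoid(WT*x + (b - WT*meanPatch)) with WT = W*ZCAWhite, so the whitening never has to be applied patch by patch. A minimal numpy sketch of this identity, with toy dimensions rather than the tutorial's real sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, patch_size = 5, 12          # toy sizes, not the tutorial's dimensions
W = rng.standard_normal((num_features, patch_size))
ZCAWhite = rng.standard_normal((patch_size, patch_size))
meanPatch = rng.standard_normal(patch_size)
b = rng.standard_normal(num_features)
x = rng.standard_normal(patch_size)       # one raw (unpreprocessed) patch

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Direct form: whiten the mean-subtracted patch, then apply the autoencoder
direct = sigmoid(W @ ZCAWhite @ (x - meanPatch) + b)

# Folded form: WT absorbs the whitening; the mean term is absorbed into the bias
WT = W @ ZCAWhite
folded = sigmoid(WT @ x + (b - WT @ meanPatch))

print(np.allclose(direct, folded))  # the two activations agree
```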
Step 2b. Check cnnConvolve.m. Nothing to write.
Step 2c. Pooling. Pooling reduces dimensionality and makes the features robust to small image shifts. See cnnPool.m later in this post.
Step 2d. Check cnnPool.m. Nothing to write.
Step 3. Convolve and pool the training and test sets. Nothing to write. This took about half an hour on my i7 machine. The computed features are saved to disk as cnnPooledFeatures.mat, so later they can be loaded directly for classification instead of being recomputed.
Step 4. Train a softmax classifier on the training set. Nothing to write.
Step 5. Evaluate on the test set. Nothing to write. My accuracy was 78.56%. Since the weights are initialized randomly, the result may differ slightly from run to run.
2. cnnConvolve.m — computes the convolved images. Since the parts you have to write yourself are scattered, I'm posting the whole .m file. UFLDL provides the scaffolding; only a small part needs to be written by hand.
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

% Fold the whitening matrix into the weights, see UFLDL
WT = W * ZCAWhite;

% --------------------------------------------------------

for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:imageChannels

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      % Weights for the current featureNum and channel. size: 1*64
      WT_curr = WT(featureNum, (channel-1)*patchDim*patchDim+1 : channel*patchDim*patchDim);
      feature = reshape(WT_curr, patchDim, patchDim);  % size: 8*8
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));  % current channel of the current image

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      tmp = conv2(im, feature);  % full convolution
      convolvedImage = convolvedImage + tmp(patchDim:end-patchDim+1, patchDim:end-patchDim+1);  % trim the borders to keep the 'valid' part
      % ------------------------
    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = convolvedImage - WT(featureNum,:) * meanPatch + b(featureNum);  % bias correction, see UFLDL
    convolvedImage = sigmoid(convolvedImage);
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
  sigm = 1 ./ (1 + exp(-x));
end
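The feature flip above is easy to get wrong: conv2 implements true convolution, which flips its kernel, so flipping the feature first turns the operation back into the sliding-window cross-correlation we actually want, and trimming the full convolution leaves exactly the 'valid' region. A small numpy/scipy sketch of both facts, with toy sizes rather than the tutorial's 8*8 patches:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(1)
im = rng.standard_normal((6, 6))
feature = rng.standard_normal((3, 3))

# What we want: slide the (unflipped) feature over the image (cross-correlation)
out_dim = im.shape[0] - feature.shape[0] + 1
wanted = np.empty((out_dim, out_dim))
for r in range(out_dim):
    for c in range(out_dim):
        wanted[r, c] = np.sum(im[r:r+3, c:c+3] * feature)

# conv2-style: flip the kernel first, then convolve; the two flips cancel
flipped = np.flipud(np.fliplr(feature))
via_valid = convolve2d(im, flipped, mode='valid')

# Trimming a 'full' convolution gives the same 'valid' result, as in the MATLAB code
full = convolve2d(im, flipped, mode='full')
trimmed = full[2:-2, 2:-2]  # drop patchDim-1 = 2 border rows/cols on each side

print(np.allclose(wanted, via_valid), np.allclose(wanted, trimmed))
```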
3. cnnPool.m — the pooling step. The convolved images are 57*57, and the tutorial uses 19*19 pooling regions. This part is fairly easy. Code:
row = floor(convolvedDim / poolDim);
col = floor(convolvedDim / poolDim);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures
    for i1 = 1:row
      for j1 = 1:col
        % average over one poolDim x poolDim region
        tmpM = convolvedFeatures(featureNum, imageNum, (i1-1)*poolDim+1:i1*poolDim, (j1-1)*poolDim+1:j1*poolDim);
        pooledFeatures(featureNum, imageNum, i1, j1) = mean(mean(tmpM));
      end
    end
  end
end
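The same mean pooling can be written without loops. A numpy sketch of both versions (toy 4*4 input and 2*2 regions rather than the tutorial's 57*57 and 19*19), assuming the region size divides the input evenly:

```python
import numpy as np

conv = np.arange(16, dtype=float).reshape(4, 4)  # one convolved feature map
poolDim = 2
rows = conv.shape[0] // poolDim
cols = conv.shape[1] // poolDim

# Loop version, mirroring the MATLAB code
pooled_loop = np.empty((rows, cols))
for i in range(rows):
    for j in range(cols):
        region = conv[i*poolDim:(i+1)*poolDim, j*poolDim:(j+1)*poolDim]
        pooled_loop[i, j] = region.mean()

# Vectorized version: split into blocks with reshape, average over the block axes
pooled_vec = conv.reshape(rows, poolDim, cols, poolDim).mean(axis=(1, 3))

print(pooled_loop)  # [[ 2.5  4.5] [10.5 12.5]]
```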
4. RecognizeKQQ.m — a script I added myself. Since cnnExercise.m saves cnnPooledFeatures, the softmax training and testing can be run directly without the long feature computation. This script exists purely for my own testing convenience: just load a few files and copy Step 4 and Step 5 over unchanged. Code:
close all

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages, testImages, testLabels
load cnnPooledFeatures

%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;

% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.
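The permute/reshape pair above flattens each image's pooled features into one column for softmax. A numpy equivalent with toy dimensions, assuming the same (numFeatures, numImages, rows, cols) layout as pooledFeaturesTrain (note MATLAB's reshape is column-major, hence order='F'):

```python
import numpy as np

num_features, num_images, rows, cols = 3, 5, 2, 2  # toy sizes
pooled = np.random.default_rng(2).standard_normal(
    (num_features, num_images, rows, cols))

# MATLAB: permute(pooled, [1 3 4 2]) moves the image index last ...
moved = np.transpose(pooled, (0, 2, 3, 1))  # (features, rows, cols, images)

# ... and reshape(..., numel/numImages, numImages) flattens everything else
# into one column per image; order='F' matches MATLAB's column-major reshape
softmaxX = moved.reshape(-1, num_images, order='F')

# Each column now holds all pooled features of one image
print(softmaxX.shape)  # (12, 5)
```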
Summary
To wrap up, the network's structure is summarized in the figure below.
An accuracy of close to 80% is quite good!