UFLDL Learning Notes 7 (Working with Large Images)

I have recently been working through the UFLDL Tutorial, a set of tutorials on unsupervised feature learning. Andrew Ng clearly put a great deal of care into it. Below I post my code so that others can study and debug it. All code has been tested and runs in MATLAB.


Convolution and Pooling

This chapter uses a convolutional neural network for classification. There are four image classes: airplanes, cars, cats, and dogs (see Figure 1). Each image is 64*64*3 (color). There are 2000 training images and 3200 test images. The convolutional network reaches an accuracy of about 80%, which is quite a good result. A quick way to inspect the data is shown below.
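
If you want to poke at the data before running anything, the subsets load directly into MATLAB (the file and variable names are the ones the exercise itself uses, as seen later in this post):

load stlTrainSubset.mat   % numTrainImages, trainImages, trainLabels
load stlTestSubset.mat    % numTestImages, testImages, testLabels
size(trainImages)         % 64 x 64 x 3 x 2000
imagesc(trainImages(:, :, :, 1))   % view the first training image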

[Figure 1: sample images of the four classes: airplanes, cars, cats, and dogs]

Writing the Code

1. cnnExercise.m — the main script, covering feature training and result evaluation.

Step 0. Initialize parameters. Nothing to write here.

Step 1. Train the sparse autoencoder. Code:

% Load the weights trained in the Linear Decoders with Autoencoders chapter
load STL10Features  % contains optTheta, ZCAWhite, meanPatch
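
For reference, the provided cnnExercise.m then unpacks W and b from optTheta along these lines (a sketch; the sizes come from the earlier exercise, where patches are 8*8*3 and hiddenSize = 400):

imageChannels = 3;
patchDim = 8;
hiddenSize = 400;
visibleSize = patchDim * patchDim * imageChannels;   % 192
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2 * hiddenSize * visibleSize + 1 : 2 * hiddenSize * visibleSize + hiddenSize);
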
Step 2a. Compute the convolved images. See cnnConvolve.m, listed later in this post.

Step 2b. Check cnnConvolve.m. Nothing to write here.

Step 2c. Perform pooling, which reduces dimensionality and makes the features more robust to small image shifts; a tiny numerical example follows below. See cnnPool.m, listed later in this post.
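
For intuition, mean pooling just averages disjoint blocks of a feature map. A tiny hypothetical example with a 4*4 matrix and poolDim = 2:

A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16];
B = zeros(2, 2);
for i1 = 1:2
    for j1 = 1:2
        blk = A((i1-1)*2+1:i1*2, (j1-1)*2+1:j1*2);   % one disjoint 2x2 block
        B(i1, j1) = mean(blk(:));
    end
end
B   % [3.5 5.5; 11.5 13.5]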

Step 2d. Check cnnPool.m. Nothing to write here.

Step 3. Run convolution and pooling over the training and test sets. Nothing to write here; a sketch of what the provided code does is shown below. On my i7 machine this took roughly half an hour. The computed features are written to disk as cnnPooledFeatures.mat, so later you can load them directly for classification instead of recomputing them.
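
A minimal sketch of what the provided Step 3 code does (variable names such as stepSize, hiddenSize, and imageDim are assumptions based on the exercise setup): convolve and pool a batch of features at a time to bound memory use, then cache everything to disk.

stepSize = 50;   % features processed per batch; must divide hiddenSize
pooledDim = floor((imageDim - patchDim + 1) / poolDim);
pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, pooledDim, pooledDim);
for convPart = 1:(hiddenSize / stepSize)
    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);
    convolvedThis = cnnConvolve(patchDim, stepSize, trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = cnnPool(poolDim, convolvedThis);
end
% ... same loop for the test set, then cache to disk:
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');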

Step 4. Train a softmax classifier on the training set. Nothing to write here.

Step 5. Evaluate on the test set. Nothing to write here. My accuracy was 78.56%. Since the weights are randomly initialized, the result may differ slightly from run to run.


2. cnnConvolve.m — computes the convolved images. Since the parts you have to write are scattered through the file, I paste the whole .m file. UFLDL provides the scaffolding; only a few parts need to be filled in yourself.
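
A quick note on the preprocessing before reading the code: during training, every patch x was whitened as ZCAWhite * (x - meanPatch) before being fed to the autoencoder, so the hidden activation is

sigmoid( W * ZCAWhite * (x - meanPatch) + b ) = sigmoid( (W * ZCAWhite) * x - (W * ZCAWhite) * meanPatch + b )

Precomputing WT = W * ZCAWhite therefore lets the code convolve WT directly with the raw image, with the constant correction -WT * meanPatch + b added afterwards. Both steps appear below.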

function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the 
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1) 
%   matrix convolvedFeatures, such that 
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times: 
%   Convolving with 100 images should take less than 3 minutes 
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

% (Debug visualization left over from development; not needed for the convolution)
% subplot(247); imagesc(images(:,:,:,7))
% subplot(248); imagesc(images(:,:,:,8))

% Fold the ZCA whitening into the weights (see the preprocessing note above)
WT = W*ZCAWhite;

% --------------------------------------------------------

for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:3

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      % Weights of the current feature for the current channel (1 x patchDim^2)
      WT_curr = WT(featureNum, (channel-1)*patchDim*patchDim+1:channel*patchDim*patchDim);
      feature = reshape(WT_curr, patchDim, patchDim);   % patchDim x patchDim, i.e. 8*8

      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));
      
      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));    % current channel of the current image

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----

      tmp = conv2(im,feature);  % full convolution
      convolvedImage = convolvedImage + tmp(patchDim:end-patchDim+1, patchDim:end-patchDim+1);  % crop the border, equivalent to a 'valid' convolution
      
      % ------------------------
    end
    
    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    
    convolvedImage = convolvedImage - WT(featureNum,:)*meanPatch + b(featureNum);   % correct for mean subtraction and add the bias (see the note above)
    convolvedImage = sigmoid(convolvedImage);   % apply the sigmoid
    
    % ------------------------
    
    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
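
One detail worth verifying: taking the full conv2 output and cropping patchDim - 1 pixels off each border, as done above, is the same as asking conv2 for a 'valid' convolution directly. A quick standalone check with hypothetical random data:

im = rand(64, 64);
feature = rand(8, 8);
tmp = conv2(im, feature);               % full convolution: 71 x 71
cropped = tmp(8:end-7, 8:end-7);        % crop the border: 57 x 57
valid = conv2(im, feature, 'valid');    % same result in one call
max(abs(cropped(:) - valid(:)))         % 0 (identical up to floating point)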

3. cnnPool.m — performs the pooling. The convolved images are 57*57 (64 - 8 + 1 = 57), and the tutorial uses a pooling region of 19*19, so floor(57/19) = 3 and each feature map pools down to 3*3. This part is fairly easy. Code:

row = floor(convolvedDim / poolDim);
col = floor(convolvedDim / poolDim);

for imageNum = 1:numImages
    for featureNum = 1:numFeatures
        for i1 = 1:row
            for j1 = 1:col
                tmpM = convolvedFeatures(featureNum, imageNum, (i1-1)*poolDim+1:i1*poolDim, (j1-1)*poolDim+1:j1*poolDim);
                pooledFeatures(featureNum, imageNum, i1, j1) = mean(mean(tmpM));
            end
        end
    end
end

4. RecognizeKQQ.m — a script I added myself. Since cnnExercise.m writes out cnnPooledFeatures, you can train and test the softmax classifier directly, without the long recomputation. This script exists purely for my own convenience when testing: just load the saved data and copy Step 4 and Step 5 over unchanged. Code:

close all

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages,  testImages,  testLabels
load cnnPooledFeatures

%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
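% (permute moves the image index to the last dimension so the reshape below
% stacks all pooled features of one image into a single column)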
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.

Summary

To wrap up, let's summarize the structure of the network, shown in the figure below.

[Figure: network architecture diagram]

An accuracy of nearly 80% is a very respectable result!
