[MATLAB Deep Learning] Semantic Segmentation with DeepLab v3+

2024-03-08 11:54:34

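A semantic segmentation network classifies every pixel in an image, producing an image segmented by class. Applications include road segmentation for autonomous driving and cancer cell segmentation for medical diagnosis. This article shows how to train the DeepLab v3+ semantic segmentation network in MATLAB and use it to segment driving scenes.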

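This example trains on the CamVid dataset from the University of Cambridge, a collection of street-level images captured while driving. The dataset provides pixel-level labels for 32 semantic classes, including car, pedestrian, and road, as shown in the figure below: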

I. Download the Pretrained Model

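This example creates a DeepLab v3+ network whose weights are initialized from a pretrained ResNet-18 network. ResNet-18 is an efficient network that suits applications with limited processing resources. Depending on the application's requirements, other pretrained networks such as MobileNet v2 or ResNet-50 can be used instead.

Before using ResNet-18, open the Add-On Explorer and install Deep Learning Toolbox Model for ResNet-18 Network.

Once it is installed, download the pretrained DeepLab v3+ model, which is loaded later when training is skipped: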

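pretrainedFolder = 'pretrainedNetwork';
pretrainedNetwork = fullfile(pretrainedFolder,'deeplabv3plusResnet18CamVid.mat');
if ~exist(pretrainedNetwork,'file')
    % Create the target folder first; websave does not create missing folders.
    if ~exist(pretrainedFolder,'dir'), mkdir(pretrainedFolder); end
    disp('Downloading pretrained network (58 MB)...');
    pretrainedURL = 'https://www.mathworks.com/supportfiles/vision/data/deeplabv3plusResnet18CamVid.mat';
    websave(pretrainedNetwork, pretrainedURL);
end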

II. Prepare the Dataset

1. Download the dataset

This experiment uses the CamVid dataset. The code below downloads and unzips it:

imageURL = 'http://web4.cs.ucl.ac.uk/staff/g.brostow/MotionSegRecData/files/701_StillsRaw_full.zip';
labelURL = 'http://web4.cs.ucl.ac.uk/staff/g.brostow/MotionSegRecData/data/LabeledApproved_full.zip';
 
outputFolder = 'CamVid'; 
labelsZip = fullfile(outputFolder,'labels.zip');
imagesZip = fullfile(outputFolder,'images.zip');

if ~exist(outputFolder,'file')  
    mkdir(outputFolder); 
end

% Download and unzip the labels only if the archive is not already present.
if ~exist(labelsZip, 'file')
    disp('Downloading 16 MB CamVid dataset labels...');
    websave(labelsZip, labelURL);
    disp('Finished downloading CamVid dataset labels.');
    unzip(labelsZip, fullfile(outputFolder,'labels'));
    disp('Finished unzipping CamVid dataset labels.');
end

% Download and unzip the images only if the archive is not already present.
if ~exist(imagesZip,'file')
    disp('Downloading 557 MB CamVid dataset images...');
    websave(imagesZip, imageURL);
    disp('Finished downloading CamVid dataset images.');
    unzip(imagesZip, fullfile(outputFolder,'images'));
    disp('Finished unzipping CamVid dataset images.');
end

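2. Load the CamVid images

imgDir = fullfile(outputFolder,'images');
% IncludeSubfolders covers the case where the archive unpacks into a subfolder.
imds = imageDatastore(imgDir,'IncludeSubfolders',true);
% Display one sample image; histeq is applied only to make it easier to view.
Img = readimage(imds,559);
Img = histeq(Img);
imshow(Img)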

3. Create the image and pixel label datastores

classes = [
    "Sky"
    "Building"
    "Pole"
    "Road"
    "Pavement"
    "Tree"
    "SignSymbol"
    "Fence"
    "Car"
    "Pedestrian"
    "Bicyclist"
    ];

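% Map the original CamVid label IDs to the classes above.
% camvidPixelLabelIDs is a helper function from the MathWorks CamVid example, not a built-in; it must be on the path.
labelIDs = camvidPixelLabelIDs();
% Create the pixelLabelDatastore from the class names and label IDs.
labelDir = fullfile(outputFolder,'labels');
pxds = pixelLabelDatastore(labelDir,classes,labelIDs);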
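
To make the role of the label IDs concrete, here is a minimal sketch in base MATLAB of the kind of mapping a pixel label datastore performs: each pixel's RGB label color is matched against a per-class ID and converted to a categorical label. The 2-by-2 label image, the three classes, and the color triplets (`toyClasses`, `toyIDs`) are made up for illustration; a real CamVid mapping groups several original label colors into each class.

```matlab
% Toy 2-by-2 RGB label image (illustrative values, not read from CamVid files).
R = uint8([128 128;  64 0]);
G = uint8([128  64;   0 0]);
B = uint8([128 128; 128 0]);
labelRGB = cat(3, R, G, B);

% Hypothetical class names with one RGB label ID each.
toyClasses = ["Sky" "Road" "Car"];
toyIDs = {[128 128 128], [128 64 128], [64 0 128]};

% Match every pixel against each class ID and record the class index.
idx = zeros(size(labelRGB,1), size(labelRGB,2));
for k = 1:numel(toyIDs)
    id = reshape(uint8(toyIDs{k}), 1, 1, 3);
    mask = all(labelRGB == id, 3);   % true where all three channels match
    idx(mask) = k;
end

% Pixels that match no ID (index 0) become <undefined>.
L = categorical(idx, 1:numel(toyClasses), toyClasses);
disp(L)
```

The bottom-right pixel matches none of the IDs, so it ends up unlabeled, which is how pixels outside the chosen classes are excluded from training.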

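4. Split the dataset

% partitionCamVidData is a helper function (not a built-in) that splits the images and labels into training and validation sets.
[imdsTrain, imdsVal, pxdsTrain, pxdsVal] = partitionCamVidData(imds, pxds);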
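
The article does not list the body of `partitionCamVidData`, so the following is only a sketch of how such a split could work, under the assumption of a reproducible random split with a 60% training fraction (`trainFrac` is a made-up value). It operates on file indices only, so it runs without the dataset.

```matlab
% Sketch of a reproducible random train/validation split on file indices.
numFiles  = 701;    % CamVid contains 701 labeled still images
trainFrac = 0.6;    % assumed training fraction, not taken from the article

rng(0);                              % fix the seed so the split is repeatable
shuffled = randperm(numFiles);       % random ordering of the file indices
numTrain = round(trainFrac * numFiles);

trainIdx = shuffled(1:numTrain);     % indices of the training files
valIdx   = shuffled(numTrain+1:end); % the remaining files form the validation set

% With the real datastores, the same indices would select matching files:
%   imdsTrain = imageDatastore(imds.Files(trainIdx));
%   pxdsTrain = pixelLabelDatastore(pxds.Files(trainIdx), classes, labelIDs);
fprintf('%d training files, %d validation files\n', numel(trainIdx), numel(valIdx));
```

Using the same index vector for images and labels keeps each image paired with its own label file.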

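III. Create the DeepLab v3+ Network

1. Initialize the parameters

% Input image size: [height width channels]
imageSize = [720 960 3];

% Number of classes
numClasses = numel(classes);

2. Create the network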

% Create DeepLab v3+.
lgraph = deeplabv3plusLayers(imageSize, numClasses, "resnet18");

IV. Train the Network

1. Balance the classes with class weights

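% Count the labeled pixels per class, then weight each class by median frequency balancing.
tbl = countEachLabel(pxds);
imageFreq = tbl.PixelCount ./ tbl.ImagePixelCount;
classWeights = median(imageFreq) ./ imageFreq;

% Replace the network's output layer with one that applies the class weights.
pxLayer = pixelClassificationLayer('Name','labels','Classes',tbl.Name,'ClassWeights',classWeights);
lgraph = replaceLayer(lgraph,"classification",pxLayer);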
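
As a small worked example of the weighting formula used above, the sketch below applies it to made-up pixel statistics for three classes. The counts are invented for illustration and are not CamVid values; only the two formula lines mirror the article's code.

```matlab
% Made-up pixel statistics for three classes (not CamVid values).
name            = ["Road"; "Car"; "Pedestrian"];
pixelCount      = [400; 100; 50];       % pixels labeled with each class
imagePixelCount = [1000; 1000; 1000];   % total pixels in the images containing the class

% Same computation as in the article: class frequency, then median frequency balancing.
imageFreq    = pixelCount ./ imagePixelCount;    % [0.40; 0.10; 0.05]
classWeights = median(imageFreq) ./ imageFreq;   % [0.25; 1; 2]

disp(table(name, imageFreq, classWeights))
```

The frequent class (Road) receives a weight below 1 and the rare class (Pedestrian) a weight above 1, so errors on under-represented classes contribute more to the loss.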

2. Set the training options

% Pair images with their labels for validation and training.
dsVal = combine(imdsVal,pxdsVal);
dsTrain = combine(imdsTrain, pxdsTrain);

% Define training options.
options = trainingOptions('sgdm', ...
    'LearnRateSchedule','piecewise',...
    'LearnRateDropPeriod',10,...
    'LearnRateDropFactor',0.3,...
    'Momentum',0.9, ...
    'InitialLearnRate',1e-3, ...
    'L2Regularization',0.005, ...
    'ValidationData',dsVal,...
    'MaxEpochs',30, ...  
    'MiniBatchSize',8, ...
    'Shuffle','every-epoch', ...
    'CheckpointPath', tempdir, ...
    'VerboseFrequency',2,...
    'Plots','training-progress',...
    'ValidationPatience', 4);
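
To show what the piecewise schedule means in practice, the sketch below computes the learning rate for every epoch from the values in the options above. The variable names are illustrative, and the formula is a plain restatement of "multiply by the drop factor every drop period", not a call into the training code.

```matlab
% Values copied from the training options above.
initialLearnRate = 1e-3;
dropFactor       = 0.3;
dropPeriod       = 10;
maxEpochs        = 30;

% Piecewise schedule: multiply the rate by dropFactor after every dropPeriod epochs.
epochs    = 1:maxEpochs;
learnRate = initialLearnRate * dropFactor .^ floor((epochs - 1) / dropPeriod);

% Print the rate at the first epoch of each stage.
fprintf('epoch %2d: %.1e\n', [epochs([1 11 21]); learnRate([1 11 21])]);
```

With these settings the rate is 1e-3 for epochs 1 to 10, 3e-4 for epochs 11 to 20, and 9e-5 for epochs 21 to 30, unless ValidationPatience stops training earlier.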

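3. Start training

% Set doTraining to true to train the network yourself; otherwise the pretrained network is loaded.
doTraining = false;
if doTraining
    [net, info] = trainNetwork(dsTrain,lgraph,options);
else
    % Load the pretrained network downloaded earlier.
    data = load(fullfile('pretrainedNetwork','deeplabv3plusResnet18CamVid.mat'));
    net = data.net;
end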

V. Test the Segmentation

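% Segment one validation image.
I = readimage(imdsVal,35);
C = semanticseg(I, net);

% Overlay the predicted labels on the image.
% camvidColorMap and pixelLabelColorbar are helper functions, not built-ins.
cmap = camvidColorMap;
B = labeloverlay(I,C,'Colormap',cmap,'Transparency',0.4);
imshow(B);
pixelLabelColorbar(cmap, classes);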

The test result is shown below:
