Generally speaking, VGG16 corresponds to the configuration-D column of the architecture table in the VGG paper, while VGG19 corresponds to the configuration-E column (the rightmost one). For feature extraction with VGG, this project uses the first convolutional layer of each convolutional block.
import torch.nn as nn
from torchvision import models
from torchvision.models.vgg import VGG19_Weights


class VGGNet(nn.Module):
    def __init__(self):
        super(VGGNet, self).__init__()
        # Indices of the first convolutional layer in each of the five blocks
        self.select = ['0', '5', '10', '19', '28']
        # self.vgg = models.vgg19(pretrained=True).features  # older, deprecated way to load pretrained weights
        # .features keeps only the convolutional part of VGG19 (drops the classifier)
        self.vgg = models.vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features

    def forward(self, x):
        features = []
        for name, layer in self.vgg._modules.items():
            # name is the layer's index in the Sequential, layer is the module itself,
            # and x is the running activation: x = layer(x) feeds x through the layer
            # and stores the result back into x
            x = layer(x)
            if name in self.select:
                features.append(x)
        return features


net = VGGNet()
print(net)
Printing the net we just defined gives the following:
VGGNet(
  (vgg): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace=True)
    (18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (24): ReLU(inplace=True)
    (25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (26): ReLU(inplace=True)
    (27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (31): ReLU(inplace=True)
    (32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (33): ReLU(inplace=True)
    (34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (35): ReLU(inplace=True)
    (36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
)
Explanation
The feature-selection list
self.select = ['0', '5', '10', '19', '28']
On a first read of the code, this list can be confusing: doesn't VGG19 have only 19 layers? The "19" counts only the layers that carry parameters, i.e. the convolutional and fully connected layers, and excludes layers such as max pooling and softmax; there are 19 of those.
In the actual module sequence, however, the ReLU activations (and pooling layers) are also listed, which is why the printout above contains far more than 19 entries. With the corresponding indices we can therefore locate the first convolutional layer of each block, as the sketch below verifies.
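As a quick sanity check, one could count the parameterized layers of the full VGG19 and list every module index in net.vgg, marking the selected ones. This is only a minimal sketch reusing the net defined above; the names full_vgg19 and marker are just for illustration:
# VGG19's "19" counts only the parameterized layers:
# 16 Conv2d in .features plus 3 Linear in .classifier.
full_vgg19 = models.vgg19(weights=VGG19_Weights.IMAGENET1K_V1)
print(sum(isinstance(m, (nn.Conv2d, nn.Linear)) for m in full_vgg19.modules()))  # 19

# List every index in net.vgg and mark the selected ones; each selected index
# should line up with the first Conv2d of a block.
for name, layer in net.vgg._modules.items():
    marker = ' <-- selected' if name in net.select else ''
    print(f'{name}: {layer.__class__.__name__}{marker}')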
The forward function
The simplest way to describe it: x is fed through the network layer by layer, and whenever the current layer is the first convolution of a block, its output is appended to the list; what gets stored there are the image's features at that depth.
Let's run it and see:
import torch
import torch.nn as nn
from torchvision import models
from torchvision.models.vgg import VGG19_Weights


class VGGNet(nn.Module):
    def __init__(self):
        super(VGGNet, self).__init__()
        # Indices of the first convolutional layer in each of the five blocks
        self.select = ['0', '5', '10', '19', '28']
        # self.vgg = models.vgg19(pretrained=True).features  # older, deprecated way to load pretrained weights
        # .features keeps only the convolutional part of VGG19 (drops the classifier)
        self.vgg = models.vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features

    def forward(self, x):
        features = []
        for name, layer in self.vgg._modules.items():
            # name is the layer's index in the Sequential, layer is the module itself,
            # and x is the running activation: x = layer(x) feeds x through the layer
            # and stores the result back into x
            x = layer(x)
            if name in self.select:
                features.append(x)
        return features


net = VGGNet()
print(net)

input_tensor = torch.randn(1, 3, 256, 256)
output = net(input_tensor)
print(output)
As you can see, the list stores the extracted features as tensors.
So, by passing the image through this network, we have extracted its features.
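To make the result easier to read than the raw tensor printout, one can also print the shape of each extracted feature map, reusing the output from the run above:
# Each selected layer sits right after a max pool (except index 0), so for a
# 1 x 3 x 256 x 256 input the spatial size halves block by block while the
# channel count grows: 64, 128, 256, 512, 512.
for i, f in enumerate(output):
    print(i, f.shape)
# 0 torch.Size([1, 64, 256, 256])
# 1 torch.Size([1, 128, 128, 128])
# 2 torch.Size([1, 256, 64, 64])
# 3 torch.Size([1, 512, 32, 32])
# 4 torch.Size([1, 512, 16, 16])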