Visualizing CNN Convolutional Features

2023-11-06 16:22:28

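Preparation:
This walkthrough involves the following steps:
Import the required libraries
Create a CNN feature extractor; this post uses the pretrained resnet34 that ships with torchvision
Create an object that stores what the hooks capture
Register a hook on every convolutional layer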

The image to visualize (loaded later as cat.jpg):

Python libraries used

import numpy as np

import torch
import torchvision
from PIL import Image
from torchvision import transforms as T

import matplotlib.pyplot as plt

Creating the CNN feature extractor

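import torch
import torchvision

# Pick the device once and reuse it for both the model and the input
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# pretrained=True loads ImageNet weights; newer torchvision releases prefer the weights= argument
feature_extractor = torchvision.models.resnet34(pretrained=True)
feature_extractor = feature_extractor.to(device)
# Inference mode: BatchNorm uses its running statistics instead of batch statistics
feature_extractor.eval()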

Creating the object that stores hook outputs

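class SaveOutput:
    """Callable forward hook that collects the output of every hooked module."""
    def __init__(self):
        self.outputs = []

    def __call__(self, module, module_in, module_out):
        # PyTorch calls this after each hooked module's forward pass
        self.outputs.append(module_out)

    def clear(self):
        # Reset before processing another image
        self.outputs = []

save_output = SaveOutput()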

Registering hooks on the convolutional layers

hook_handles = []

for layer in feature_extractor.modules():
	if isinstance(layer, torch.nn.Conv2d):
		handle = layer.register_forward_hook(save_output)
		hook_handles.append(handle)
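
To see the hook mechanism in isolation, here is a minimal sketch on a tiny stand-in network. The two-layer `nn.Sequential` model and the `TinySaveOutput` name are illustrative assumptions, not part of the original post. It also shows one way to release the hooks with `handle.remove()` once the features are captured, so that later forward passes no longer append to the list.

```python
import torch
from torch import nn

class TinySaveOutput:
    """Hypothetical stand-in for SaveOutput: stores each hooked module's output."""
    def __init__(self):
        self.outputs = []

    def __call__(self, module, module_in, module_out):
        # detach() keeps the stored tensor out of the autograd graph
        self.outputs.append(module_out.detach())

# Illustrative two-convolution network; any nn.Module is hooked the same way
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
)
saver = TinySaveOutput()
handles = [m.register_forward_hook(saver)
           for m in model.modules() if isinstance(m, nn.Conv2d)]

with torch.no_grad():
    model(torch.rand(1, 3, 16, 16))
shapes = [tuple(o.shape) for o in saver.outputs]
print(shapes)  # one entry per Conv2d, in forward order

# Remove the hooks so further forward passes are not recorded
for h in handles:
    h.remove()
with torch.no_grad():
    model(torch.rand(1, 3, 16, 16))
count_after_removal = len(saver.outputs)
print(count_after_removal)
```

The same pattern should carry over to the `hook_handles` list above: iterate over it and call `remove()` on each handle when the hooks are no longer needed.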

Loading the image and extracting features

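from PIL import Image
from torchvision import transforms as T

# convert('RGB') guards against grayscale or RGBA files
image = Image.open('cat.jpg').convert('RGB')
transform = T.Compose([T.Resize((224, 224)), T.ToTensor()])
X = transform(image).unsqueeze(dim=0).to(device)

# No gradients are needed; the hooks fill save_output.outputs during this forward pass
with torch.no_grad():
    out = feature_extractor(X)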
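
The transform above only resizes the image and converts it to a tensor. torchvision's ImageNet-pretrained models are documented as expecting inputs normalized with a fixed per-channel mean and standard deviation, so a variant worth trying is to append `T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])` to the `T.Compose` list. The sketch below shows what that normalization computes, using NumPy and a random stand-in image; the `imagenet_normalize` helper is a hypothetical name, and the original post does not test whether this changes the look of the feature maps.

```python
import numpy as np

# Channel-wise statistics documented for torchvision's ImageNet-pretrained models
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

def imagenet_normalize(chw):
    """Normalize a (3, H, W) array with values in [0, 1], one channel at a time."""
    return (chw - IMAGENET_MEAN) / IMAGENET_STD

# Random stand-in for the (3, 224, 224) tensor produced by T.ToTensor()
fake_image = np.random.rand(3, 224, 224)
normalized = imagenet_normalize(fake_image)
print(normalized.shape)
print(normalized.mean(axis=(1, 2)))  # per-channel means after normalization
```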

Inspecting the convolutional feature extraction results

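There are 1+6+(4*2+1)+(6*2+1)+(3*2+1)=36 convolutional layers in total: one in conv1, 3*2=6 in conv2_x, and the remaining terms for conv3_x, conv4_x, and conv5_x. The conv3_x stage has 4*2+1 convolutional layers because (1) its four BasicBlocks contain 4*2 convolutional layers themselves, and (2) one of those BasicBlocks (the first) performs a downsample, which adds one more convolutional layer. The same reasoning gives 6*2+1 for conv4_x and 3*2+1 for conv5_x.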
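
The count can be reproduced with a few lines of plain Python. This is a sketch that assumes the standard ResNet-34 layout (3, 4, 6, and 3 BasicBlocks in conv2_x through conv5_x, two 3x3 convolutions per block, one extra downsample convolution in conv3_x through conv5_x) and that hooks fire in forward order. It also derives which zero-based positions in `save_output.outputs` belong to each stage.

```python
# Assumed standard ResNet-34 layout: BasicBlocks per stage, conv2_x .. conv5_x
blocks_per_stage = {"conv2_x": 3, "conv3_x": 4, "conv4_x": 6, "conv5_x": 3}
CONVS_PER_BLOCK = 2  # each BasicBlock has two 3x3 convolutions

index_ranges = {"conv1": (0, 0)}  # the stem contributes one convolution
next_index = 1
for stage, n_blocks in blocks_per_stage.items():
    # conv3_x .. conv5_x add one 1x1 downsample convolution in their first block
    n_convs = n_blocks * CONVS_PER_BLOCK + (0 if stage == "conv2_x" else 1)
    index_ranges[stage] = (next_index, next_index + n_convs - 1)
    next_index += n_convs

total_convs = next_index
for stage, (first, last) in index_ranges.items():
    print(f"{stage}: outputs[{first}] .. outputs[{last}]")
print("total:", total_convs)  # total: 36
```

Under these assumptions, index 15 is the last convolution of conv3_x and index 28 is the last convolution of conv4_x.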

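Which convolutional layers to visualize

For resnet34, we plan to visualize the 1st, 2nd, 7th, 16th, and 29th convolutional layers, which sit at indices 0, 1, 6, 15, and 28 of save_output.outputs.

The 1st convolutional layer (index 0) is conv1; the outline of the image is still fairly clear in its output
The 2nd and 7th convolutional layers (indices 1 and 6) are the first and last convolutional layers of conv2_x; comparing them with the 1st layer's output shows the features gradually becoming higher-level
The 16th convolutional layer (index 15) is the last convolutional layer of conv3_x
The 29th convolutional layer (index 28) is the last convolutional layer of conv4_x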

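Why not visualize the last convolutional layer?

Each channel of the last convolutional layer is only 7x7 pixels, so a visualization shows very little. Put another way, after visualizing the 29th convolutional layer (index 28), there is little to gain from going further.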
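
The 7x7 figure follows from the usual output-size formula for convolution and pooling, floor((n + 2p - k) / s) + 1. The sketch below applies it stage by stage, assuming the standard ResNet settings: a 7x7 stride-2 conv1 with padding 3, a 3x3 stride-2 max pool with padding 1, and a stride-2 3x3 convolution with padding 1 at the start of conv3_x, conv4_x, and conv5_x. The helper name `conv_out_size` is introduced here for illustration.

```python
def conv_out_size(n, kernel, stride, padding):
    """Output width of a convolution or pooling layer (floor division)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 224                                                 # input resolution used in this post
size = conv_out_size(size, kernel=7, stride=2, padding=3)  # conv1
sizes = {"conv1": size}
size = conv_out_size(size, kernel=3, stride=2, padding=1)  # max pool before conv2_x
sizes["conv2_x"] = size                                    # stride 1 throughout this stage
for stage in ("conv3_x", "conv4_x", "conv5_x"):
    # the first 3x3 convolution of each later stage uses stride 2
    size = conv_out_size(size, kernel=3, stride=2, padding=1)
    sizes[stage] = size
print(sizes)
# {'conv1': 112, 'conv2_x': 56, 'conv3_x': 28, 'conv4_x': 14, 'conv5_x': 7}
```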

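Extracting the outputs of the layers to visualize

The hooks saved every convolutional layer's output to save_output.outputs, so it should hold 36 results.
We also need a function that stitches the convolution results together. Each convolutional layer's output consists of many single-channel images (the first convolutional layer has 64 channels, hence 64 single-channel images), so we first tile these into one large single-channel image.

Quick check

Confirm that there are 36 results and inspect the shapes of the layers we plan to visualize: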

print(len(save_output.outputs))
a_list = [0, 1, 6, 15, 28, 35]
for i in a_list:
    print(save_output.outputs[i].cpu().detach().squeeze(0).shape)

Stitching function

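def grid_gray_image(imgs, each_row: int):
    '''
    Tile single-channel images into one large grayscale image.
    imgs shape: batch * size (e.g., 64x32x32, where 64 is the number of gray images and (32, 32) is the size of each)
    each_row: number of images per row; images that do not fill a complete row are dropped
    '''
    eps = 1e-8  # avoids division by zero when a channel is constant
    row_num = imgs.shape[0] // each_row
    ans = None
    for i in range(row_num):
        # Min-max normalize every channel to [0, 1] independently, then build one row
        img = imgs[i*each_row]
        img = (img - img.min()) / (img.max() - img.min() + eps)
        for j in range(1, each_row):
            tmp_img = imgs[i*each_row+j]
            tmp_img = (tmp_img - tmp_img.min()) / (tmp_img.max() - tmp_img.min() + eps)
            img = np.hstack((img, tmp_img))
        # Stack the finished row under the previous rows
        ans = img if ans is None else np.vstack((ans, img))
    return ans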
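
The same tiling can also be written without explicit loops. The following is a sketch of an alternative, not the author's implementation: `grid_gray_image_vectorized` is a hypothetical name, and the small epsilon in the denominator is an added assumption to guard against constant channels. It normalizes every image at once and then rearranges the axes with `reshape` and `transpose`.

```python
import numpy as np

def grid_gray_image_vectorized(imgs, each_row):
    """Tile an (N, H, W) array into a (rows*H, each_row*W) image without loops."""
    n, h, w = imgs.shape
    rows = n // each_row
    # Drop images that do not fill a complete row
    imgs = imgs[: rows * each_row].astype(np.float64)
    # Per-image min-max normalization; eps guards against constant images
    mins = imgs.min(axis=(1, 2), keepdims=True)
    maxs = imgs.max(axis=(1, 2), keepdims=True)
    imgs = (imgs - mins) / (maxs - mins + 1e-8)
    # (rows, each_row, H, W) -> (rows, H, each_row, W) -> 2-D grid
    grid = imgs.reshape(rows, each_row, h, w).transpose(0, 2, 1, 3)
    return grid.reshape(rows * h, each_row * w)

demo = np.random.rand(64, 32, 32)   # stand-in for a 64-channel feature map
tiled = grid_gray_image_vectorized(demo, 8)
print(tiled.shape)  # (256, 256)
```

The image at position `i*each_row + j` ends up in grid row `i` and grid column `j`, which is the same layout the loop-based version produces.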

Building the tiled images for the selected layers

# Indices into save_output.outputs are zero-based: index 0 is the 1st convolutional layer
img0 = save_output.outputs[0].cpu().detach().squeeze(0)
img0 = grid_gray_image(img0.numpy(), 8)
img1 = save_output.outputs[1].cpu().detach().squeeze(0)
img1 = grid_gray_image(img1.numpy(), 8)
img6 = save_output.outputs[6].cpu().detach().squeeze(0)
img6 = grid_gray_image(img6.numpy(), 8)
# The deeper layers have more channels, so they are tiled 16 per row instead of 8
img15 = save_output.outputs[15].cpu().detach().squeeze(0)
img15 = grid_gray_image(img15.numpy(), 16)
# Index 28 is the 29th convolutional layer, hence the name img29
img29 = save_output.outputs[28].cpu().detach().squeeze(0)
img29 = grid_gray_image(img29.numpy(), 16)

Visualizing img1 (index 1, the 2nd convolutional layer)

plt.figure(figsize=(15, 15))
plt.imshow(img1, cmap='gray')

Visualizing img29 (index 28, the 29th convolutional layer)

plt.figure(figsize=(15, 15))
plt.imshow(img29, cmap='gray')
