PyTorch official site: https://pytorch.org
This section follows the official tutorial: Training a Classifier (PyTorch Tutorials 1.8.1+cu102 documentation)
The network used is LeNet:
1. model.py
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 5)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, 5)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(32 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))    # input (3, 32, 32) -> output (16, 28, 28)
        x = self.pool1(x)            # output (16, 14, 14)
        x = F.relu(self.conv2(x))    # output (32, 10, 10)
        x = self.pool2(x)            # output (32, 5, 5)
        x = x.view(-1, 32 * 5 * 5)   # output (32 * 5 * 5,)
        x = F.relu(self.fc1(x))      # output (120,)
        x = F.relu(self.fc2(x))      # output (84,)
        x = self.fc3(x)              # output (10,)
        return x
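Building a model in PyTorch: define a class that inherits from nn.Module. The class has two methods:
- Initialization function [__init__()]: defines the layers that make up the network
super(ClassName, self).__init__(): why super is used:
The class inherits from nn.Module, so the parent constructor must run before any layers are assigned; super also resolves the method lookup problems that can arise with multiple inheritance. Here ClassName is LeNet.
Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros'):
in_channels is the depth of the input (3 for RGB); out_channels is the depth of the output, i.e. the number of convolution kernels; kernel_size is the kernel size.
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False):
kernel_size is the size of the pooling window, 2 here; stride is the step, 2 here, and defaults to kernel_size.
- Forward function [forward()]: implements the forward pass
x.view(): flattens each 32*5*5 feature map into a one-dimensional vector so the convolutional output can be fed to the fully connected layers; the -1 lets PyTorch infer the batch dimension.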
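The layer sizes used in model.py can be checked with the usual output-size formula, out = (in - kernel + 2 * padding) / stride + 1. The following is a minimal sketch, assuming a 32x32 RGB input as in CIFAR-10; the helper name conv_out and the dummy batch are illustrative and not part of the tutorial.

```python
import torch
import torch.nn as nn


def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 32
side = conv_out(side, kernel=5)            # conv1 -> 28
side = conv_out(side, kernel=2, stride=2)  # pool1 -> 14
side = conv_out(side, kernel=5)            # conv2 -> 10
side = conv_out(side, kernel=2, stride=2)  # pool2 -> 5

# Cross-check against real layers using a dummy batch of 4 images.
x = torch.zeros(4, 3, 32, 32)
x = nn.MaxPool2d(2, 2)(nn.Conv2d(3, 16, 5)(x))
x = nn.MaxPool2d(2, 2)(nn.Conv2d(16, 32, 5)(x))
feature_shape = tuple(x.shape)                     # (4, 32, 5, 5)
flat_shape = tuple(x.view(-1, 32 * 5 * 5).shape)   # (4, 800)
print(side, feature_shape, flat_shape)
```

This is why fc1 takes 32*5*5 input features: 32 channels of 5x5 remain after the second pooling layer.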
2. train.py
import torch
import torch.nn as nn
import torchvision
from model import LeNet
import torch.optim as optim
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# 50000 training images; set download=True on the first run to fetch them
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=False, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=36,
                                          shuffle=True, num_workers=0)

# 10000 test images, loaded as a single batch
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=10000,
                                         shuffle=False, num_workers=0)
test_data_iter = iter(testloader)
test_image, test_label = next(test_data_iter)

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Optional: view the first 4 test images
# def imshow(img):
#     img = img / 2 + 0.5  # unnormalize
#     npimg = img.numpy()
#     plt.imshow(np.transpose(npimg, (1, 2, 0)))
#     plt.show()
#
# # print labels
# print(' '.join('%5s' % classes[test_label[j]] for j in range(4)))
# # show images
# imshow(torchvision.utils.make_grid(test_image[:4]))

# instantiate the model
net = LeNet()
# define the loss function and the optimizer
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(5):
    running_loss = 0.0
    for step, data in enumerate(trainloader, start=0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if step % 500 == 499:  # print every 500 mini-batches
            with torch.no_grad():
                outputs = net(test_image)  # [batch, 10]
                predict_y = torch.max(outputs, dim=1)[1]
                accuracy = torch.eq(predict_y, test_label).sum().item() / test_label.size(0)

                print('[%d, %5d] train_loss: %.3f  test_accuracy: %.3f' %
                      (epoch + 1, step + 1, running_loss / 500, accuracy))
                running_loss = 0.0

print('Finished Training')

save_path = './Lenet.pth'
torch.save(net.state_dict(), save_path)
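ToTensor(): "Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor". For an 8-bit image it converts shape (H, W, C) with values in [0, 255] to a float tensor of shape (C, H, W) with values in [0.0, 1.0].
Normalize(): normalizes a tensor channel by channel with the given mean and standard deviation, output = (input - mean) / std; with mean 0.5 and std 0.5, values in [0, 1] are mapped to [-1, 1].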
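To make the normalization arithmetic concrete, here is a small sketch that reproduces the per-channel formula by hand with plain tensor operations. The 2x2 "image" is made up for illustration, and the code deliberately does not call torchvision; it only mirrors the formula output = (input - mean) / std.

```python
import torch

# A fake 3-channel 2x2 "image" already scaled to [0, 1], as ToTensor would produce.
img = torch.tensor([0.0, 0.25, 0.5, 1.0]).repeat(3, 1).view(3, 2, 2)

mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

normalized = (img - mean) / std      # same per-channel formula as Normalize
restored = normalized * std + mean   # the inverse, i.e. img / 2 + 0.5

print(normalized[0])
```

The inverse step is the same "unnormalize" operation (img / 2 + 0.5) that the commented-out imshow helper in train.py applies before plotting.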
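torch.utils.data.DataLoader(trainset, batch_size=36, shuffle=True, num_workers=0):
batch_size is the number of images in each batch, 36 here; shuffle controls whether the dataset is reshuffled every epoch; num_workers is the number of subprocesses used to load data (0 loads in the main process).
iter() turns the DataLoader into an iterator, so next() can then pull out a batch of images and their labels.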
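The batching behaviour can be tried without downloading CIFAR-10. The sketch below assumes a TensorDataset of random tensors as a stand-in for the real dataset; the sizes are chosen only to mirror batch_size=36.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 random "images" with random labels in place of CIFAR-10 (assumption).
images = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=36, shuffle=True, num_workers=0)

# iter() builds an iterator; next() returns one [inputs, labels] batch.
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape, batch_labels.shape)

# 100 samples at 36 per batch -> batches of 36, 36 and 28.
batch_sizes = [x.size(0) for x, _ in loader]
print(batch_sizes)
```

Calling the built-in next(iterator) is the portable form; as far as I know, the iterator's own .next() method is no longer available in recent PyTorch releases.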
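transpose: np.transpose(npimg, (1, 2, 0)) reorders the axes from (C, H, W), the layout PyTorch uses, to (H, W, C), the layout plt.imshow expects.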
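torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean'): "This criterion combines LogSoftmax and NLLLoss in one single class." The loss function already includes softmax, so the network outputs raw scores (logits) and has no softmax layer of its own.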
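The quoted statement can be verified numerically. This is a minimal sketch with made-up logits and targets, comparing CrossEntropyLoss on raw scores against NLLLoss applied to log_softmax of the same scores.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # raw network outputs for 4 samples
target = torch.tensor([3, 0, 9, 1])    # made-up class indices

ce = nn.CrossEntropyLoss()(logits, target)
manual = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

print(ce.item(), manual.item())
```

Both values agree, which is why LeNet's forward() returns the output of fc3 directly.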
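optimizer.zero_grad(): clears the gradients accumulated from previous steps (the gradients, not the loss). PyTorch adds each backward() result to the existing .grad, so the gradients must be cleared before every batch.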
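The accumulation behaviour is easy to see on a toy parameter. The sketch below uses a two-element tensor and an SGD optimizer purely for illustration; neither appears in the tutorial.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# Two backward passes without clearing: gradients add up.
(w * 3).sum().backward()
(w * 3).sum().backward()
accumulated = w.grad.clone()   # tensor([6., 6.])

# Clearing first gives the gradient of a single pass.
optimizer.zero_grad()
(w * 3).sum().backward()
fresh = w.grad.clone()         # tensor([3., 3.])

print(accumulated, fresh)
```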
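with torch.no_grad(): with is Python's syntax for using a context manager, and torch.no_grad() is a context manager; operations wrapped in this block do not track gradients, which saves memory and time during evaluation.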
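The evaluation step in the training loop combines two ideas: running the forward pass without gradient tracking, and turning scores into an accuracy. The sketch below illustrates both, assuming a small nn.Linear as a stand-in for the trained network and hand-written scores for the accuracy part.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)   # stand-in for the trained network
x = torch.randn(5, 4)

tracked = model(x)                 # normal forward pass builds a graph
with torch.no_grad():
    untracked = model(x)           # same values, no graph

print(tracked.requires_grad, untracked.requires_grad)

# Accuracy from fixed scores: argmax per row, compare, average.
scores = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.2, 0.4]])
labels = torch.tensor([0, 1, 2])
predict_y = torch.max(scores, dim=1)[1]     # tensor([0, 1, 0])
accuracy = torch.eq(predict_y, labels).sum().item() / labels.size(0)
print(accuracy)
```

torch.max(scores, dim=1) returns a (values, indices) pair, so index [1] selects the predicted class of each row; here two of three predictions match the labels.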
3. predict.py
import torch
import torchvision.transforms as transforms
from PIL import Image
from model import LeNet

transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

net = LeNet()
net.load_state_dict(torch.load('Lenet.pth'))

im = Image.open('1.jpg')
im = transform(im)  # [C, H, W]
im = torch.unsqueeze(im, dim=0)  # add a batch dimension: [N, C, H, W]

with torch.no_grad():
    outputs = net(im)
    predict = torch.max(outputs, dim=1)[1]
print(classes[int(predict)])
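Two steps in predict.py are worth isolating: adding the batch dimension with torch.unsqueeze, and reading off the predicted class. The sketch below uses a zero tensor in place of a real image and made-up scores in place of net(im), and additionally applies softmax to report a confidence, which the script above does not do.

```python
import torch

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

im = torch.zeros(3, 32, 32)            # one preprocessed image, [C, H, W]
batch = torch.unsqueeze(im, dim=0)     # [1, C, H, W], the shape the network expects

# Made-up scores standing in for net(batch); shape [1, 10].
outputs = torch.tensor([[0.1, 0.2, 0.0, 2.5, 0.3, 1.0, 0.1, 0.0, 0.2, 0.1]])
probs = torch.softmax(outputs, dim=1)
confidence, index = torch.max(probs, dim=1)

print(classes[int(index)], round(confidence.item(), 3))
```

Softmax does not change which class has the highest score, so the predicted label is the same with or without it; it only turns the scores into probabilities that sum to 1.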