The previous section implemented linear regression by hand, processing one sample at a time. This time we use torch's built-in operations to process the data in batches.
Getting the data
To keep the analysis simple, we simulate some data with y = 5x + 6.
import torch as tt
from IPython import display
from matplotlib import pyplot as plt
import numpy as np
import random
num_inputs = 1  # number of features; we are fitting univariate linear regression, so there is only one
num_examples = 2000  # number of samples
true_w = 5
true_b = 6
features = tt.from_numpy(np.random.normal(0,1,(num_examples, num_inputs)))
labels = true_w * features[:,0] + true_b
features.shape , labels.shape
(torch.Size([2000, 1]), torch.Size([2000]))
Add some noise to the labels to act as interference.
labels +=tt.from_numpy(np.random.normal(0,1, size = labels.size()))
Each row of features is a vector of length 1, because our weight w is a single number, while each entry of labels is a scalar.
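A quick check of one sample (assuming the cells above have been run) confirms this layout:
print(features[0], labels[0])  # one feature row and its label
print(features.dtype, labels.dtype)  # both torch.float64, since the tensors come from numpy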
plt.plot([-3, 3], [true_w * -3 + true_b, true_w * 3 + true_b])  # the true line
plt.scatter(features[:, 0].numpy(), labels.numpy(), 1, c='#ff0000')  # the points to fit
(Figure output_7_1.png: the noisy sample points scattered around the true line y = 5x + 6)
Initializing the weights
w is a 1x1 matrix of feature weights, and b is a single scalar.
w = tt.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=tt.float64)  # float64 to match features
b = tt.zeros(1, dtype=tt.float64)
w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
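As a side note, the same result can be obtained in one step by passing requires_grad=True when the tensors are created; a sketch, not used below:
# w = tt.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=tt.float64, requires_grad=True)
# b = tt.zeros(1, dtype=tt.float64, requires_grad=True)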
Reading the data in batches of batch_size
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # shuffle so the samples are visited in random order
    for i in range(0, num_examples, batch_size):
        j = tt.LongTensor(indices[i: min(i + batch_size, num_examples)])  # indices of the current batch
        yield features.index_select(0, j), labels.index_select(0, j)
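To see what the iterator yields, we can pull a single batch and inspect its shape (a quick check; the batch size of 10 here is arbitrary):
X, y = next(data_iter(10, features, labels))  # one random batch of 10 samples
print(X.shape, y.shape)  # torch.Size([10, 1]) torch.Size([10])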
Defining the univariate linear model with torch functions
def linreg(X, w, b):
    return tt.mm(X, w) + b  # matrix-multiply X (n x 1) by w (1 x 1), then add the bias
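A quick forward pass on a few samples (using the w and b initialized above) shows that the model returns one prediction per row:
print(linreg(features[:5], w, b).shape)  # torch.Size([5, 1])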
Defining the loss function
Unlike the previous section, this loss function does not divide by n.
def squared_loss(y_pre, y):
    # note: this returns a vector of per-sample losses; also, PyTorch's MSELoss does not divide by 2
    return (y_pre - y.view(y_pre.size())) ** 2 / 2
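A tiny example with made-up values makes the division by 2 concrete:
y_pre = tt.tensor([2.0, 3.0])
y_true = tt.tensor([1.0, 5.0])
print(squared_loss(y_pre, y_true))  # tensor([0.5000, 2.0000]): (error ** 2) / 2, element-wise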
The optimization function
Adjust the weights by gradient descent.
def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size  # note: param.data is modified so the update itself is not tracked by autograd
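The same update is also available through torch's built-in optimizer; a sketch of the equivalent setup, not used below where we keep the hand-written sgd. With tt.optim.SGD one would typically take loss(...).mean() instead of .sum(), so dividing by batch_size is no longer needed:
# optimizer = tt.optim.SGD([w, b], lr=0.03)
# in the loop: l = loss(net(X, w, b), y).mean(); l.backward(); optimizer.step(); optimizer.zero_grad()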
Training
lr = 0.03
batch_size = 1000
num_epochs = 3
net = linreg
loss = squared_loss
for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y).sum()  # l is the loss on the mini-batch X, y
        l.backward()  # gradients of the mini-batch loss with respect to the model parameters
        sgd([w, b], lr, batch_size)  # update the parameters with mini-batch stochastic gradient descent
        # don't forget to zero the gradients
        w.grad.data.zero_()
        b.grad.data.zero_()
        train_l = loss(net(features, w, b), labels)
        print('epoch {}, loss {},{}, {}'.format(epoch + 1, train_l.mean().item(), w.item(), b.item()))
epoch 1, loss 0.4913282449877688,4.965234125341314, 5.979345414047706
epoch 1, loss 0.49133355745945917,4.964254734703085, 5.979943022778341
epoch 2, loss 0.491332443959815,4.964159786877889, 5.97982659736746
epoch 2, loss 0.49132803866147595,4.964265122652913, 5.979360553763988
epoch 3, loss 0.491326351358505,4.9636673602931145, 5.979148546470734
epoch 3, loss 0.49132317816047677,4.964289414714468, 5.978815212481991
After three epochs the learned parameters are essentially consistent with the true values; the remaining loss of about 0.49 is just the injected noise (variance 1, halved by the loss function).
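As a final check, the fitted line can be drawn over the samples, reusing the plotting code from above (a sketch; the learned line should almost cover the true one):
plt.scatter(features[:, 0].numpy(), labels.numpy(), 1, c='#ff0000')  # noisy samples
plt.plot([-3, 3], [true_w * -3 + true_b, true_w * 3 + true_b])  # true line y = 5x + 6
plt.plot([-3, 3], [w.item() * -3 + b.item(), w.item() * 3 + b.item()])  # learned line
plt.show()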