A Beginner's Self-Redemption
Loss Functions
The nn package provides several different loss functions. The simplest is nn.MSELoss, which computes the mean squared error between the network's output and the target.
output = net(input)  # the result produced by the network
target = torch.randn(10)  # a dummy target (the label), for example
target = target.view(1, -1) # make it the same shape as output
criterion = nn.MSELoss()
loss = criterion(output, target)  # note the argument order: output first, then target
print(loss)
Following loss backwards through the graph via the .grad_fn attribute, you can trace the computation graph shown below:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
-> view -> linear -> relu -> linear -> relu -> linear
-> MSELoss
-> loss
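For example, you can walk a few steps of this graph by hand; each .grad_fn node exposes its inputs through next_functions, and printing a node shows which backward Function it is:
print(loss.grad_fn)  # MSELoss node
print(loss.grad_fn.next_functions[0][0])  # the Linear layer that fed it
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # the ReLU before that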
Updating the Network Weights
The simplest update rule used in practice is Stochastic Gradient Descent (SGD):
weight = weight - learning_rate * gradient
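This rule can be implemented by hand in plain Python. A minimal sketch, assuming loss.backward() has already filled each parameter's .grad buffer:
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)  # in-place update: weight -= lr * gradient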
To support different update rules, PyTorch provides the torch.optim package, which implements SGD, Nesterov-SGD, Adam, RMSprop, and other update strategies.
Usage is fairly simple:
import torch.optim as optim
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)
# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()  # update the parameters
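In practice these five lines sit inside a loop over the training data. A minimal sketch, assuming a hypothetical DataLoader named trainloader that yields (input, target) batches:
for epoch in range(2):  # loop over the dataset twice
    for input, target in trainloader:  # trainloader is an assumed DataLoader
        optimizer.zero_grad()  # gradients accumulate by default, so clear them first
        output = net(input)
        loss = criterion(output, target)
        loss.backward()  # compute gradients
        optimizer.step()  # apply the SGD update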
Training the Network on a GPU
This takes two steps: move the network to the GPU, then move the data to the GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # use the GPU if one is available
print(device)
net.to(device)  # move the network's parameters and buffers
inputs, labels = data[0].to(device), data[1].to(device)  # move a batch of data
OUT:
cuda:0
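The data-migration line above assumes data comes from iterating over a DataLoader; inside such a loop, every batch must be moved to the same device as the network before the forward pass. A sketch, again with a hypothetical trainloader:
for data in trainloader:  # trainloader is an assumed DataLoader
    inputs, labels = data[0].to(device), data[1].to(device)  # move the batch
    outputs = net(inputs)  # forward pass now runs on cuda:0 when a GPU is present
    loss = criterion(outputs, labels)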