A Beginner's Self-Redemption
1. Introduction to Tensors
1.1 Tensors are similar to NumPy’s ndarrays, with the addition being that Tensors can also be used on a GPU to accelerate computing.
1.2 torch.Tensor is the central class of the package.
A Tensor is to PyTorch what an ndarray is to NumPy.
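A few common ways to construct a torch.Tensor (a minimal sketch; the shapes and values are only illustrative):

import torch

x = torch.empty(5, 3)                    # uninitialized values
x = torch.rand(5, 3)                     # uniform random values in [0, 1)
x = torch.zeros(5, 3, dtype=torch.long)  # all zeros, with an explicit dtype
x = torch.tensor([5.5, 3.0])             # construct directly from data
print(x.size())                          # torch.Size([2])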
Arithmetic between Tensors: addition, subtraction, multiplication, and division.
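A minimal sketch of elementwise arithmetic between two Tensors (the values are only illustrative):

import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])
print(a + b)  # elementwise addition, same as torch.add(a, b)
print(a - b)  # elementwise subtraction
print(a * b)  # elementwise multiplication
print(a / b)  # elementwise division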
Converting between Tensors and NumPy arrays:
b = torch.from_numpy(a)  # NumPy ndarray -> Tensor
a = b.numpy()            # Tensor -> NumPy ndarray
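A runnable sketch of the round trip. Note that torch.from_numpy() and Tensor.numpy() share the underlying memory on the CPU, so changing one also changes the other:

import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)  # ndarray -> Tensor (shares memory with a)
np.add(a, 1, out=a)      # modify the ndarray in place...
print(b)                 # ...and the Tensor reflects the change: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
print(b.numpy())         # Tensor -> ndarray, also sharing memory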
CUDA Tensors
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of the GPU
if torch.cuda.is_available():
    x = torch.randn(4)                      # example CPU tensor (any tensor works here)
    device = torch.device("cuda")           # a CUDA device object
    y = torch.ones_like(x, device=device)   # directly create a tensor on the GPU
    x = x.to(device)                        # or just use strings: .to("cuda")
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))        # ``.to`` can also change the dtype at the same time
2. Automatic Differentiation (autograd)
Central to all neural networks in PyTorch is the autograd package.
If you set a tensor's attribute .requires_grad to True, autograd starts to track all operations on it.
How requires_grad propagates through operations:
x = torch.randn(5, 5)                      # requires_grad defaults to False
y = torch.randn(5, 5)
z = torch.randn(5, 5, requires_grad=True)
a = x + y
a.requires_grad                            # no input requires grad, so the result does not either
b = a + z
b.requires_grad                            # z requires grad, so the result does too
Out:
False
True
This flag is especially useful when you want to freeze part of a model, or when you know in advance that you will not need the gradients of certain parameters. For example, if you want to fine-tune a pre-trained CNN, it is enough to switch the requires_grad flags off in the frozen base: no intermediate buffers will be saved until the computation reaches the last layer, where the affine transformation uses weights that do require gradients, so the output of the network will also require them.
Code that trains only part of the layers/parameters of a pre-trained model:
import torch.nn as nn
import torch.optim as optim
import torchvision

model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False  # freeze every pre-trained parameter
# Replace the last fully-connected layer
# Parameters of newly constructed modules have requires_grad=True by default
model.fc = nn.Linear(512, 100)
# Optimize only the parameters of the new classifier layer
optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)
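A quick sanity check (a sketch using the model built above): after freezing, only the parameters of the new fc layer still require gradients, so only they will be updated.

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # expected: ['fc.weight', 'fc.bias']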
An example of a vector-Jacobian product computed with autograd:
x = torch.randn(3, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:  # keep doubling y until its norm reaches 1000
    y = y * 2
print(y)
v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)                # vector-Jacobian product: x.grad = J^T v
print(x.grad)
Out:
tensor([-1350.9803, 805.9799, -188.3773], grad_fn=<MulBackward0>)
tensor([5.1200e+01, 5.1200e+02, 5.1200e-02])
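Why these numbers: in this run y ends up equal to 512 * x (the printed x.grad is exactly 512 times v), so the Jacobian dy/dx is 512 times the identity matrix and y.backward(v) stores the vector-Jacobian product J^T v = 512 * v in x.grad. A minimal check of that arithmetic:

import torch

v = torch.tensor([0.1, 1.0, 0.0001])
print(512 * v)  # tensor([5.1200e+01, 5.1200e+02, 5.1200e-02]) -- matches x.grad above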
Note: gradients accumulate during backpropagation. Each call to backward() adds the newly computed gradients onto whatever is already stored in .grad, so you need to zero the gradients before running backpropagation again.
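A minimal sketch of this accumulation behaviour and how to clear it (when training with an optimizer, optimizer.zero_grad() zeroes the gradients of all its parameters):

import torch

x = torch.ones(2, requires_grad=True)
(3 * x).sum().backward()
print(x.grad)    # tensor([3., 3.])
(3 * x).sum().backward()
print(x.grad)    # tensor([6., 6.]) -- the second backward pass added onto the first
x.grad.zero_()   # clear the accumulated gradients before the next backward pass
(3 * x).sum().backward()
print(x.grad)    # tensor([3., 3.])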