In the PyTorch framework, every module in a network is a subclass of torch.nn.Module, and a Module can contain other Modules, nested in a tree structure.
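For example, a minimal sketch (a hypothetical TinyNet, unrelated to the model discussed below) of how Modules nest as a tree:
import torch.nn as nn

# TinyNet contains a Sequential, which in turn contains Conv2d and ReLU:
# three levels of the module tree.
class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(16, 10)

    def forward(self, x):
        x = self.features(x)
        x = x.mean(dim=[2, 3])  # global average pooling
        return self.classifier(x)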
- Inspecting the individual modules of a neural network
model = ResNetAttention(depth=101, pretrained=1, num_classes=16,
                        dropout=0, grayscale=8)
Method 1: model._modules.items()
Returns an iterator of (name, module) pairs over the model's direct child modules (each child may itself contain sub-modules), for example:
for name, module in model._modules.items():
    print(name)
    print(module)
--------------------------------------------------------------
# name
base
# module
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
... ...)
classifier # name
Conv2d(2048, 17, kernel_size=(1, 1), stride=(1, 1)) # module
classifiers_activation
Sequential(
(0): Conv2d(2048, 17, kernel_size=(1, 1), stride=(1, 1))
... ...
(15): Conv2d(2048, 17, kernel_size=(1, 1), stride=(1, 1))
)
activation # name
AggregatedActivation()  # module
attentionmap
AttentionMap()
otsu_medthod
OtsuMethod()
obj_classifier
Linear(in_features=2048, out_features=16, bias=True)
Method 2: model.children()
Returns an iterator over all of the model's direct child modules, for example:
for module in model.children():
    print(module)
----------------------------------------------------------------
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
... ...
Conv2d(2048, 17, kernel_size=(1, 1), stride=(1, 1))
Sequential(
(0): Conv2d(2048, 17, kernel_size=(1, 1), stride=(1, 1))
... ...)
AggregatedActivation()
AttentionMap()
OtsuMethod()
Linear(in_features=2048, out_features=16, bias=True)
# NOTE: model.modules() is not recommended here, since it recursively returns all sub-modules, not just the direct children
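As a quick illustration of the difference (a toy two-level model, not the ResNetAttention above):
import torch.nn as nn

toy = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()),
    nn.Linear(8, 2),
)
print(len(list(toy.children())))  # 2: only the direct children
print(len(list(toy.modules())))   # 5: toy itself plus every nested sub-module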
- Inspecting the network parameters
Freezing part of a network's parameters is usually done for transfer learning, so we already have a trained model; we can load it first and inspect its parameters.
# load pretrained model on cpu
# checkpoint = load_checkpoint('./logs11/checkpoint.pth.tar')  # load on gpu
checkpoint = torch.load('./logs11/checkpoint.pth.tar', map_location='cpu')
model.load_state_dict(checkpoint['state_dict'])
start_epoch = checkpoint['epoch']
best_recall1 = checkpoint['best_recall1']
print("=> start epoch {} best top1 recall {:.1%}"
.format(start_epoch, best_recall1))
# check parameters
for name, module in model._modules.items():
    for p in module.parameters():
        # print(p)
        print(p.size())
---------------------------------------------------------------
=> start epoch 100 best top1 recall 98.2%
torch.Size([64, 3, 7, 7])
torch.Size([64])
torch.Size([64])
torch.Size([64, 64, 1, 1])
... ...
torch.Size([17, 2048, 1, 1])
torch.Size([17])
torch.Size([16, 2048])
torch.Size([16])
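If the bare sizes are hard to map back to layers, model.named_parameters() prints the dotted parameter names alongside the shapes (a small sketch, reusing the model loaded above):
# Print each parameter's full dotted name together with its shape.
for name, p in model.named_parameters():
    print(name, tuple(p.size()))
# e.g. base.conv1.weight (64, 3, 7, 7)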
- Freezing part of the network's parameters
For example, in transfer learning I want to freeze the parameters of the base model and only update the parameters of the other layers.
param_optim = []
# layers = []
for name, module in model._modules.items():
    if name != "base":
        # layers.append(name)
        for p in module.parameters():
            param_optim.append(p)
    else:
        for p in module.parameters():
            p.requires_grad = False
# print(param_optim)
# print(layers)
-------------------------------------------------------------
We can print the layers whose parameters need to be updated:
['classifier', 'classifiers_activation', 'activation', 'attentionmap', 'otsu_medthod', 'obj_classifier']
Some of these layers have no parameters, but that does not matter: param_optim already contains every parameter that needs to be updated. The remaining parameters get requires_grad = False, so their gradients will not be computed.
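A quick sanity check (a sketch, assuming the loop above has run) is to confirm that only the base parameters were frozen; the same flag can also be used to collect the trainable parameters directly:
# Verify which parameters will still receive gradients.
for name, p in model.named_parameters():
    print(name, p.requires_grad)  # False for base.*, True for the rest

# Equivalent way of building the list of trainable parameters.
param_optim = [p for p in model.parameters() if p.requires_grad]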
- Changing the parameters passed to the optimizer
# optimizer
param_groups = [{'params': param_optim, 'lr_mult': 0.1}]
if args.optimizer == 'sgd':
    optimizer = torch.optim.SGD(param_groups, lr=args.lr,
                                momentum=args.momentum,
                                weight_decay=args.weight_decay,
                                nesterov=True)
elif args.optimizer == 'adam':
    optimizer = torch.optim.Adam(param_groups, lr=args.lr,
                                 weight_decay=args.weight_decay)
else:
    raise ValueError("Cannot recognize optimizer type:", args.optimizer)
The parameter-freezing approach above only requires changes in train.py.
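Note that 'lr_mult' is not an option torch.optim itself acts on; it is just an extra key stored in the param group, presumably read later by a custom learning-rate schedule. A hypothetical sketch of how such a key might be consumed:
def adjust_lr(optimizer, epoch, base_lr, step_size=40):
    # Decay the base learning rate every step_size epochs and scale
    # each group by its own lr_mult (default 1.0 if the key is absent).
    lr = base_lr * (0.1 ** (epoch // step_size))
    for g in optimizer.param_groups:
        g['lr'] = lr * g.get('lr_mult', 1.0)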
There is also a small trick that can be applied inside the network definition itself:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        for p in self.parameters():
            p.requires_grad = False
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
The requires_grad = False loop can be inserted anywhere in __init__: every parameter defined before the inserted lines is frozen (requires_grad becomes False), while parameters defined after it remain trainable.
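A small sketch to confirm which parameters end up frozen by this trick:
net = Net()
for name, p in net.named_parameters():
    print(name, p.requires_grad)
# conv1.* and conv2.* print False (defined before the requires_grad loop)
# fc1.*, fc2.* and fc3.* print True (defined after it)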