pytorch学习记录

FPGA硅农发布时间：2020-09-29 15:13:51 ，浏览量：2

一

model.train()和model.eval()的差别： model.train()会启动BatchNormalization和DropOut操作，而后者则不会启用。训练完train样本后，生成的模型model要用来测试样本。在model(test)之前，需要加上model.eval()，否则的话，有输入数据，即使不训练，它也会改变权值。这是model中含有batch normalization层所带来的的性质。使用PyTorch进行训练和测试时一定注意要把实例化的model指定train/eval，eval（）时，框架会自动把BN和DropOut固定住，不会取平均，而是用训练好的值，不然的话，一旦test的batch_size过小，很容易就会被BN层导致生成图片颜色失真极大！！！！！！

二

pytorch自定义函数

# -*- coding:utf8 -*-

import torch
from torch.autograd import Variable

class MyReLU(torch.autograd.Function):

    def forward(self, input_):
        # 在forward中，需要定义MyReLU这个运算的forward计算过程
        # 同时可以保存任何在后向传播中需要使用的变量值
        self.save_for_backward(input_)         # 将输入保存起来，在backward时使用
        output = input_.clamp(min=0)           # relu就是截断负数，让所有负数等于0
        return output

    def backward(self, grad_output):
        # 根据BP算法的推导（链式法则），dloss / dx = (dloss / doutput) * (doutput / dx)
        # dloss / doutput就是输入的参数grad_output、
        # 因此只需求relu的导数，在乘以grad_outpu
        input_, = self.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input_  OrderedDict([('weight', Parameter containing:
              tensor([[0.4142, 0.0424],
                      [0.3940, 0.0796]], requires_grad=True)),
             ('bias', Parameter containing:
              tensor([-0.2885,  0.5825], requires_grad=True))])
			  
# 读取参数的方式二（推荐这种）
for n, p in fc.named_parameters():
	print(n,p)
>>>weight Parameter containing:
tensor([[0.4142, 0.0424],
        [0.3940, 0.0796]], requires_grad=True)
bias Parameter containing:
tensor([-0.2885,  0.5825], requires_grad=True)

# 读取参数的方式三
for p in fc.parameters():
	print(p)
>>>Parameter containing:
tensor([[0.4142, 0.0424],
        [0.3940, 0.0796]], requires_grad=True)
Parameter containing:
tensor([-0.2885,  0.5825], requires_grad=True)

六

optimizer.param_groups：是长度为2的list，其中的元素是2个字典； optimizer.param_groups[0]：长度为6的字典，包括[‘amsgrad’, ‘params’, ‘lr’, ‘betas’, ‘weight_decay’, ‘eps’]这6个参数 optimizer.param_groups[1]：好像是表示优化器的状态的一个字典

七

for param_tensor in model.state_dict():
        #打印 key value字典
        print(param_tensor,'\t',model.state_dict()[param_tensor].size())

通过上述state_dict()方法，不仅可以得到模型中可训练的参数，还能得到不可训练的参数，如BN层中的running_mean和running_var。

关注

打赏

1688896170

查看更多评论

pytorch学习记录

[ 申请 ]友情链接：