Study notes on Keras, PyTorch, and TensorFlow


assert: followed by a condition that must hold; if it does not, an AssertionError is raised
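A minimal sketch of the assert pattern. The helper `check_batch` and its arguments are hypothetical names chosen for illustration, not something from the notes.

```python
def check_batch(batch_size, max_len):
    # Hypothetical helper: validate the arguments before they are used.
    assert batch_size > 0, "batch_size must be positive"
    assert max_len > 0, "max_len must be positive"
    return batch_size * max_len
```

The optional string after the comma becomes the error message, which makes a failed check easier to trace.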

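tensor.item(): returns the value of a one-element tensor as a plain Python number

tensor.index_select(dim, index): selects rows or columns of a tensor (entries along dimension dim) by index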
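A small sketch of both calls on a toy 3x4 tensor. The variable names are illustrative only.

```python
import torch

x = torch.arange(12).view(3, 4)   # 3 rows, 4 columns holding 0..11
idx = torch.tensor([0, 2])        # positions to keep

rows = x.index_select(0, idx)     # rows 0 and 2, same result as x[idx]
cols = x.index_select(1, idx)     # columns 0 and 2, same result as x[:, idx]

total = x.sum().item()            # 0-dim tensor converted to a Python int
```

Note that `item()` only works on tensors holding exactly one element.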

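model.state_dict(): returns the model's parameters and buffers (weights, biases, BatchNorm running statistics) as a dict of tensors. It takes no path and does not contain the inputs; save it with torch.save(model.state_dict(), path)

The counterpart -> model.load_state_dict(torch.load(path)) loads the saved weights back into the model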
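A sketch of the save/load round trip, assuming a toy `nn.Linear` model and a temporary file path (both are placeholders, not from the notes).

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                                # toy model for illustration
path = os.path.join(tempfile.mkdtemp(), "model.pt")    # placeholder path

# state_dict() maps parameter/buffer names to tensors.
torch.save(model.state_dict(), path)

# Restore into a freshly built model with the same architecture.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
```

Only the tensors are stored, so the model class has to be constructed again before loading.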

###model and training

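model.eval(): evaluation (test) mode; it mainly affects dropout and batch normalization

model.train(): training mode

model.eval() and model.train() switch the model between evaluation mode and training mode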

loss.backward()

The loss is a tensor produced by the model's forward pass, so it sits at the end of the autograd graph and backpropagation can start from it

optimizer.step()

Applies one gradient-descent update to the parameters

optimizer.step() and loss.backward() are both called once per step, that is, once per batch
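One training step followed by an evaluation pass might look like the sketch below. The toy model, the random data, and the `optimizer.zero_grad()` call are additions for illustration; the notes only mention `backward()` and `step()`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.5), nn.Linear(8, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.randn(16, 4)   # one toy batch of inputs
y = torch.randn(16, 1)   # matching toy targets

model.train()                     # dropout is active
optimizer.zero_grad()             # clear gradients left from the previous step
loss = loss_fn(model(x), y)       # forward pass builds the autograd graph
loss.backward()                   # fills p.grad for every parameter
optimizer.step()                  # updates the parameters from p.grad

model.eval()                      # dropout is disabled
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)               # matches out1 because dropout is off
```

In `train()` mode the two forward passes would generally differ, since dropout samples a new mask each time.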


###clip_grad_norm:

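Gradient clipping. From my reading of the source code: you set a max_norm (clip_grad). After loss.backward() the network's gradients have been computed; clip_grad_norm (torch.nn.utils.clip_grad_norm_ in current PyTorch) takes model.parameters(), max_norm and norm_type as input. The clipping strategy is: compute total_norm, the norm_type-norm of all gradients taken together; if max_norm / total_norm is less than 1, multiply every gradient by that ratio, otherwise leave the gradients unchanged.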
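A sketch of clipping in practice, assuming a toy linear model whose loss is scaled up on purpose so that the gradients exceed `max_norm`.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
loss = model(torch.ones(5, 3)).sum() * 100.0   # deliberately large gradients
loss.backward()

max_norm = 1.0
# Returns the total norm measured before clipping; gradients are scaled in place.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm, norm_type=2)

# Recompute the overall 2-norm afterwards to confirm it is now about max_norm.
clipped_norm = torch.norm(
    torch.stack([p.grad.norm(2) for p in model.parameters()]), 2
)
```

The call belongs between `loss.backward()` and `optimizer.step()`, since it rewrites the gradients the optimizer is about to use.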

##Distributed training

torch.nn.parallel.DistributedDataParallel(model.cuda())

##tensor:

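PyTorch:

tensor.contiguous().view(-1, self.output_dim) (this flattens the leading dimensions, i.e. a reshape)

The TensorFlow equivalent:

tf.reshape(tensor, [-1, self.output_dim]) (also a flatten)

In PyTorch the call is usually a method on the tensor itself, while in TensorFlow it is usually a function in the tf module.

***Note on usage:

PyTorch operations usually hang off the tensor: tensor style

TensorFlow operations usually hang off tf: graph style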
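Why the `contiguous()` call matters can be seen in a short sketch. The shapes and the name `output_dim` (standing in for `self.output_dim`) are assumptions for illustration.

```python
import torch

output_dim = 5                        # stands in for self.output_dim
x = torch.randn(2, output_dim, 3)     # (batch, output_dim, seq_len)

t = x.transpose(1, 2)                 # (batch, seq_len, output_dim), not contiguous
flat = t.contiguous().view(-1, output_dim)   # (batch * seq_len, output_dim)

same = t.reshape(-1, output_dim)      # reshape copies only when it has to
```

`view` needs a memory layout compatible with the new shape, which a transposed tensor usually lacks; `contiguous()` makes a compact copy first.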

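Also note that view takes the target shape, not a dimension and an index. Selecting by index is done with index_select: x.index_select(0, index) is equivalent to x[index], and x.index_select(2, index) is equivalent to x[:, :, index]

For shapes, TensorFlow takes a list, tf.reshape(x, [a, b]), where PyTorch takes positional arguments, x.view(a, b), which gives a rule of thumb:

the [] in tf plays much the same role as the () in torch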

Tensor initialization:

torch.LongTensor(batch_size,max_len).fill_(value)

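Fills the tensor with a given value: 0, 1, 2 all work

tensor.size() in PyTorch <==> tensor.get_shape() in TensorFlow

It can also take a dimension: tensor.size(0)

tensor.cuda() moves the tensor to the GPU.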
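The initialization and size calls can be tried together in a short sketch. The concrete sizes are arbitrary, and `torch.full` is shown as an alternative that the notes do not mention.

```python
import torch

batch_size, max_len, value = 4, 7, 1

a = torch.LongTensor(batch_size, max_len).fill_(value)            # form used in the notes
b = torch.full((batch_size, max_len), value, dtype=torch.long)    # alternative factory call

shape = a.size()        # torch.Size([4, 7])
rows = a.size(0)        # 4, a plain Python int

# Move to the GPU only when one is present, so the sketch also runs on CPU.
if torch.cuda.is_available():
    a = a.cuda()
```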

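PyTorch layers and Keras layers are designed along the same lines: both are used as layer(params)(inputs). A Sequential container simply wraps layers into one sequence object; calling seq(inputs) runs them in order, with each layer's parameters already bound inside. Simple enough.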

##layer

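PyTorch and Keras follow much the same layer conventions

PyTorch's API is nn and the Keras counterpart is layers; in both, the first () initializes the layer and the second () applies it, for training or prediction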
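On the PyTorch side, the two-step call pattern might look like this sketch (layer sizes are arbitrary).

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=4, out_features=3)   # first (): build the layer
x = torch.randn(2, 4)
y = layer(x)                                       # second (): apply it to data

# Sequential chains already-constructed layers; calling it runs them in order.
seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
z = seq(x)
```

Keras follows the same shape, for example `Dense(3)(x)`, with the difference that Keras usually infers the input size on the first call.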

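Where PyTorch's CNN layers fall short of Keras is that older PyTorch versions have no padding='same'. When using conv1d, if the output sequence length must equal the input's, you have to compute the padding value yourself.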
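For stride 1 the output length is L_in + 2 * padding - dilation * (kernel_size - 1), so the padding can be derived as in the sketch below. The helper `same_padding` is a hypothetical name, and it assumes stride 1 and an even value of dilation * (kernel_size - 1), for example an odd kernel.

```python
import torch
import torch.nn as nn


def same_padding(kernel_size, dilation=1):
    # Hypothetical helper: padding that keeps the length for stride 1
    # when dilation * (kernel_size - 1) is even.
    return dilation * (kernel_size - 1) // 2


kernel_size = 5
conv = nn.Conv1d(in_channels=8, out_channels=16, kernel_size=kernel_size,
                 padding=same_padding(kernel_size))

x = torch.randn(2, 8, 30)     # (batch, channels, seq_len)
y = conv(x)                   # seq_len is still 30
```

With an even kernel the required padding is asymmetric, which this symmetric helper cannot express; padding the input manually with `torch.nn.functional.pad` is one way around that.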

##torch and tensorflow devices:

Option 1: torch.cuda.set_device(gpu_use)

then call variable.cuda()

Option 2: tensor.to(device), passing a device object

device = torch.device("cuda")
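A device-agnostic version of the second option, sketched under the assumption that the code should also run on machines without a GPU.

```python
import torch

# Fall back to the CPU when no GPU is visible, so the same code runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.zeros(2, 3).to(device)            # tensors move with .to(device)
model = torch.nn.Linear(3, 1).to(device)    # modules move their parameters too
y = model(x)
```

Keeping one `device` object and passing it everywhere avoids scattering `.cuda()` calls through the code.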

For TensorFlow:

1. conf = tf.ConfigProto(device_count={'GPU': 0})

tf.enable_eager_execution(config=conf)

2. with tf.device('/cpu:0'): (tf also has a device context, but unlike torch it has no cuda object)

##keras: wrapping your own model

from keras.engine.topology import Layer

class MineModel(Layer):
    ...  # implement build() and call() here

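For multi-GPU training, wrap the model with multi_gpu_model (from keras.utils), using one replica per visible GPU:

new_model = multi_gpu_model(model, gpus=len(os.environ["CUDA_VISIBLE_DEVICES"].split(',')))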

##pandas:

A DataFrame column can be assigned directly from a numpy array or a list

For example: pd_array['id'] = id_list
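A runnable version of that assignment, with a toy frame and made-up values for illustration.

```python
import numpy as np
import pandas as pd

pd_array = pd.DataFrame({"score": [0.1, 0.5, 0.9]})   # toy frame for illustration

id_list = [101, 102, 103]
pd_array["id"] = id_list                  # assign a column from a list
pd_array["rank"] = np.array([3, 2, 1])    # or from a numpy array
```

The list or array has to have the same length as the frame, otherwise pandas raises a `ValueError`.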

##Related links:

###keras

https://www.jianshu.com/p/25e30055d7ac

https://keras-cn.readthedocs.io/en/latest/models/model/

https://www.jianshu.com/p/b9ad6b26e690
