PyTorch | Filling in the PyTorch pitfalls 🤣 (updated as I go)

1. RuntimeError: reduce failed to synchronize: device-side assert triggered

This one cost me two days of debugging; at one point I thought my PyTorch version was simply too new 😂

Analysis:

Where the error occurs:

criterion = nn.BCELoss()
errD_real = criterion(output, label)

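The error is raised when BCELoss is applied to the tensors.
Possible causes:

The values of the tensors passed to BCELoss must lie in [0.0, 1.0]; BCELoss fails as soon as any value falls outside that range.
The tensors passed to BCELoss are themselves broken: output and label may not match in format (shape or dtype), or output may contain NaN values (print output and label to check).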
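The two causes above can be checked before the loss is computed. The following is only a minimal sketch: `describe_bce_inputs` is a hypothetical helper name, not part of PyTorch, and it assumes floating-point tensors that are non-empty.

```python
import torch

def describe_bce_inputs(output, label):
    """Return a list of human-readable problems found in BCELoss inputs.

    Hypothetical helper for debugging; not a PyTorch API.
    """
    problems = []
    if output.shape != label.shape:
        problems.append(
            f"shape mismatch: {tuple(output.shape)} vs {tuple(label.shape)}"
        )
    for name, t in (("output", output), ("label", label)):
        if torch.isnan(t).any():
            problems.append(f"{name} contains NaN")
        elif t.min() < 0 or t.max() > 1:
            problems.append(f"{name} outside [0, 1]")
    return problems

# Example: the second output value is out of range.
print(describe_bce_inputs(torch.tensor([0.2, 1.5]), torch.tensor([0.0, 1.0])))
```

Running the check on CPU copies of the tensors (for example `output.detach().cpu()`) is usually easier to read than a device-side assert.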

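Solutions:

One blunt approach suggested online is to add an assertion just before the failing line. The comparison must be applied element-wise before all(); calling all() first reduces the tensor to a single boolean, and the check then always passes:

assert ((label >= 0.) & (label <= 1.)).all()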

or to force the input tensors into [0.0, 1.0]:

output[output < 0.0] = 0.0
output[output > 1.0] = 1.0
label[label < 0.0] = 0.0
label[label > 1.0] = 1.0
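The same clamping can also be written out of place with `torch.clamp`, which leaves the original tensors untouched. This is a sketch with made-up values; an out-of-place version may be the safer choice when `output` is part of the autograd graph, since in-place edits can raise errors during the backward pass.

```python
import torch

# Made-up example values; output has entries outside [0, 1].
output = torch.tensor([-0.2, 0.4, 1.3])
label = torch.tensor([0.0, 1.0, 1.0])

# torch.clamp returns a new tensor; the original is left unchanged.
output_clamped = torch.clamp(output, min=0.0, max=1.0)
label_clamped = torch.clamp(label, min=0.0, max=1.0)

loss = torch.nn.BCELoss()(output_clamped, label_clamped)
print(output_clamped, loss)
```

Clamping only hides out-of-range values; it does not explain where they came from.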

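Neither approach helped in my case 🤕, so I went back and traced the problem from the source (the lesson: when something breaks, work through the inputs and outputs step by step instead of going straight to Google). My label is defined by hand as 1 and 0, so it cannot be out of range. The output comes from the discriminator, whose last layer is a sigmoid, so it should also lie in [0.0, 1.0]. I printed output first and found NaN values in it, which meant the tensor fed into the discriminator was the problem. That input is an image dataset; I printed the images, their shapes, and their dtypes, and everything looked fine 🤔. Then I checked the CPU/GPU placement and found that label was on the GPU, while the image tensor fed into the discriminator to produce output had never been moved to the GPU. Moving the discriminator's input to the GPU fixed the problem 🌹.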
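A minimal sketch of that fix, keeping the model, its input, and the labels on one device. The tiny `netD` below is a hypothetical stand-in for the real discriminator (it only shares the final sigmoid), and the image sizes are made up.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the discriminator: it ends in a sigmoid.
netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1), nn.Sigmoid()).to(device)
criterion = nn.BCELoss()

real_images = torch.rand(4, 3, 8, 8)      # a batch as it comes out of a DataLoader (on the CPU)
label = torch.ones(4, device=device)      # labels created directly on the target device

real_images = real_images.to(device)      # move the input to the same device as the model
output = netD(real_images).view(-1)       # shape (4,), values in (0, 1)

# Both tensors must live on the same device before the loss is computed.
assert output.device == label.device
errD_real = criterion(output, label)
```

Creating `device` once and passing it everywhere avoids having to remember which tensors were moved by hand.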

2. RuntimeError: The size of tensor a (53) must match the size of tensor b (64) at non-singleton dimension 0

Analysis:

The error appears at the end of the first epoch, as the log shows:

[0/500][14/16] Loss_D: 0.0661 Loss_G: 51.4269 / 1.8594 l_D(x): 0.9845 l_D(G(z)): 0.0000
Traceback (most recent call last):
  File "train.py", line 207, in <module>
    cropped = mask1 * real_1
RuntimeError: The size of tensor a (41) must match the size of tensor b (64) at non-singleton dimension 0

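The cause is that the dataset size is not divisible by the batch size I set, so the last batch of each epoch is smaller than the others.

Solution:

torch.utils.data.DataLoader already handles this through its drop_last (bool, optional) argument: set it to True to drop the last incomplete batch when the dataset size is not divisible by the batch size. If it is False, the last batch is simply smaller. (Default: False.)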

dataloader = torch.utils.data.DataLoader(dataset, batch_size=64,
                                         shuffle=True, num_workers=2, drop_last=True)
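The effect of drop_last is easy to see on a toy dataset. The sketch below uses a made-up dataset of 117 samples (one full batch of 64 plus a remainder of 53) and lists the batch sizes the loader produces with and without the flag.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 117 samples = one full batch of 64 plus a remainder of 53 (made-up sizes).
dataset = TensorDataset(torch.zeros(117, 1))

def batch_sizes(drop_last):
    """Collect the size of every batch produced in one pass over the dataset."""
    loader = DataLoader(dataset, batch_size=64, shuffle=False, drop_last=drop_last)
    return [batch[0].size(0) for batch in loader]

print(batch_sizes(drop_last=False))  # [64, 53]
print(batch_sizes(drop_last=True))   # [64]
```

With drop_last=True the leftover samples are skipped in that epoch; with shuffle=True a different subset is left out each time.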

3. RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 1 and 3 in dimension 1 at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/TH/generic/THTensor.cpp:711

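The cause is fairly obvious: the data dimensions do not match.

Analysis:

After hitting this error, I first read the data with OpenCV and printed the shapes; every image came out as (256, 256, 3). This is a small trap: cv2.imread() converts every image to 3 channels by default, so it hides the difference. The code actually reads the data with imageio's imread, and printing the shapes from that reader did show a mismatch.
Note: cv2.imread() returns images in BGR channel order, with values in the range 0~255.

Solution:

Find the images whose dimensions differ. Here most images are (256, 256) and a few are (256, 256, 3), so the color images need to be converted to grayscale.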

from skimage import color
# rgb2gray returns floats in [0, 1], so scale back to the 0-255 range
img = color.rgb2gray(img)
img = img * 255
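Since only some images are color, the conversion can be applied conditionally. The sketch below assumes the images are NumPy arrays of shape (H, W) or (H, W, 3); `to_gray_255` is a hypothetical helper name and the test images are synthetic.

```python
import numpy as np
from skimage import color

def to_gray_255(img):
    """Return a 2-D float image in the 0-255 range.

    Hypothetical helper: converts (H, W, 3) color images and passes
    (H, W) grayscale images through unchanged apart from the dtype.
    """
    if img.ndim == 3 and img.shape[2] == 3:
        # rgb2gray returns floats in [0, 1]; scale back to 0-255.
        return color.rgb2gray(img) * 255
    return img.astype(np.float64)

# Synthetic images standing in for the mixed dataset.
rgb = np.full((256, 256, 3), 128, dtype=np.uint8)
gray = np.full((256, 256), 128, dtype=np.uint8)

shapes = [to_gray_255(im).shape for im in (rgb, gray)]
print(shapes)  # [(256, 256), (256, 256)]
```

After this step every image has the same number of dimensions, so the batches can be stacked.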
