1.
y = [1, 0.3, 0.7, 0.3, 0.3, 0, 1, 0]
y = [1, 0.7, 0.5, 0.3, 0.3, 0, 1, 0]
y = [1, 0.3, 0.7, 0.5, 0.5, 0, 1, 0]
y = [1, 0.3, 0.7, 0.5, 0.5, 1, 0, 0]
y = [0, 0.2, 0.4, 0.5, 0.5, 0, 1, 0]
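Since there is an object, pc = 1. The next four entries are the position parameters, determined from where the object's midpoint falls in the image, and the last three are the class labels, here c2 = 1.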
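To make the layout of y concrete, here is a minimal sketch that assembles the vector [pc, bx, by, bh, bw, c1, c2, c3]. The helper name `object_target` is my own, and the box numbers are placeholders for illustration only; they are not meant to say which option above is correct.

```python
def object_target(box, class_index, num_classes=3):
    """Build y = [pc, bx, by, bh, bw, c1, ..., cK] for an image that contains an object."""
    bx, by, bh, bw = box              # midpoint and size, as fractions of the image
    one_hot = [0] * num_classes       # class entries, all zero except the true class
    one_hot[class_index] = 1
    return [1, bx, by, bh, bw] + one_hot   # pc = 1 because an object is present


# Placeholder box values (hypothetical); class_index=1 corresponds to c2 = 1.
y = object_target(box=(0.3, 0.7, 0.3, 0.3), class_index=1)
print(y)
```

The point is only the ordering: one presence flag, four position entries, then one entry per class.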
2.
y=[1,?,?,?,?,0,0,0]
y=[0,?,?,?,?,0,0,0]
y=[?,?,?,?,?,?,?,?]
y=[0,?,?,?,?,?,?,?]
y=[1,?,?,?,?,?,?,?]
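pc indicates whether an object is present. None of the objects we care about appears here, so pc = 0, and once pc is 0 every remaining entry is a question mark (don't care).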
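One way to read the "?" entries is that they are simply skipped when predictions are compared with the target. The sketch below assumes a plain squared-error comparison and uses `None` as the don't-care marker; both choices and all names are my own assumptions for illustration, not something the quiz specifies.

```python
DONT_CARE = None  # assumed marker for the "?" entries


def background_target(num_classes=3):
    """Target for an image with no object of interest: pc = 0, everything else is '?'."""
    return [0] + [DONT_CARE] * (4 + num_classes)


def masked_squared_error(y_true, y_pred):
    """Sum of squared errors over the entries that are not don't-care."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred) if t is not DONT_CARE)


y = background_target()
pred = [0.1, 0.5, 0.5, 0.2, 0.2, 0.3, 0.3, 0.4]  # made-up network output
loss = masked_squared_error(y, pred)              # only the pc entry contributes
print(y, loss)
```

With pc = 0 in the target, only the first entry contributes, whatever the network outputs for the box and class entries.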
3. You are working on a factory automation task. Your system will see a can of soft-drink coming down a conveyor belt, and you want it to take a picture and decide whether (i) there is a soft-drink can in the image, and if so (ii) its bounding box. Since the soft-drink can is round, the bounding box is always square, and the soft drink can always appears as the same size in the image. There is at most one soft drink can in each image. Here are some typical images in your training set:
Logistic unit (for classifying if there is a soft-drink can in the image)
Logistic unit, bx and by
Logistic unit, bx, by, bh (since bw = bh)
Logistic unit, bx, by, bh, bw
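Got this wrong: it is not the third option. My reasoning was that since the can is round, only one of bh and bw is needed. Need to check.
It is not the fourth option either, so could it be the second? Will retry tomorrow. Yes, it is the second option, presumably because the size of the can is already known, so neither the height nor the width parameter is needed.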
4. If you build a neural network that inputs a picture of a person’s face and outputs N
landmarks on the face (assume the input image always contains exactly one face), how
Landmark (keypoint) detection: each landmark's position takes two parameters (x, y), so the answer is 2N.
5. When training one of the object detection systems described in lecture, you need a training set that contains many pictures of the object(s) you wish to detect. However, bounding boxes do not need to be provided in the training set, since the algorithm can learn to detect the objects by itself.
True
False
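False. Bounding boxes are only unnecessary at test time. Training needs the location information, otherwise the network has nothing to learn the boxes from.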
6. Suppose you are applying a sliding windows classifier (non-convolutional implementation). Increasing the stride would tend to increase accuracy, but decrease computational cost.
True
False
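For the stride question, a quick count of window positions shows the computational side of the trade-off. This is a rough sketch assuming a square image, a square window, and no padding; the sizes 64 and 16 are made up for illustration.

```python
def num_windows(image_size, window_size, stride):
    """Number of window positions on a square image, sliding without padding."""
    per_axis = (image_size - window_size) // stride + 1
    return per_axis * per_axis


# Hypothetical sizes: a 64x64 image scanned with a 16x16 window.
for stride in (1, 2, 4, 8):
    print(stride, num_windows(image_size=64, window_size=16, stride=stride))
```

Each window is one classifier evaluation, so a larger stride means fewer evaluations, but the windows are also placed more coarsely, so an object is less likely to be well centered in any of them.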
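The statement is indeed false; I misread the question. With a larger stride, accuracy goes down, and the computational cost goes down as well.
7. In the YOLO algorithm, at training time, only one cell (the one containing the center/midpoint of an object) is responsible for detecting this object.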
True
False
True. During training, each object is labeled in the cell that contains its center point.
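As a small sketch of that assignment rule: given an object's midpoint in normalized image coordinates, the responsible cell is the one the midpoint falls into. The function name, the (row, col) convention, and the 19x19 default are my own assumptions.

```python
def responsible_cell(x_mid, y_mid, grid_size=19):
    """Return (row, col) of the grid cell containing the midpoint.

    x_mid and y_mid are fractions of the image width and height in [0, 1].
    A midpoint exactly on the right or bottom edge is clamped to the last cell.
    """
    col = min(int(x_mid * grid_size), grid_size - 1)
    row = min(int(y_mid * grid_size), grid_size - 1)
    return row, col


print(responsible_cell(0.5, 0.25))  # midpoint at half the width, a quarter of the height
```

Even if the object's box spans several cells, only this one cell carries the label for it in the training target.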
8. What is the IoU between these two boxes? The upper-left box is 2x2, and the lower-right box is 2x3. The overlapping region is 1x1.
1/6
1/9
1/10
None of the above
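Got this wrong: it is indeed not 1/6. My understanding was off. IoU is not the overlap area divided by the predicted box's area; it is intersection over union. No wonder I thought none of the answers was right. Here the intersection is 1 and the union is 2x2 + 2x3 - 1 = 9, so the IoU is 1/9.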
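To check the arithmetic, here is a small IoU sketch. The corner coordinates are my own placement of the two boxes so that a 2x2 box and a 2x3 box overlap in a 1x1 region; the figure itself gives no coordinates. `Fraction` is used so the result stays exact, which works here because the coordinates are integers.

```python
from fractions import Fraction


def iou(box_a, box_b):
    """Intersection over union for boxes given as (x1, y1, x2, y2) with integer corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return Fraction(inter, union)


upper_left = (0, 0, 2, 2)    # 2x2 box (assumed placement)
lower_right = (1, 1, 3, 4)   # 2 wide, 3 tall, overlapping the first box in a 1x1 square
print(iou(upper_left, lower_right))  # 1/9
```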
9. Suppose you run non-max suppression on the predicted boxes above. The parameters
you use for non-max suppression are that boxes with probability <=0.4 are discarded,
and the IoU threshold for deciding if two boxes overlap is 0.5. How many boxes will
remain after non-max suppression?
3
4
5
6
7
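Non-max suppression: first discard every box with probability <= 0.4, which removes the car at the lower right. Then, for each class, take the highest-scoring box and discard any other box of that class whose IoU with it exceeds 0.5. The car with 0.62 is discarded because it overlaps too much; the tree with 0.46 is kept because its overlap is below 0.5. So 5 boxes remain.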
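The procedure can be sketched as follows. The detections in the example are made up by me (the boxes and scores from the quiz figure are not reproduced here), and the function names are hypothetical. I treat "overlap" as IoU strictly greater than the threshold, which is an assumption about the boundary case.

```python
def iou(a, b):
    """Intersection over union for boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def non_max_suppression(detections, score_threshold=0.4, iou_threshold=0.5):
    """detections: list of (label, score, box). Returns the detections that survive."""
    # Step 1: drop low-probability boxes (score <= threshold).
    candidates = [d for d in detections if d[1] > score_threshold]
    kept = []
    # Step 2: run suppression independently for each class.
    for label in sorted({d[0] for d in candidates}):
        remaining = sorted((d for d in candidates if d[0] == label),
                           key=lambda d: d[1], reverse=True)
        while remaining:
            best = remaining.pop(0)          # highest-scoring box left in this class
            kept.append(best)
            # Discard boxes that overlap the chosen box too much.
            remaining = [d for d in remaining if iou(best[2], d[2]) <= iou_threshold]
    return kept


# Made-up detections for illustration only.
detections = [
    ("car", 0.9, (0, 0, 10, 10)),
    ("car", 0.6, (1, 1, 11, 11)),     # heavy overlap with the 0.9 car
    ("car", 0.3, (50, 50, 60, 60)),   # below the score threshold
    ("tree", 0.7, (20, 0, 30, 10)),
    ("tree", 0.5, (27, 0, 37, 10)),   # small overlap with the 0.7 tree
]
kept = non_max_suppression(detections)
print([(label, score) for label, score, _ in kept])
```

In this toy input the 0.3 car falls to the score threshold, the 0.6 car is suppressed by the 0.9 car, and both trees survive because their IoU is well under 0.5.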
10. Suppose you are using YOLO on a 19x19 grid, on a detection problem with 20 classes,
and with 5 anchor boxes. During training, for each image you will need to construct an
output volume as the target value for the neural network; this corresponds to the last
layer of the neural network. ( may include some “?”, or “don’t cares”). What is the
dimension of this output volume?
19x19x(25x20)
19x19x(5x25)
19x19x(5x20)
19x19x(20x25)
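Got this wrong: it is not 19x19x(5x20). That answer felt odd even when I picked it, so I clearly was not thinking straight. Each anchor box needs pc + four position parameters + 20 classes = 25 values, so the answer is 19x19x(5x25).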
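The same count written out as a tiny sketch (the function name is my own):

```python
def yolo_output_shape(grid_size, num_anchors, num_classes):
    """Shape of the target volume: each anchor holds pc, 4 box values, and the class scores."""
    per_anchor = 1 + 4 + num_classes
    return (grid_size, grid_size, num_anchors * per_anchor)


print(yolo_output_shape(grid_size=19, num_anchors=5, num_classes=20))  # (19, 19, 125)
```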