Reference code: https://github.com/AlexeyAB/darknet
MSE Loss
/*
Decoding of the predicted box from the raw network output x:
b.x = (i + x[index + 0 * stride]) / lw;
b.y = (j + x[index + 1 * stride]) / lh;
b.w = exp(x[index + 2 * stride]) * biases[2 * n] / w;
b.h = exp(x[index + 3 * stride]) * biases[2 * n + 1] / h;
*/
// lw, lh: width and height of the feature map at this output layer; w, h: network input width and height; biases: anchor sizes
// Encode the ground truth into the same parameterization (inverse of the decoding above)
float tx = (truth.x*lw - i);
float ty = (truth.y*lh - j);
float tw = log(truth.w*w / biases[2 * n]);
float th = log(truth.h*h / biases[2 * n + 1]);
// Compute the gradient: delta accumulates (target - prediction)
delta[index + 0 * stride] += scale * (tx - x[index + 0 * stride]) * iou_normalizer;
delta[index + 1 * stride] += scale * (ty - x[index + 1 * stride]) * iou_normalizer;
delta[index + 2 * stride] += scale * (tw - x[index + 2 * stride]) * iou_normalizer;
delta[index + 3 * stride] += scale * (th - x[index + 3 * stride]) * iou_normalizer;
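The targets above simply invert the decoding shown in the comment, and each delta entry accumulates (target - prediction), which, up to the weighting factors scale and iou_normalizer, is the negative gradient of a per-coordinate squared error:

$$t_x = \text{truth.x}\cdot lw - i,\quad t_y = \text{truth.y}\cdot lh - j,\quad t_w = \log\frac{\text{truth.w}\cdot w}{\text{biases}[2n]},\quad t_h = \log\frac{\text{truth.h}\cdot h}{\text{biases}[2n+1]}$$

$$L_{\text{MSE}} = \sum_{k\in\{x,y,w,h\}} \tfrac{1}{2}\,(t_k - \hat{x}_k)^2,\qquad -\frac{\partial L_{\text{MSE}}}{\partial \hat{x}_k} = t_k - \hat{x}_k$$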
Other choices include the L1 loss and the smooth L1 loss (a sketch follows below).
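As a minimal sketch of the smooth L1 alternative (not taken from darknet; the threshold 1.0 and the function name smooth_l1_delta are illustrative), the function returns the value to accumulate into delta, following the same (target - prediction) convention as above:

#include <math.h>

/* Smooth L1 (Huber-style) delta for one coordinate, diff = target - prediction.
   The loss is 0.5*diff^2 for |diff| < 1 and |diff| - 0.5 otherwise; the returned
   value is the negative gradient of that loss w.r.t. the prediction. */
static float smooth_l1_delta(float diff)
{
    if (fabsf(diff) < 1.0f) return diff;   /* quadratic region */
    return (diff > 0) ? 1.0f : -1.0f;      /* linear region */
}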
IoU Loss Family
How accurate a predicted box is can be judged by three measurements against the ground-truth box (their common loss form is given after this list):
- Overlap area: IoU, GIoU
- Center-point distance: DIoU
- Aspect ratio: CIoU
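All of these losses share the same structure: 1 minus IoU plus a penalty term R over the predicted box B and the ground-truth box B^gt (R = 0 for the plain IoU loss):

$$\text{IoU} = \frac{|B \cap B^{gt}|}{|B \cup B^{gt}|},\qquad L = 1 - \text{IoU} + R(B, B^{gt})$$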
IoU
// Convert from (x, y, w, h) to (top, left, bottom, right)
boxabs pred_tblr = to_tblr(pred);
boxabs truth_tblr = to_tblr(truth);
// pred_t, pred_b, pred_l, pred_r below are shorthand for pred_tblr.top, .bot, .left, .right
// Areas of the predicted box (X) and the ground-truth box (Xhat)
float X = (pred_b - pred_t) * (pred_r - pred_l);
float Xhat = (truth_tblr.bot - truth_tblr.top) * (truth_tblr.right - truth_tblr.left);
// Intersection area
float Ih = fmin(pred_b, truth_tblr.bot) - fmax(pred_t, truth_tblr.top);
float Iw = fmin(pred_r, truth_tblr.right) - fmax(pred_l, truth_tblr.left);
float I = Iw * Ih;
// Union area; IoU = I / U
float U = X + Xhat - I;
// Derivatives of X (predicted-box area) with respect to t, b, l, r
float dX_wrt_t = -1 * (pred_r - pred_l);
float dX_wrt_b = pred_r - pred_l;
float dX_wrt_l = -1 * (pred_b - pred_t);
float dX_wrt_r = pred_b - pred_t;
// Derivatives of I (intersection area)
float dI_wrt_t = pred_t > truth_tblr.top ? (-1 * Iw) : 0;
float dI_wrt_b = pred_b < truth_tblr.bot ? Iw : 0;
float dI_wrt_l = pred_l > truth_tblr.left ? (-1 * Ih) : 0;
float dI_wrt_r = pred_r < truth_tblr.right ? Ih : 0;
// Derivatives of U (union area); dU = dX - dI because U = X + Xhat - I and Xhat does not depend on the prediction
float dU_wrt_t = dX_wrt_t - dI_wrt_t;
float dU_wrt_b = dX_wrt_b - dI_wrt_b;
float dU_wrt_l = dX_wrt_l - dI_wrt_l;
float dU_wrt_r = dX_wrt_r - dI_wrt_r;
// Gradient of IoU = I / U via the quotient rule: (U * dI - I * dU) / U^2
if (U > 0) {
p_dt = ((U * dI_wrt_t) - (I * dU_wrt_t)) / (U * U);
p_db = ((U * dI_wrt_b) - (I * dU_wrt_b)) / (U * U);
p_dl = ((U * dI_wrt_l) - (I * dU_wrt_l)) / (U * U);
p_dr = ((U * dI_wrt_r) - (I * dU_wrt_r)) / (U * U);
}
// Convert the (t, b, l, r) gradients to (x, y, w, h) gradients
p_dx = p_dl + p_dr; // p_dx, p_dy, p_dw and p_dh are the gradients of IoU w.r.t. x, y, w, h
p_dy = p_dt + p_db;
p_dw = (p_dr - p_dl); // for dw and dh we do not divide by 2 (see the note below)
p_dh = (p_db - p_dt);
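The conversion follows from l = x - w/2, r = x + w/2, t = y - h/2, b = y + h/2:

$$\frac{\partial\,\text{IoU}}{\partial x} = \frac{\partial\,\text{IoU}}{\partial l} + \frac{\partial\,\text{IoU}}{\partial r},\qquad \frac{\partial\,\text{IoU}}{\partial w} = \frac{1}{2}\left(\frac{\partial\,\text{IoU}}{\partial r} - \frac{\partial\,\text{IoU}}{\partial l}\right)$$

and analogously for y and h; dropping the factor 1/2 for dw and dh only rescales the gradient.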
GIoU
Drawback of the IoU loss: when the two boxes do not overlap, IoU stays at 0 and gives no gradient.
GIoU adds the fraction of the smallest enclosing box that is not covered by the union of the two boxes; its value ranges from -1 to 1.
Reference paper: "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"
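With U the union area and C the area of the smallest enclosing box, the paper defines:

$$\text{GIoU} = \text{IoU} - \frac{C - U}{C},\qquad L_{\text{GIoU}} = 1 - \text{GIoU}$$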
// Area of the smallest enclosing box C
float giou_Cw = fmax(pred_r, truth_tblr.right) - fmin(pred_l, truth_tblr.left);
float giou_Ch = fmax(pred_b, truth_tblr.bot) - fmin(pred_t, truth_tblr.top);
float giou_C = giou_Cw * giou_Ch;
// Derivatives of C
float dC_wrt_t = pred_t < truth_tblr.top ? (-1 * giou_Cw) : 0;
float dC_wrt_b = pred_b > truth_tblr.bot ? giou_Cw : 0;
float dC_wrt_l = pred_l < truth_tblr.left ? (-1 * giou_Ch) : 0;
float dC_wrt_r = pred_r > truth_tblr.right ? giou_Ch : 0;
// Gradient of the GIoU penalty term; the penalty is (C - U) / C, so GIoU = IoU + U/C - 1 and we add d(U/C)
p_dt += ((giou_C * dU_wrt_t) - (U * dC_wrt_t)) / (giou_C * giou_C);
p_db += ((giou_C * dU_wrt_b) - (U * dC_wrt_b)) / (giou_C * giou_C);
p_dl += ((giou_C * dU_wrt_l) - (U * dC_wrt_l)) / (giou_C * giou_C);
p_dr += ((giou_C * dU_wrt_r) - (U * dC_wrt_r)) / (giou_C * giou_C);
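The term added here is the derivative of U/C (equivalently, minus the derivative of the penalty (C - U)/C), obtained with the quotient rule:

$$\frac{\partial}{\partial *}\!\left(\frac{U}{C}\right) = \frac{C\,\partial U/\partial *\; -\; U\,\partial C/\partial *}{C^2}$$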
DIoU and CIoU
Drawback of the GIoU loss: when one box completely encloses the other, GIoU degenerates to IoU.
Reference paper: "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression"
- DIoU: adds a distance measure between the centers of the two boxes.
- CIoU: on top of DIoU, adds a measure of the consistency of the two boxes' aspect ratios (formulas below).
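From the paper, the two losses are:

$$L_{\text{DIoU}} = 1 - \text{IoU} + \frac{\rho^2(b, b^{gt})}{c^2}$$

$$L_{\text{CIoU}} = 1 - \text{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v,\qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2,\qquad \alpha = \frac{v}{(1 - \text{IoU}) + v}$$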
Here b and b^gt denote the centers of the predicted and ground-truth boxes, ρ(·) is the Euclidean distance, and c is the diagonal length of the smallest box enclosing both.
α is a positive trade-off factor and v measures the consistency of the aspect ratios.
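For completeness, a minimal self-contained sketch (not darknet's implementation; the struct and function names are illustrative, and boxes are assumed to have positive width and height) that computes the CIoU value for two boxes given as center and size. Dropping the alpha * v term gives DIoU, and the loss is 1 minus the returned value:

#include <math.h>

typedef struct { float x, y, w, h; } boxxywh;   /* center (x, y) and size (w, h) */

/* 1-D overlap of two segments given by center and length (may be negative). */
static float overlap_1d(float c1, float l1, float c2, float l2)
{
    float left  = fmaxf(c1 - l1 / 2, c2 - l2 / 2);
    float right = fminf(c1 + l1 / 2, c2 + l2 / 2);
    return right - left;
}

/* CIoU value of prediction a against ground truth b (higher is better). */
static float box_ciou(boxxywh a, boxxywh b)
{
    const float pi = 3.14159265358979f;

    /* IoU term */
    float iw = overlap_1d(a.x, a.w, b.x, b.w);
    float ih = overlap_1d(a.y, a.h, b.y, b.h);
    float inter = (iw > 0 && ih > 0) ? iw * ih : 0;
    float uni = a.w * a.h + b.w * b.h - inter;
    float iou = (uni > 0) ? inter / uni : 0;

    /* Center-distance term rho^2 / c^2, with c the enclosing-box diagonal */
    float cw = fmaxf(a.x + a.w / 2, b.x + b.w / 2) - fminf(a.x - a.w / 2, b.x - b.w / 2);
    float ch = fmaxf(a.y + a.h / 2, b.y + b.h / 2) - fminf(a.y - a.h / 2, b.y - b.h / 2);
    float c2 = cw * cw + ch * ch;
    float rho2 = (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
    float dist_term = (c2 > 0) ? rho2 / c2 : 0;

    /* Aspect-ratio term alpha * v */
    float d = atanf(b.w / b.h) - atanf(a.w / a.h);
    float v = (4.0f / (pi * pi)) * d * d;
    float alpha = v / (1.0f - iou + v + 1e-9f);

    return iou - dist_term - alpha * v;
}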