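[Object Detection] Loss Functions: Concepts and Code Implementations

Published: January 22, 2024

This article introduces the loss functions commonly used in object detection and shows how each one is implemented. Object detection has two main tasks, classifying objects and localizing them, so its loss is composed of:

Classification/confidence loss (classification task): BCE, FL, QFL, VFL
Localization loss (regression task): IoU, GIoU, DIoU, CIoU, DFL (formulated as classification)

Contents

Classification/Confidence Losses
Localization Losses

Classification/Confidence Losses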

BCE

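Binary cross-entropy (BCE) is a loss function for binary classification. It measures the gap between the predicted class and the actual class of a target, and is computed as:
${BCE}(y,p) = - y\log (p) - (1 - y)\log (1 - p)$
where $y$ is the actual class of the target (0 or 1) and $p$ is the predicted probability of the positive class, in $[0, 1]$. BCE loss can equivalently be written as:
${BCE}(p_t) = - \log (p_t)$
$p_t= \begin{cases} p, & y=1 \\ 1-p, & \text{otherwise} \end{cases}$
For multi-class tasks, the labels can be one-hot encoded so that the problem decomposes into several binary classification problems, each of which uses BCE loss.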
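The two forms of the formula can be checked numerically. The following is a minimal scalar sketch in plain Python, not the PyTorch implementation; the helper names `bce` and `bce_pt` and the sample numbers are assumptions made for illustration.

```python
import math


def bce(y: float, p: float, eps: float = 1e-12) -> float:
    """BCE(y, p) for a label y in {0, 1} and a predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp so that log(0) never occurs
    return -y * math.log(p) - (1.0 - y) * math.log(1.0 - p)


def bce_pt(y: float, p: float, eps: float = 1e-12) -> float:
    """The same loss written through p_t: BCE(p_t) = -log(p_t)."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, eps))


# Multi-class case: a one-hot target turns 3 classes into 3 binary problems.
target = [0, 1, 0]
probs = [0.1, 0.7, 0.2]
per_class = [bce(y, p) for y, p in zip(target, probs)]
mean_loss = sum(per_class) / len(per_class)
print(per_class, mean_loss)
```

Both helpers return the same value for any label in {0, 1}, which is why the compact $p_t$ notation is used in the sections that follow.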
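BCE is available in PyTorch as follows:

import torch
import torch.nn.functional as F

# Functional API:
#   binary_cross_entropy_with_logits: Sigmoid + BCE (input is raw logits)
#   binary_cross_entropy: BCE only (input is already a probability)
logits = torch.randn(8, 3)                    # example predictions: 8 samples x 3 classes
target = torch.randint(0, 2, (8, 3)).float()  # example ground-truth labels (0 or 1)
loss = F.binary_cross_entropy_with_logits(
    input=logits,       # predictions (logits)
    target=target,      # ground-truth labels
    weight=None,        # optional rescaling weight applied to the per-element loss
    size_average=None,  # deprecated
    reduce=None,        # deprecated
    pos_weight=None,    # optional weight for positive samples (length = number of classes)
    reduction='mean',   # average ('mean') or sum ('sum') over all elements, or 'none'
)

# Module API (calls the function above to compute the loss)
criterion = torch.nn.BCEWithLogitsLoss(weight=None, pos_weight=None, reduction='mean')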

Focal Loss

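Focal Loss (FL) was proposed in the paper Focal Loss for Dense Object Detection. It extends BCE loss with weighting factors that serve two purposes:

Addressing the imbalance between positive and negative samples: detection images contain a large amount of background (negative samples), so actual objects (positive samples) make up only a small fraction.
${BCE}(p_t) = - \alpha_t \log (p_t)$
$\alpha_t= \begin{cases} \alpha, & y=1 \\ 1-\alpha, & \text{otherwise} \end{cases}$
where $\alpha$ controls the relative weight of positive and negative samples.
Down-weighting easy samples, so that training focuses on hard samples:
${FL}(p_t) = -(1-p_t)^\gamma \log (p_t)$
where $\gamma$ controls the relative weight of easy and hard samples: the larger $p_t$ is, the easier the sample and the smaller its contribution to the loss.
Combining the two gives the final Focal Loss:
${FL}(p_t) = -\alpha_t(1-p_t)^\gamma \log (p_t)$

${FL}(y,p) = - \alpha(1-p)^\gamma y\log (p) -(1-\alpha) p^\gamma(1 - y)\log (1 - p)$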
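To see the effect of the modulating factor, the formula can be evaluated for one easy and one hard positive sample. This is a scalar sketch in plain Python under stated assumptions: the function name `focal_loss` and the probabilities 0.9 and 0.1 are illustrative, and $p$ is assumed to lie strictly between 0 and 1.

```python
import math


def focal_loss(y: int, p: float, alpha: float = 0.25, gamma: float = 1.5) -> float:
    """FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) for one sample."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(1, 0.9)  # well-classified positive
hard = focal_loss(1, 0.1)  # badly classified positive
ce_easy = -math.log(0.9)   # plain cross-entropy for the same two samples
ce_hard = -math.log(0.1)

# The hard/easy ratio is much larger under Focal Loss than under plain CE.
print(hard / easy, ce_hard / ce_easy)
```

Setting `gamma=0.0` and `alpha=1.0` reduces the function to plain cross-entropy for positive samples, which is a quick sanity check on the formula.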

Focal Loss can be implemented as follows:

import torch
import torch.nn as nn


class FocalLoss(nn.Module):
    '''
    Used in place of the original BCEcls (classification loss) and BCEobj (confidence loss).
    Advantages:
        1. Addresses the positive/negative (foreground/background) imbalance in one-stage detectors.
        2. Down-weights easy samples so that the loss focuses on hard samples.
    '''
    def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
        super(FocalLoss, self).__init__()
        self.loss_fcn = loss_fcn  # must be nn.BCEWithLogitsLoss = sigmoid + BCELoss
        self.gamma = gamma  # γ reduces the contribution of easy samples to the loss
        self.alpha = alpha  # α balances the unequal numbers of positive and negative samples
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'  # the focal factors must be applied per element

    def forward(self, pred, true):
        loss = self.loss_fcn(pred, true)  # BCE(p_t) = -log(p_t)
        pred_prob = torch.sigmoid(pred)
        p_t = true * pred_prob + (1 - true) * (1 - pred_prob)  # p_t
        alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)  # α_t
        modulating_factor = (1.0 - p_t) ** self.gamma  # (1-p_t)^γ

        loss *= alpha_factor * modulating_factor  # multiply the loss by both factors

        # Apply the reduction of the wrapped loss (default: mean)
        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:  # 'none'
            return loss

Quality Focal Loss

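Quality Focal Loss (QFL) was proposed in the paper Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection. It differs from Focal Loss mainly in two ways:

It combines the classification score with the IoU of the predicted box, introducing an IoU-aware soft label so that the target label becomes a continuous value
- soft one-hot label (IoU label): the IoU value marks the target class, 0 marks the other classes
- one-hot label (category label): 1 marks the target class, 0 marks the other classes
It changes the easy/hard weighting of Focal Loss, using the distance between the actual label and the predicted score to measure how hard a sample is

The Quality Focal Loss formula is:
${QFL}(y,p) = - |y-p|^\gamma(\alpha y\log (p) + (1-\alpha)(1 - y)\log (1 - p))$
where $y$ is the soft one-hot label (IoU label) of the target, $p$ is the predicted score, and the hyperparameters $\gamma$ and $\alpha$ play the same roles as in Focal Loss. The closer $p$ is to $y$, the easier the sample and the smaller its contribution to the loss.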
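A small numeric check makes the role of $|y-p|^\gamma$ concrete. The sketch below evaluates the formula above for a single score in plain Python; the function name `qfl` and the sample values are assumptions for illustration, and $p$ is assumed to lie strictly between 0 and 1.

```python
import math


def qfl(y: float, p: float, alpha: float = 0.25, gamma: float = 1.5) -> float:
    """QFL(y, p) for a soft label y in [0, 1] and a predicted score p in (0, 1)."""
    weighted_ce = alpha * y * math.log(p) + (1.0 - alpha) * (1.0 - y) * math.log(1.0 - p)
    return -abs(y - p) ** gamma * weighted_ce


iou_label = 0.8                # soft label: IoU between the predicted and ground-truth box
print(qfl(iou_label, 0.8))     # score equals the label: the loss vanishes
print(qfl(iou_label, 0.6))     # small gap: small loss
print(qfl(iou_label, 0.2))     # large gap: large loss
```

Unlike Focal Loss, the minimum is reached when the score matches the IoU label, not when it saturates at 1.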
Quality Focal Loss can be implemented as follows:

import torch
import torch.nn as nn


class QualityFocalLoss(nn.Module):
    '''
    Changes relative to Focal Loss:
        1. The IoU between the target and the prediction is used as the label (soft label).
        2. The easy/hard sample weight is computed differently.
    '''
    def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
        super(QualityFocalLoss, self).__init__()
        self.loss_fcn = loss_fcn  # must be nn.BCEWithLogitsLoss() = BCE + Sigmoid
        self.gamma = gamma  # γ reduces the contribution of easy samples to the loss
        self.alpha = alpha  # α balances the unequal numbers of positive and negative samples
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'

    def forward(self, pred, true):
        loss = self.loss_fcn(pred, true)  # per-element BCE against the soft label

        pred_prob = torch.sigmoid(pred)
        alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)  # α_t
        modulating_factor = torch.abs(true - pred_prob) ** self.gamma  # |y-p|^γ
        loss *= alpha_factor * modulating_factor  # multiply the loss by both factors

        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:  # 'none'
            return loss

VariFocal Loss

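VariFocal Loss (VFL) was proposed in the paper VarifocalNet: An IoU-aware Dense Object Detector. It differs from Focal Loss mainly in two ways:

Focal Loss down-weights easy samples among both positives and negatives, which also weakens the contribution of the scarce positive samples; VariFocal Loss applies the down-weighting to negative samples only
It uses the soft labels and the easy/hard sample weighting of Quality Focal Loss

The VariFocal Loss formula is:

${VFL}(y,p) = \begin{cases} -y(y\log (p)+(1-y)\log (1-p)), & y>0 \\ -\alpha p^\gamma \log (1-p), & y=0 \end{cases}$
where $\gamma$ reduces the contribution of easy negative samples to the loss and $\alpha$ prevents the negatives from being suppressed too strongly.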
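The asymmetric treatment of positives and negatives can be illustrated with a scalar sketch in plain Python. The function name `vfl` and the sample values are assumptions for illustration; $p$ is assumed to lie strictly between 0 and 1.

```python
import math


def vfl(y: float, p: float, alpha: float = 0.75, gamma: float = 1.5) -> float:
    """VFL(y, p) for an IoU-aware label y in [0, 1] and a predicted score p in (0, 1)."""
    if y > 0:
        # Positive sample: BCE against the soft label, weighted by the label itself.
        return -y * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    # Negative sample: BCE down-weighted by alpha * p^gamma.
    return -alpha * p ** gamma * math.log(1.0 - p)


print(vfl(0.0, 0.1), vfl(0.0, 0.9))  # easy negative vs. hard negative
print(vfl(0.3, 0.5), vfl(0.9, 0.5))  # low-IoU positive vs. high-IoU positive
```

At the same predicted score, the positive sample with the higher IoU label receives the larger loss, so high-quality positives dominate training.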
VariFocal Loss can be implemented as follows:

import torch
import torch.nn as nn


class VariFocalLoss(nn.Module):
    '''
    Change relative to Focal Loss:
        easy-sample down-weighting is applied to negative samples only
    '''
    def __init__(self, loss_fcn, gamma=1.5, alpha=0.75):
        super(VariFocalLoss, self).__init__()
        self.loss_fcn = loss_fcn  # must be nn.BCEWithLogitsLoss() = BCE + sigmoid
        self.gamma = gamma  # gamma reduces the contribution of easy negative samples to the loss
        self.alpha = alpha  # alpha prevents negative samples from being suppressed too strongly
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'

    def forward(self, pred, true):
        loss = self.loss_fcn(pred, true)  # per-element BCE against the soft label

        pred_prob = torch.sigmoid(pred)
        neg_weight = self.alpha * torch.abs(pred_prob - true) ** self.gamma  # negatives (y = 0): αp^γ
        # Positives (y > 0) use the label y as weight; vectorized instead of a Python loop
        focal_weight = torch.where(true > 0.0, true, neg_weight)
        loss *= focal_weight  # multiply the loss by the weight

        # Apply the reduction of the wrapped loss (default: mean)
        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:  # 'none'
            return loss

Localization Losses

IoU Loss

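IoU (intersection over union) measures how well a predicted box matches the ground-truth box. It is computed as:
$IoU = {{|A \cap B|} \over {|A \cup B|}} \in [0,1]$
where $|A \cap B|$ is the area of the intersection of boxes A and B, and $|A \cup B|$ is the area of their union.
The IoU Loss is then:
$L_{IoU} = 1 - IoU$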
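A worked example with two small boxes shows the arithmetic. This is a plain-Python sketch for a single pair of boxes in corner form $(x_1, y_1, x_2, y_2)$; the function name `iou_xyxy` and the coordinates are assumptions for illustration.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


box_a = (0.0, 0.0, 2.0, 2.0)  # area 4
box_b = (1.0, 1.0, 3.0, 3.0)  # area 4, overlaps box_a in a 1 x 1 square
iou = iou_xyxy(box_a, box_b)  # intersection 1, union 4 + 4 - 1 = 7
loss = 1.0 - iou
print(iou, loss)
```

Any two boxes that do not overlap give an IoU of 0 and a loss of 1 regardless of how far apart they are, which is the limitation the later variants address.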
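IoU can be computed as follows:

import torch


def bbox_iou(box1, box2, xywh=True, eps=1e-7):
    # Convert xywh boxes to xyxy (top-left, bottom-right) form
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # boxes are already in xyxy form
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)
    # Union area (eps avoids division by zero)
    union = w1 * h1 + w2 * h2 - inter + eps
    # IoU
    iou = inter / union
    return iou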

GIoU Loss

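GIoU was proposed in the paper Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression. It is computed as:
$GIoU = IoU - {{|C \setminus (A \cup B)|} \over {|C|}} \in [-1,1]$
where C is the smallest rectangle that encloses both A and B.
Compared with IoU, GIoU accounts for the non-overlapping region as well as the overlapping region, so it reflects how well the two boxes coincide more faithfully.
The GIoU Loss is then:
$L_{GIoU} = 1 - GIoU$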
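The benefit is easiest to see for boxes that do not overlap at all: IoU is 0 in both cases below, while GIoU still distinguishes a near miss from a far miss. This is a plain-Python sketch for one pair of boxes in corner form; the function name `giou_xyxy` and the coordinates are assumptions for illustration.

```python
def giou_xyxy(a, b):
    """GIoU of two boxes given as (x1, y1, x2, y2) tuples."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter = (max(0.0, min(ax2, bx2) - max(ax1, bx1))
             * max(0.0, min(ay2, by2) - max(ay1, by1)))
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Smallest enclosing rectangle C
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c_area - union) / c_area


target = (0.0, 0.0, 1.0, 1.0)
giou_near = giou_xyxy(target, (2.0, 0.0, 3.0, 1.0))   # C has area 3, union 2
giou_far = giou_xyxy(target, (9.0, 0.0, 10.0, 1.0))   # C has area 10, union 2
print(giou_near, giou_far)
```

Because the far box gets the lower GIoU, the loss $1 - GIoU$ still provides a useful training signal when the boxes are disjoint.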
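GIoU can be computed as follows:

import torch


def bbox_giou(box1, box2, xywh=True, eps=1e-7):
    # Convert xywh boxes to xyxy (top-left, bottom-right) form
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # boxes are already in xyxy form
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union area (eps avoids division by zero)
    union = w1 * h1 + w2 * h2 - inter + eps

    # IoU
    iou = inter / union
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # width of the smallest enclosing rectangle
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # height of the smallest enclosing rectangle
    c_area = cw * ch + eps  # area of the smallest enclosing rectangle
    giou = iou - (c_area - union) / c_area
    return giou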

DIoU Loss

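DIoU was proposed in the paper Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. It is computed as:
$DIoU = IoU - {{{\rho ^2}(b,{b^{gt}})} \over {{c^2}}}$
where $b$ and $b^{gt}$ are the center points of the predicted box and the ground-truth box, $\rho$ is the Euclidean distance between them, and $c$ is the diagonal length of the smallest enclosing rectangle.
Compared with IoU and GIoU, DIoU considers the distance between the two boxes in addition to their overlap.
The DIoU Loss is then:
$L_{DIoU} = 1 - DIoU$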
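The distance term matters when one box lies entirely inside the other: the overlap is the same wherever the inner box sits, so only the center distance can tell the placements apart. The following plain-Python sketch works through that case; the function name `diou_xyxy` and the coordinates are assumptions for illustration.

```python
def diou_xyxy(a, b):
    """DIoU of two boxes given as (x1, y1, x2, y2) tuples."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter = (max(0.0, min(ax2, bx2) - max(ax1, bx1))
             * max(0.0, min(ay2, by2) - max(ay1, by1)))
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    cw = max(ax2, bx2) - min(ax1, bx1)  # enclosing rectangle width
    ch = max(ay2, by2) - min(ay1, by1)  # enclosing rectangle height
    c2 = cw ** 2 + ch ** 2              # squared diagonal of the enclosing rectangle
    rho2 = ((bx1 + bx2 - ax1 - ax2) ** 2 + (by1 + by2 - ay1 - ay2) ** 2) / 4  # squared center distance
    return iou - rho2 / c2


gt = (0.0, 0.0, 4.0, 4.0)
diou_center = diou_xyxy(gt, (1.0, 1.0, 3.0, 3.0))  # 2 x 2 box centered inside gt
diou_corner = diou_xyxy(gt, (0.0, 0.0, 2.0, 2.0))  # same box pushed into a corner
print(diou_center, diou_corner)
```

Both placements have IoU 0.25, but the corner placement is penalized by $\rho^2 / c^2 = 2 / 32$, so the loss pulls the predicted center toward the ground-truth center.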
DIoU can be computed as follows:

import torch


def bbox_diou(box1, box2, xywh=True, eps=1e-7):
    '''
    Compute DIoU
    '''
    # Returns the DIoU of box1(1,4) to box2(n,4)

    # Convert xywh boxes to xyxy (top-left, bottom-right) form
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # boxes are already in xyxy form
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union area (eps avoids division by zero)
    union = w1 * h1 + w2 * h2 - inter + eps
    # IoU
    iou = inter / union

    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # width of the smallest enclosing rectangle
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # height of the smallest enclosing rectangle
    c2 = cw ** 2 + ch ** 2 + eps  # squared diagonal of the smallest enclosing rectangle
    rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4  # squared distance between box centers
    diou = iou - rho2 / c2

    return diou

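CIoU Loss

CIoU was proposed by the DIoU authors in the same paper. On top of DIoU, CIoU also takes the aspect ratio of the boxes into account. It is computed as:
$CIoU = IoU - ({{{\rho ^2}(b,{b^{gt}})} \over {{c^2}}}+\alpha v)$

$v = {4 \over {{\pi ^2}}}{{(\arctan {{{w^{gt}}} \over {{h^{gt}}}} - \arctan {w \over h})}^2}$

${\alpha = {v \over {(1 - IoU) + v}}}$

where $(w^{gt},h^{gt})$ and $(w, h)$ are the width and height of the ground-truth box and of the predicted box.
The CIoU Loss is then:
$L_{CIoU} = 1 - CIoU$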
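To isolate the aspect-ratio term, the example below uses two boxes with the same center, so that $\rho^2 = 0$ and DIoU equals IoU; any remaining penalty comes from $\alpha v$. This is a plain-Python sketch for one pair of boxes; the function name `ciou_xyxy`, the small `eps` added to the denominator of $\alpha$, and the coordinates are assumptions for illustration.

```python
import math


def ciou_xyxy(pred, gt, eps=1e-7):
    """CIoU of a predicted and a ground-truth box, both (x1, y1, x2, y2) tuples."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1
    inter = (max(0.0, min(px2, gx2) - max(px1, gx1))
             * max(0.0, min(py2, gy2) - max(py1, gy1)))
    union = pw * ph + gw * gh - inter
    iou = inter / union
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2
    rho2 = ((gx1 + gx2 - px1 - px2) ** 2 + (gy1 + gy2 - py1 - py2) ** 2) / 4
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1 - iou) + v + eps)  # eps keeps the division defined when iou == 1
    return iou - (rho2 / c2 + alpha * v)


gt = (0.0, 0.0, 4.0, 2.0)     # wide box, 4 x 2, center (2, 1)
pred = (1.0, -1.0, 3.0, 3.0)  # tall box, 2 x 4, same center
ciou = ciou_xyxy(pred, gt)    # IoU = DIoU = 1/3 here; CIoU is lower
print(ciou)
```

When the two boxes share the same aspect ratio, $v = 0$ and CIoU reduces to DIoU.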
CIoU can be computed as follows:

import math

import torch


def bbox_ciou(box1, box2, xywh=True, eps=1e-7):
    '''
    Compute CIoU
    '''
    # Returns the CIoU of box1(1,4) to box2(n,4)

    # Convert xywh boxes to xyxy (top-left, bottom-right) form
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # boxes are already in xyxy form
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, (b1_y2 - b1_y1).clamp(eps)
        w2, h2 = b2_x2 - b2_x1, (b2_y2 - b2_y1).clamp(eps)

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union area (eps avoids division by zero)
    union = w1 * h1 + w2 * h2 - inter + eps
    # IoU
    iou = inter / union

    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # width of the smallest enclosing rectangle
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # height of the smallest enclosing rectangle
    c2 = cw ** 2 + ch ** 2 + eps  # squared diagonal of the smallest enclosing rectangle
    rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4  # squared distance between box centers
    v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)  # aspect-ratio consistency term
    alpha = v / (v - iou + (1 + eps))  # α = v / ((1 - IoU) + v)
    ciou = iou - (rho2 / c2 + v * alpha)
    return ciou

Distribution Focal Loss

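Distribution Focal Loss (DFL) was proposed by the QFL authors in the same paper, Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection. It targets the ambiguity of object boundaries under occlusion by turning the box regression task $(left, top, right, bottom)$ into a classification task: instead of predicting a single value per side, the model predicts a sequence (a discrete distribution). Figure 1 (from the DFL paper) shows how the predicted information changes. The final value is the weighted sum over the sequence. Suppose the prediction for $left$ of some target is the sequence $\{y_0,y_1,y_2,...,y_{n-1}\}, y_i \in [0,1]$; it is then converted to:
$\hat{y} = \sum\limits_{i = 0}^{n-1} {i{y_i}}$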
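Figure 1. Distribution-based prediction of box information: (a) label prediction, (b) sequence prediction.

Different distributions can produce the same prediction, as Figure 2 (from the DFL paper) shows. The preferred one is distribution (3) in that figure, whose central values lie as close as possible to the ground-truth label. To obtain such a distribution, the authors propose the DFL loss.

Figure 2. Different distribution cases.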

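The Distribution Focal Loss formula is:
$DFL({y_i},{y_{i + 1}}) = - (i + 1- y)\log ({y_i}) - (y - i)\log ({y_{i + 1}})$
where $y$ is the ground-truth label, the variables satisfy $i \le y \le i+1$ with $i$ a non-negative integer, and $y_i, y_{i+1} \in [0,1]$ are the predicted probabilities of bins $i$ and $i+1$.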
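The formula rewards distributions that put their mass on the two bins surrounding the label. The plain-Python sketch below compares two hand-picked distributions that have the same expected value but very different shapes; the names `expected_value`, `dfl`, `peaked`, and `spread`, as well as the numbers, are assumptions for illustration.

```python
import math


def expected_value(probs):
    """Decoded prediction: sum over i of i * y_i."""
    return sum(i * p for i, p in enumerate(probs))


def dfl(probs, y):
    """DFL for one box side, given bin probabilities and a label y with i <= y < i + 1."""
    i = int(math.floor(y))  # left bin index
    w_left = (i + 1) - y    # weight of bin i
    w_right = y - i         # weight of bin i + 1
    return -w_left * math.log(probs[i]) - w_right * math.log(probs[i + 1])


y = 2.3
peaked = [0.0, 0.0, 0.7, 0.3, 0.0]  # all mass on the two bins around y
spread = [0.1, 0.2, 0.2, 0.3, 0.2]  # same expectation, mass spread out

print(expected_value(peaked), expected_value(spread))  # both decode to about 2.3
print(dfl(peaked, y), dfl(spread, y))                  # the peaked one has the lower loss
```

The loss is minimized when $y_i = i + 1 - y$ and $y_{i+1} = y - i$, which is exactly the `peaked` distribution here.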

Distribution Focal Loss can be implemented as follows:

import torch.nn as nn
import torch.nn.functional as F


class DistributionFocalLoss(nn.Module):
    '''
    Turns box regression into classification over a sequence of bins
    '''
    def __init__(self, reg_max):
        super(DistributionFocalLoss, self).__init__()
        self.reg_max = reg_max  # number of points in the predicted sequence

    def forward(self, pred, true):
        '''
        pred: [num_gt * 4, reg_max] logits, one row per box side
        true: [num_gt, 4] ltrb targets in bin units; values must lie in [0, reg_max - 1)
        '''
        tl = true.long()  # target left (i)
        tr = tl + 1  # target right (i+1)
        wl = tr - true  # i+1-y
        wr = 1 - wl  # y-i
        # -(i+1-y)log(y_i) - (y-i)log(y_{i+1})
        dfl = (F.cross_entropy(pred, tl.view(-1), reduction='none').view(tl.shape) * wl +
               F.cross_entropy(pred, tr.view(-1), reduction='none').view(tl.shape) * wr).mean(-1, keepdim=True)
        return dfl

Source: https://blog.csdn.net/qq_43676259/article/details/135644084