Code:
objectdetection_script/yolov5-CARAFE.py at master · z1069614715/objectdetection_script (github.com)
Tutorial video:
YOLOV7改进-轻量级上采样算子CARAFE (YOLOv7 improvement: the lightweight upsampling operator CARAFE), bilibili
class CARAFE(nn.Module):
    def __init__(self, c, k_enc=3, k_up=5, c_mid=64, scale=2):
        """ The unofficial implementation of the CARAFE module.
        The details are in "https://arxiv.org/abs/1905.02188".
        Args:
            c: The channel number of the input and the output.
            c_mid: The channel number after compression.
            scale: The expected upsample scale.
            k_up: The size of the reassembly kernel.
            k_enc: The kernel size of the encoder.
        Returns:
            X: The upsampled feature map.
        """
        super(CARAFE, self).__init__()
        self.scale = scale
        # Kernel prediction branch: compress channels, then predict
        # (scale * k_up)^2 reassembly weights per input pixel.
        self.comp = Conv(c, c_mid)
        self.enc = Conv(c_mid, (scale*k_up)**2, k=k_enc, act=False)
        self.pix_shf = nn.PixelShuffle(scale)
        # Reassembly branch: nearest upsampling, then gather the k_up x k_up
        # neighbourhood of every output pixel (dilated by scale).
        self.upsmp = nn.Upsample(scale_factor=scale, mode='nearest')
        self.unfold = nn.Unfold(kernel_size=k_up, dilation=scale,
                                padding=k_up//2*scale)

    def forward(self, X):
        # Shape comments assume the defaults k_up=5, scale=2; m is c_mid.
        b, c, h, w = X.size()
        h_, w_ = h * self.scale, w * self.scale
        W = self.comp(X)                              # b * m * h * w
        W = self.enc(W)                               # b * 100 * h * w
        W = self.pix_shf(W)                           # b * 25 * h_ * w_
        W = torch.softmax(W, dim=1)                   # b * 25 * h_ * w_
        X = self.upsmp(X)                             # b * c * h_ * w_
        X = self.unfold(X)                            # b * 25c * (h_ * w_)
        X = X.view(b, c, -1, h_, w_)                  # b * c * 25 * h_ * w_
        X = torch.einsum('bkhw,bckhw->bchw', [W, X])  # b * c * h_ * w_
        return X
Copy the code above into common.py in the models folder, pasting it at the very bottom of the file.
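To see where the numbers in the shape comments of `forward` come from, the bookkeeping can be sketched in plain Python, without PyTorch. The helper name `carafe_shapes` is made up for this illustration and is not part of the repository. It assumes an odd `k_up`, which is what makes the `nn.Unfold` padding preserve the upsampled height and width.

```python
def carafe_shapes(b, c, h, w, k_up=5, scale=2, c_mid=64):
    """Return (step, shape) pairs for each step of CARAFE.forward.

    Hypothetical helper for illustration only; assumes k_up is odd.
    """
    h_up, w_up = h * scale, w * scale
    k2 = k_up * k_up  # reassembly weights per output pixel
    return [
        ("comp", (b, c_mid, h, w)),                  # channel compression
        ("enc", (b, (scale * k_up) ** 2, h, w)),     # predicted kernels
        ("pixel_shuffle", (b, k2, h_up, w_up)),      # channels / scale^2
        ("softmax", (b, k2, h_up, w_up)),            # normalised over k2
        ("upsample", (b, c, h_up, w_up)),            # nearest upsampling
        ("unfold", (b, c * k2, h_up * w_up)),        # flattened positions
        ("view", (b, c, k2, h_up, w_up)),            # channels first, then k2
        ("einsum", (b, c, h_up, w_up)),              # weighted sum over k2
    ]


# Example: a 256-channel 20x20 feature map with the default k_up=5.
for step, shape in carafe_shapes(1, 256, 20, 20):
    print(f"{step:14s} {shape}")
```

With `k_up=7` the encoder predicts 196 channels and each output pixel is a weighted sum over 49 neighbours, so the cost of the `unfold` step grows with the square of `k_up`.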
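In yolo.py, paste the following code on the line after the one the red arrow points to. It is an extra elif branch in the chain of parse_model that prepares each module's arguments: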
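elif m is CARAFE:
    c2 = ch[f]          # output channels equal the input layer's channels
    args = [c2, *args]  # prepend the channel count to the YAML arguments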
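What this branch does to the arguments can be sketched in isolation. The function `resolve_carafe_args` and the channel list below are hypothetical stand-ins for the `ch`, `f` and `args` variables of `parse_model`, assuming `ch` holds the output channels of the layers parsed so far and `f` is the layer's "from" index.

```python
def resolve_carafe_args(ch, f, yaml_args):
    """Mimic the CARAFE branch: return (constructor args, output channels)."""
    c2 = ch[f]               # CARAFE keeps the channel count of its input
    args = [c2, *yaml_args]  # so the YAML only needs the kernel sizes
    return args, c2


# Hypothetical channel list; the last entry is the layer feeding CARAFE.
ch = [512, 256]
args, c2 = resolve_carafe_args(ch, -1, [1, 5])

# Pair the positional arguments with the CARAFE.__init__ parameter names.
param_names = ("c", "k_enc", "k_up", "c_mid", "scale")
named = dict(zip(param_names, args))
print(args, named)
```

Because the channel count is filled in automatically, a YAML entry such as `[1, 5]` ends up as `CARAFE(256, 1, 5)`, that is `k_enc=1` and `k_up=5`, with `c_mid` and `scale` left at their defaults.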
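Open yolov7.yaml in the training folder under cfg.
In the head section, change every nn.Upsample upsampling layer to CARAFE.
Delete the arguments that follow it: change [None, 2, 'nearest'] to [1, 5] for the first layer and to [1, 7] for the next one. Since the branch in yolo.py prepends the channel count, these two values are passed as k_enc and k_up.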
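The edit to the head can be sketched as a small text rewrite. The excerpt below is illustrative, not the full file: it only assumes the usual `[from, number, module, args]` entry format, and the helper `swap_upsample` is a made-up name. Editing the two lines by hand is just as good; the script mainly shows the before and after.

```python
import re

# Illustrative excerpt: only the two upsampling entries matter here.
head_excerpt = """\
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
"""


def swap_upsample(text, carafe_args):
    """Replace each nn.Upsample entry, in order, with a CARAFE entry."""
    pending = iter(carafe_args)
    pattern = r"nn\.Upsample,\s*\[None,\s*2,\s*'nearest'\]"

    def repl(match):
        # One argument list is consumed per nn.Upsample entry found.
        return "CARAFE, {}".format(next(pending))

    return re.sub(pattern, repl, text)


new_head = swap_upsample(head_excerpt, [[1, 5], [1, 7]])
print(new_head)
```

The helper expects one argument list per `nn.Upsample` entry; if the text contains more entries than lists, `next(pending)` raises `StopIteration` rather than silently reusing a value.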