PyTorch Object Detection (Part 5)

SSD in Practice

1. Start by running someone else's code

https://github.com/amdegroot/ssd.pytorch
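Download the code. The README (a markdown file) has a tutorial, but the code has a few problems.

Summary of problems

This walkthrough uses the VOC2012 training set.

Delete the 2007 entry from the image_sets argument of __init__ in voc0712.py.

Run train.py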

RuntimeError: CUDA out of memory. Tried to allocate 176.00 MiB (GPU 0; 2.00 GiB total capacity; 1.23 GiB already allocated; 107.80 MiB free; 1.24 GiB reserved in total by PyTorch)

The GPU is too small (2 GiB in total), so reduce batch_size.

IndexError: The shape of the mask [8, 8732] at index 0 does not match the shape of the indexed tensor [69856, 1] at index 0

The loss dimensions do not match in ssd.pytorch\layers\modules\multibox_loss.py.

Swap lines 97 and 98 of layers/modules/multibox_loss.py so that the reshape comes first:

loss_c = loss_c.view(num, -1)
loss_c[pos] = 0  # filter out pos boxes for now
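
The IndexError comes from indexing a flattened loss with a 2-D mask. Below is a minimal reproduction, as a sketch only: NumPy arrays stand in for the PyTorch tensors (an assumption, so the snippet runs without a GPU), and the shapes are shrunk from [8, 8732] to [2, 3].

```python
import numpy as np

num, num_priors = 2, 3  # stand-ins for batch size 8 and 8732 priors
# Flattened per-prior loss, shape (6, 1), like the [69856, 1] tensor in the error.
loss_c = np.arange(num * num_priors, dtype=np.float64).reshape(-1, 1)
# Positive-match mask, shape (2, 3), like the [8, 8732] mask in the error.
pos = np.array([[True, False, False],
                [False, False, True]])

try:
    loss_c[pos] = 0  # mask (2, 3) does not match array (6, 1)
    mismatch_raised = False
except IndexError:
    mismatch_raised = True

# Reshape first (the role of loss_c.view(num, -1)), then mask.
loss_c = loss_c.reshape(num, -1)
loss_c[pos] = 0
```

Once the loss has one row per image, the mask and the loss have the same shape and the assignment goes through.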

On line 114, change N = num_pos.data.sum() to:

N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()

Change lines 183 and 184 of train.py to the following, and handle any similar errors the same way:

loc_loss += loss_l.data.item()
conf_loss += loss_c.data.item()

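These are PyTorch version issues; fix them as the error messages suggest.

The loss becomes nan during training

Lower the learning rate.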

StopIteration

next() raises this error once the iterator is exhausted. Re-creating the iterator fixes it:

        try:
            images, targets = next(batch_iterator)
        except StopIteration as e:
            batch_iterator = iter(data_loader)
            images, targets = next(batch_iterator)
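
The same restart logic can be wrapped in a small generator, so the training loop needs no try/except. This is only a sketch: endless_batches is a hypothetical helper name, and a plain list stands in for the DataLoader. Unlike itertools.cycle, it iterates the loader afresh on every pass, so a shuffling loader would reshuffle each epoch.

```python
def endless_batches(loader):
    """Yield batches forever, restarting the loader whenever it runs out.

    Assumes the loader is non-empty; an empty loader would loop without yielding.
    """
    while True:
        # A new iterator is created on each pass of the while loop.
        for batch in loader:
            yield batch


# A plain list stands in for the DataLoader (assumption for illustration).
fake_loader = [("images_0", "targets_0"), ("images_1", "targets_1")]
batch_iterator = endless_batches(fake_loader)
first_five = [next(batch_iterator) for _ in range(5)]
```

With this wrapper, next(batch_iterator) can be called for any number of training iterations without raising StopIteration.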

2. The SSD model in detail

Data preprocessing

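SSD applies extensive data augmentation, which improves detection of small and occluded objects. The pipeline splits into photometric transforms and geometric transforms: photometric transforms leave the image size unchanged, while geometric transforms mainly change the size. Mean subtraction is applied last. Most of the operations are random.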
The augmentation pipeline is defined in augmentations.py:

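class SSDAugmentation(object):
    def __init__(self, size=300, mean=(104, 117, 123)):
        self.mean = mean
        self.size = size
        self.augment = Compose([
            ConvertFromInts(),        # convert pixel values from integers to floats
            ToAbsoluteCoords(),       # convert box coordinates from fractions to absolute pixels
            PhotometricDistort(),     # random brightness, contrast and saturation changes; random channel swap
            Expand(self.mean),        # randomly enlarge the canvas and paste the image at a random offset
            RandomSampleCrop(),       # randomly crop the image
            RandomMirror(),           # random horizontal flip
            ToPercentCoords(),        # convert box coordinates back to fractions
            Resize(self.size),        # resize to a fixed size (300x300 by default)
            SubtractMeans(self.mean)  # subtract the per-channel mean
        ])

    def __call__(self, img, boxes, labels):
        return self.augment(img, boxes, labels)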
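
To make the two coordinate conversions in the pipeline concrete, here is a small sketch. The helper names to_absolute and to_percent are hypothetical; they only mirror what ToAbsoluteCoords and ToPercentCoords are described as doing, and boxes are assumed to be rows of [xmin, ymin, xmax, ymax].

```python
import numpy as np


def to_absolute(boxes, width, height):
    """Scale fractional [xmin, ymin, xmax, ymax] boxes to pixel units."""
    scale = np.array([width, height, width, height], dtype=np.float32)
    return boxes * scale


def to_percent(boxes, width, height):
    """Scale pixel-unit boxes back to fractions of the image size."""
    scale = np.array([width, height, width, height], dtype=np.float32)
    return boxes / scale


# One box covering the middle half horizontally and the lower half vertically.
boxes = np.array([[0.25, 0.5, 0.75, 1.0]], dtype=np.float32)
abs_boxes = to_absolute(boxes, width=400, height=200)
round_trip = to_percent(abs_boxes, width=400, height=200)
```

Working in pixel units through the geometric steps and converting back to fractions just before Resize means the boxes need no further adjustment when the image is scaled to 300x300.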

Examples

Brightness adjustment (a photometric transform)

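from numpy import random  # augmentations.py uses NumPy's random module, not the stdlib one

class RandomBrightness(object):
    def __init__(self, delta=32):
        assert delta >= 0.0
        assert delta <= 255.0
        self.delta = delta

    def __call__(self, image, boxes=None, labels=None):
        # random.randint(2) returns 0 or 1, so this branch runs half the time
        if random.randint(2):
            delta = random.uniform(-self.delta, self.delta)
            image += delta  # in place; the image must already be a float array
        return image, boxes, labels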

With probability 0.5, a single value drawn uniformly from [-32, 32) is added to every element of the image.
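
One consequence worth checking is why ConvertFromInts runs before the photometric step: the in-place add needs a float image. The sketch below uses a fixed delta instead of a random one (an assumption, to keep the result deterministic).

```python
import numpy as np

delta = 12.5  # one scalar offset shared by the whole image
image_u8 = np.full((2, 2, 3), 100, dtype=np.uint8)

try:
    image_u8 += delta  # NumPy refuses to cast the float result back into uint8
    uint8_ok = True
except TypeError:
    uint8_ok = False

# Converting to float first (the role of ConvertFromInts) makes the add valid.
image_f32 = image_u8.astype(np.float32)
image_f32 += delta  # every element shifts by the same amount
```

Note that nothing here clips the result, so values can leave the [0, 255] range after the shift.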

Random lighting noise (a photometric transform)

class RandomLightingNoise(object):
    def __init__(self):
        # all six orderings of the three colour channels
        self.perms = ((0, 1, 2), (0, 2, 1),
                      (1, 0, 2), (1, 2, 0),
                      (2, 0, 1), (2, 1, 0))

    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            swap = self.perms[random.randint(len(self.perms))]
            shuffle = SwapChannels(swap)  # shuffle channels
            image = shuffle(image)
        return image, boxes, labels
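
SwapChannels is not shown above. Assuming it simply reorders the last axis of an H x W x C array, its effect can be sketched with NumPy fancy indexing. Since the identity ordering (0, 1, 2) is one of the six permutations, the channels actually change with probability 0.5 x 5/6 = 5/12.

```python
import numpy as np

perm = (2, 1, 0)  # one of the six permutations: reverse the channel order
image = np.zeros((1, 2, 3), dtype=np.float32)
image[..., 0] = 10  # channel 0
image[..., 1] = 20  # channel 1
image[..., 2] = 30  # channel 2

# Indexing the last axis with the permutation reorders the channels
# and returns a new array; the original image is left untouched.
swapped = image[:, :, perm]
```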

Random expansion (a geometric transform)

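Expansion works as follows. A ratio is drawn uniformly from [1, 4) and the canvas is enlarged by that factor. The original image is pasted at a random position inside the enlarged canvas (the code draws left and top uniformly, so the image is not pinned to the bottom-right corner), the rest of the canvas is filled with the per-channel mean, and the boxes are shifted by the same offset. The transform is applied with probability 0.5.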

class Expand(object):
    def __init__(self, mean):
        self.mean = mean

    def __call__(self, image, boxes, labels):
        # skip the expansion half of the time
        if random.randint(2):
            return image, boxes, labels

        height, width, depth = image.shape
        ratio = random.uniform(1, 4)
        # top-left corner of the original image inside the enlarged canvas
        left = random.uniform(0, width*ratio - width)
        top = random.uniform(0, height*ratio - height)

        # canvas filled with the per-channel mean
        expand_image = np.zeros(
            (int(height*ratio), int(width*ratio), depth),
            dtype=image.dtype)
        expand_image[:, :, :] = self.mean
        expand_image[int(top):int(top + height),
                     int(left):int(left + width)] = image
        image = expand_image

        # shift both box corners by the paste offset
        boxes = boxes.copy()
        boxes[:, :2] += (int(left), int(top))
        boxes[:, 2:] += (int(left), int(top))

        return image, boxes, labels
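
A worked example with the random draws replaced by fixed values makes the box shift easy to follow. This is a sketch: expand_at is a hypothetical helper that takes ratio, left and top as arguments instead of sampling them, and the numbers are chosen only for illustration.

```python
import numpy as np


def expand_at(image, boxes, mean, ratio, left, top):
    """Deterministic variant of the expansion step with a fixed ratio and offset."""
    height, width, depth = image.shape
    canvas = np.zeros((int(height * ratio), int(width * ratio), depth),
                      dtype=image.dtype)
    canvas[:, :, :] = mean                                # fill with per-channel mean
    canvas[top:top + height, left:left + width] = image   # paste the original image
    shifted = boxes.copy()
    shifted[:, :2] += (left, top)                         # move (xmin, ymin)
    shifted[:, 2:] += (left, top)                         # move (xmax, ymax)
    return canvas, shifted


image = np.ones((2, 3, 3), dtype=np.float32)                # 2 rows, 3 columns
boxes = np.array([[0.0, 0.0, 3.0, 2.0]], dtype=np.float32)  # box covers the whole image
canvas, shifted = expand_at(image, boxes, mean=(104, 117, 123),
                            ratio=2, left=1, top=2)
```

Here the 3x2 image becomes a 6x4 canvas and the box moves from (0, 0, 3, 2) to (1, 2, 4, 4). The object now occupies a smaller share of the frame, which is how expansion produces more small-object training samples.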

Random crop (a geometric transform)

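The crop must overlap at least one object's bounding box, with the minimum overlap (IoU) threshold picked at random from [0.1, 0.3, 0.7, 0.9], and it must contain the center of at least one object. Random cropping improves detection of occluded objects.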
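
The acceptance rule described above can be sketched in plain Python. This is a simplified illustration of the stated conditions, not the repository's RandomSampleCrop: the function names are hypothetical, and boxes and crops are assumed to be (xmin, ymin, xmax, ymax) tuples.

```python
def iou(box, crop):
    """Intersection over union of two (xmin, ymin, xmax, ymax) rectangles."""
    inter_w = max(0.0, min(box[2], crop[2]) - max(box[0], crop[0]))
    inter_h = max(0.0, min(box[3], crop[3]) - max(box[1], crop[1]))
    inter = inter_w * inter_h
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_crop = (crop[2] - crop[0]) * (crop[3] - crop[1])
    return inter / (area_box + area_crop - inter)


def center_inside(box, crop):
    """True if the center of box lies strictly inside crop."""
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    return crop[0] < cx < crop[2] and crop[1] < cy < crop[3]


def crop_is_valid(boxes, crop, min_iou):
    """Accept the crop if some box overlaps enough and some box center is inside."""
    enough_overlap = any(iou(b, crop) >= min_iou for b in boxes)
    has_center = any(center_inside(b, crop) for b in boxes)
    return enough_overlap and has_center


boxes = [(0.0, 0.0, 10.0, 10.0)]
crop = (0.0, 0.0, 10.0, 20.0)     # IoU with the box is 100 / 200 = 0.5
accepted_low = crop_is_valid(boxes, crop, min_iou=0.3)
accepted_high = crop_is_valid(boxes, crop, min_iou=0.7)
```

The same crop passes the 0.3 threshold and fails the 0.7 threshold, so drawing the threshold at random varies how much of an object a crop has to keep.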

czkjmohzy
