Data Multiplication: Growing a Dataset While Building Small-Object Images

Preface

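  As the "Watermelon Book" puts it, data and features set the upper bound of machine learning, while models and algorithms only approach that bound. Machine learning therefore usually needs a large, high-quality dataset, and to get one we have tried various data augmentation techniques, for example: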

  1. Increasing or decreasing image brightness
  2. Adding noise or filtering noise out
  3. Mirroring the image

......

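  So how can we use an existing dataset efficiently to expand it quickly (image data plus XML data)? The work splits into two parts: shrinking and tiling the images, and parsing and rewriting the XML files. To that end, this post proposes a shrink-and-tile method that quickly expands a dataset, raising both the quantity and the quality of the data. (It targets VOC-style datasets.)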

Shrinking and Tiling the Image

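  For example, suppose we have an image A.jpg of 1920x1080 pixels and want to arrange it in a 3x3 layout. We shrink image A by a factor of 3 in each dimension (to 640x360), make 9 copies, and stitch those 9 copies into a new image the same size as the original A. The flow is as follows: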

graph TD
A[Read image] --> B[Compute width and height]
B --> C[Shrink width and height by 3x]
C --> D[Make 9 copies]
D --> E[Stitch the 9 copies together]
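
The flow above can be sketched in a few lines. This is a minimal sketch, not the author's implementation: the post does not name an image library, so Pillow is an assumption here, and `shrink_and_tile` is a hypothetical helper name. A synthetic image stands in for A.jpg so the example runs without any file.

```python
from PIL import Image  # assumption: Pillow; the post does not name a library


def shrink_and_tile(image, grid=3):
    """Shrink `image` by `grid` in each dimension and paste grid x grid copies."""
    width, height = image.size
    tile_w, tile_h = width // grid, height // grid  # 1920x1080 -> 640x360
    small = image.resize((tile_w, tile_h))
    # The canvas keeps the original size; if the size is not divisible by
    # `grid`, the leftover pixels on the right and bottom edges stay black.
    canvas = Image.new(image.mode, (width, height))
    for row in range(grid):
        for col in range(grid):
            canvas.paste(small, (col * tile_w, row * tile_h))
    return canvas


if __name__ == "__main__":
    # Synthetic stand-in for A.jpg (hypothetical solid-colour image).
    source = Image.new("RGB", (1920, 1080), (200, 30, 30))
    tiled = shrink_and_tile(source)
    print(tiled.size)  # (1920, 1080)
```

With a real file, `Image.open("A.jpg")` would replace the synthetic image, and `tiled.save(...)` would write the result.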

Sample original image without annotations: s.jpg
Sample original image with annotations: image.png

  The annotated sample image shows the following bounding boxes and label numbers, and the matching XML file is given below:
label1: eyes
label2: nose
label3: blush
label4: mouth

<annotation>
   <folder>11</folder>
   <filename>s.jpg</filename>
   <path>C:\Users\kiven\Desktop\11\s.jpg</path>
   <source>
      <database>Unknown</database>
   </source>
   <size>
      <width>1920</width>
      <height>1080</height>
      <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
      <name>1</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>365</xmin>
         <ymin>467</ymin>
         <xmax>676</xmax>
         <ymax>660</ymax>
      </bndbox>
   </object>
   <object>
      <name>1</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>1221</xmin>
         <ymin>414</ymin>
         <xmax>1530</xmax>
         <ymax>610</ymax>
      </bndbox>
   </object>
   <object>
      <name>2</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>938</xmin>
         <ymin>714</ymin>
         <xmax>1055</xmax>
         <ymax>769</ymax>
      </bndbox>
   </object>
   <object>
      <name>3</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>149</xmin>
         <ymin>819</ymin>
         <xmax>535</xmax>
         <ymax>964</ymax>
      </bndbox>
   </object>
   <object>
      <name>3</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>1401</xmin>
         <ymin>697</ymin>
         <xmax>1785</xmax>
         <ymax>848</ymax>
      </bndbox>
   </object>
   <object>
      <name>4</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>782</xmin>
         <ymin>905</ymin>
         <xmax>1226</xmax>
         <ymax>1042</ymax>
      </bndbox>
   </object>
</annotation>

Image after shrinking and tiling: s.jpg

Parsing and Rebuilding the XML

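  Parsing the XML file that belongs to the original image makes it clear that only the object part of the new XML needs to change; the other parts can stay as they are. When modifying the object part, keep the following in mind: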

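  1. A given object name may appear once or several times in the original image, so in the shrunk and tiled image each name must stay paired with its own coordinates;
  2. The pose, truncated, and difficult values are the same for every object, so they do not need to change;
  3. The coordinates for each name must be computed as integers after scaling and tiling, and must also be written back as integers.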
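
The coordinate mapping behind point 3 can be sketched as follows. This is an illustration under stated assumptions: `tile_boxes` is a hypothetical helper name, and the use of floor division is inferred from the sample values in this post (for example, xmin 365 becomes 121), not from the author's code. The tile offsets are 640 and 360 pixels for a 1920x1080 image.

```python
def tile_boxes(box, image_size, grid=3):
    """Map one (xmin, ymin, xmax, ymax) box to its grid * grid tiled copies."""
    width, height = image_size
    tile_w, tile_h = width // grid, height // grid
    # Floor division keeps every coordinate an int (assumption, see lead-in).
    xmin, ymin, xmax, ymax = (value // grid for value in box)
    return [
        (xmin + col * tile_w, ymin + row * tile_h,
         xmax + col * tile_w, ymax + row * tile_h)
        for row in range(grid)  # rows first, top to bottom
        for col in range(grid)  # then columns, left to right
    ]


# First "1" box from the original annotation in this post.
boxes = tile_boxes((365, 467, 676, 660), (1920, 1080))
print(boxes[0])  # (121, 155, 225, 220)
print(boxes[1])  # (761, 155, 865, 220)
print(boxes[8])  # (1401, 875, 1505, 940)
```

Iterating rows in the outer loop and columns in the inner loop reproduces the order in which the copies appear in the augmented XML shown in this post.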

Program logic:

graph TD
A[Read annotation file] --> B[Parse parameters]
B --> C[Get non-object parameters]
B --> D[Get object parameters]
C --> F[Generate and fill XML]
D --> E[Compute names and their coordinates]
E --> F
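
The logic above could look like this with the standard library's `xml.etree.ElementTree`. It is a sketch only: the post does not show its parsing code, `rebuild_annotation` is a hypothetical name, and the inline `SAMPLE` is a trimmed copy of the original annotation with one object. This version copies every non-object element unchanged.

```python
import copy
import xml.etree.ElementTree as ET


def rebuild_annotation(root, grid=3):
    """Return a new <annotation> with every object repeated on a grid x grid layout."""
    size = root.find("size")
    tile_w = int(size.findtext("width")) // grid
    tile_h = int(size.findtext("height")) // grid

    new_root = ET.Element(root.tag)
    for child in root:
        if child.tag != "object":
            # Non-object parts (folder, filename, size, ...) are copied as-is.
            new_root.append(copy.deepcopy(child))
            continue
        for row in range(grid):
            for col in range(grid):
                # The copy keeps name, pose, truncated and difficult unchanged.
                obj = copy.deepcopy(child)
                box = obj.find("bndbox")
                for tag, offset in (("xmin", col * tile_w), ("ymin", row * tile_h),
                                    ("xmax", col * tile_w), ("ymax", row * tile_h)):
                    node = box.find(tag)
                    node.text = str(int(node.text) // grid + offset)
                new_root.append(obj)
    return new_root


SAMPLE = """<annotation>
  <filename>s.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>2</name><pose>Unspecified</pose><truncated>0</truncated><difficult>0</difficult>
    <bndbox><xmin>938</xmin><ymin>714</ymin><xmax>1055</xmax><ymax>769</ymax></bndbox>
  </object>
</annotation>"""

new_root = rebuild_annotation(ET.fromstring(SAMPLE))
print(len(new_root.findall("object")))  # 9
```

For files on disk, `ET.parse(path).getroot()` would supply `root`, and `ET.ElementTree(new_root).write(...)` would save the result.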

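Annotated image after the shrink-and-tile transform: image.png

  In every small tile the eyes, nose, blush, and mouth are all annotated, with none missed. Checking in LabelImg also shows no errors, and every label number matches, so this round of data augmentation is a success.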
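
As a complement to the visual check in LabelImg, a small script can confirm two simple properties of the new file: the object count is 9 times the original, and every box lies inside the image. This is a hedged sketch, not part of the original workflow; `find_problems` and the inline sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_problems(tiled_root, original_object_count, grid=3):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    width = int(tiled_root.findtext("size/width"))
    height = int(tiled_root.findtext("size/height"))
    objects = tiled_root.findall("object")
    expected = original_object_count * grid * grid
    if len(objects) != expected:
        problems.append(f"expected {expected} objects, found {len(objects)}")
    for index, obj in enumerate(objects):
        xmin, ymin, xmax, ymax = (
            int(obj.findtext(f"bndbox/{tag}"))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append(f"object {index} has a box outside the image")
    return problems


# Hypothetical one-object file, checked as if grid were 1.
CHECK_SAMPLE = """<annotation>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object><name>4</name>
    <bndbox><xmin>1540</xmin><ymin>1021</ymin><xmax>1688</xmax><ymax>1067</ymax></bndbox>
  </object>
</annotation>"""

print(find_problems(ET.fromstring(CHECK_SAMPLE), 1, grid=1))  # []
```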

XML file after augmentation:

<annotation>
    <folder>11</folder>
    <filename>s.jpg</filename>
    <path>C:\Users\kiven\Desktop\11\s.jpg</path>
    <size>
        <width>1920</width>
        <height>1080</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>121</xmin>
            <ymin>155</ymin>
            <xmax>225</xmax>
            <ymax>220</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>761</xmin>
            <ymin>155</ymin>
            <xmax>865</xmax>
            <ymax>220</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1401</xmin>
            <ymin>155</ymin>
            <xmax>1505</xmax>
            <ymax>220</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>121</xmin>
            <ymin>515</ymin>
            <xmax>225</xmax>
            <ymax>580</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>761</xmin>
            <ymin>515</ymin>
            <xmax>865</xmax>
            <ymax>580</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1401</xmin>
            <ymin>515</ymin>
            <xmax>1505</xmax>
            <ymax>580</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>121</xmin>
            <ymin>875</ymin>
            <xmax>225</xmax>
            <ymax>940</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>761</xmin>
            <ymin>875</ymin>
            <xmax>865</xmax>
            <ymax>940</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1401</xmin>
            <ymin>875</ymin>
            <xmax>1505</xmax>
            <ymax>940</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>407</xmin>
            <ymin>138</ymin>
            <xmax>510</xmax>
            <ymax>203</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1047</xmin>
            <ymin>138</ymin>
            <xmax>1150</xmax>
            <ymax>203</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1687</xmin>
            <ymin>138</ymin>
            <xmax>1790</xmax>
            <ymax>203</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>407</xmin>
            <ymin>498</ymin>
            <xmax>510</xmax>
            <ymax>563</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1047</xmin>
            <ymin>498</ymin>
            <xmax>1150</xmax>
            <ymax>563</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1687</xmin>
            <ymin>498</ymin>
            <xmax>1790</xmax>
            <ymax>563</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>407</xmin>
            <ymin>858</ymin>
            <xmax>510</xmax>
            <ymax>923</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1047</xmin>
            <ymin>858</ymin>
            <xmax>1150</xmax>
            <ymax>923</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1687</xmin>
            <ymin>858</ymin>
            <xmax>1790</xmax>
            <ymax>923</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>312</xmin>
            <ymin>238</ymin>
            <xmax>351</xmax>
            <ymax>256</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>952</xmin>
            <ymin>238</ymin>
            <xmax>991</xmax>
            <ymax>256</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1592</xmin>
            <ymin>238</ymin>
            <xmax>1631</xmax>
            <ymax>256</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>312</xmin>
            <ymin>598</ymin>
            <xmax>351</xmax>
            <ymax>616</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>952</xmin>
            <ymin>598</ymin>
            <xmax>991</xmax>
            <ymax>616</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1592</xmin>
            <ymin>598</ymin>
            <xmax>1631</xmax>
            <ymax>616</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>312</xmin>
            <ymin>958</ymin>
            <xmax>351</xmax>
            <ymax>976</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>952</xmin>
            <ymin>958</ymin>
            <xmax>991</xmax>
            <ymax>976</ymax>
        </bndbox>
    </object>
    <object>
        <name>2</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1592</xmin>
            <ymin>958</ymin>
            <xmax>1631</xmax>
            <ymax>976</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>49</xmin>
            <ymin>273</ymin>
            <xmax>178</xmax>
            <ymax>321</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>689</xmin>
            <ymin>273</ymin>
            <xmax>818</xmax>
            <ymax>321</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1329</xmin>
            <ymin>273</ymin>
            <xmax>1458</xmax>
            <ymax>321</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>49</xmin>
            <ymin>633</ymin>
            <xmax>178</xmax>
            <ymax>681</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>689</xmin>
            <ymin>633</ymin>
            <xmax>818</xmax>
            <ymax>681</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1329</xmin>
            <ymin>633</ymin>
            <xmax>1458</xmax>
            <ymax>681</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>49</xmin>
            <ymin>993</ymin>
            <xmax>178</xmax>
            <ymax>1041</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>689</xmin>
            <ymin>993</ymin>
            <xmax>818</xmax>
            <ymax>1041</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1329</xmin>
            <ymin>993</ymin>
            <xmax>1458</xmax>
            <ymax>1041</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>467</xmin>
            <ymin>232</ymin>
            <xmax>595</xmax>
            <ymax>282</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1107</xmin>
            <ymin>232</ymin>
            <xmax>1235</xmax>
            <ymax>282</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1747</xmin>
            <ymin>232</ymin>
            <xmax>1875</xmax>
            <ymax>282</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>467</xmin>
            <ymin>592</ymin>
            <xmax>595</xmax>
            <ymax>642</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1107</xmin>
            <ymin>592</ymin>
            <xmax>1235</xmax>
            <ymax>642</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1747</xmin>
            <ymin>592</ymin>
            <xmax>1875</xmax>
            <ymax>642</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>467</xmin>
            <ymin>952</ymin>
            <xmax>595</xmax>
            <ymax>1002</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1107</xmin>
            <ymin>952</ymin>
            <xmax>1235</xmax>
            <ymax>1002</ymax>
        </bndbox>
    </object>
    <object>
        <name>3</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1747</xmin>
            <ymin>952</ymin>
            <xmax>1875</xmax>
            <ymax>1002</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>260</xmin>
            <ymin>301</ymin>
            <xmax>408</xmax>
            <ymax>347</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>900</xmin>
            <ymin>301</ymin>
            <xmax>1048</xmax>
            <ymax>347</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1540</xmin>
            <ymin>301</ymin>
            <xmax>1688</xmax>
            <ymax>347</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>260</xmin>
            <ymin>661</ymin>
            <xmax>408</xmax>
            <ymax>707</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>900</xmin>
            <ymin>661</ymin>
            <xmax>1048</xmax>
            <ymax>707</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1540</xmin>
            <ymin>661</ymin>
            <xmax>1688</xmax>
            <ymax>707</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>260</xmin>
            <ymin>1021</ymin>
            <xmax>408</xmax>
            <ymax>1067</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>900</xmin>
            <ymin>1021</ymin>
            <xmax>1048</xmax>
            <ymax>1067</ymax>
        </bndbox>
    </object>
    <object>
        <name>4</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1540</xmin>
            <ymin>1021</ymin>
            <xmax>1688</xmax>
            <ymax>1067</ymax>
        </bndbox>
    </object>
</annotation>

Summary

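  By shrinking and tiling images, we can expand a dataset quickly. At the same time we obtain smaller targets, which lays the data groundwork for small-object detection.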


Reposted from juejin.im/post/7116819375008514061