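When preparing a YOLO dataset, the coordinates produced by the labeling tool may already be normalized. To recover the original pixel coordinates on the image, you have to convert them with a formula. Below I work through the relationship between original pixel coordinates and normalized coordinates, so that either form can easily be converted into the other.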
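In the original image, define:
image width and height (w, h); bounding box (xmin, ymin, xmax, ymax), the coordinates of the top-left and bottom-right corners
After normalization, define:
box width and height (w1, h1) and box center (x, y), each expressed as a fraction of the image size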
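The normalization formulas (the standard YOLO form, from which the inverse below follows) are:
x = (xmin + xmax) / (2.0 * w)
y = (ymin + ymax) / (2.0 * h)
w1 = (xmax - xmin) / w
h1 = (ymax - ymin) / h
Solving these equations together recovers the values before normalization:
xmin = int(w * (x - (w1 / 2.0)))
ymin = int(h * (y - (h1 / 2.0)))
xmax = int(w * (x + (w1 / 2.0)))
ymax = int(h * (y + (h1 / 2.0)))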
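The two directions of the conversion can be sketched as a pair of small Python helpers. The function names `to_yolo` and `to_pixels` are made up for this illustration; the arguments `w` and `h` are assumed to be the image width and height in pixels. Because the inverse truncates with `int()`, a round trip may be off by one pixel for boxes whose coordinates do not divide evenly.

```python
def to_yolo(xmin, ymin, xmax, ymax, w, h):
    """Pixel corners -> normalized (x, y, w1, h1)."""
    x = (xmin + xmax) / (2.0 * w)    # box center, fraction of image width
    y = (ymin + ymax) / (2.0 * h)    # box center, fraction of image height
    w1 = (xmax - xmin) / float(w)    # box width, fraction of image width
    h1 = (ymax - ymin) / float(h)    # box height, fraction of image height
    return x, y, w1, h1


def to_pixels(x, y, w1, h1, w, h):
    """Normalized (x, y, w1, h1) -> pixel corners, truncated to int."""
    xmin = int(w * (x - w1 / 2.0))
    ymin = int(h * (y - h1 / 2.0))
    xmax = int(w * (x + w1 / 2.0))
    ymax = int(h * (y + h1 / 2.0))
    return xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # Hypothetical box on a 1280 x 640 image
    box = to_yolo(0, 0, 640, 320, 1280, 640)
    print(box)
    print(to_pixels(*box, 1280, 640))
```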
Next, converting the txt annotations into xml files.
Here are a few lines from my txt annotation file.
The annotation format is (filename label x_min y_min x_max y_max):
slipper0-1000.jpg slipper 6 23 136 101
slipper0-112.jpg slipper 580 117 1118 922
slipper0-112.jpg slipper 99 155 618 985
sock0-1.jpg sock 1757 505 2957 1599
sock0-10.jpg sock 934 1100 2010 1368
wire0-1.jpg wire 1173 970 2258 3024
wire0-10.jpg wire 417 844 1726 1249
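Since one image can appear on several lines (slipper0-112.jpg above has two boxes), it can help to group the lines by file name before doing anything else. The following is a minimal sketch, assuming file names and labels contain no whitespace; `parse_annotations` is a hypothetical helper, not part of any library.

```python
def parse_annotations(lines):
    """Group boxes by image: {filename: [(label, xmin, ymin, xmax, ymax), ...]}."""
    boxes = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        file_name, label = parts[0], parts[1]
        xmin, ymin, xmax, ymax = (int(v) for v in parts[2:])
        boxes.setdefault(file_name, []).append((label, xmin, ymin, xmax, ymax))
    return boxes


# A few of the sample lines shown above
sample = [
    "slipper0-112.jpg slipper 580 117 1118 922",
    "slipper0-112.jpg slipper 99 155 618 985",
    "sock0-1.jpg sock 1757 505 2957 1599",
]
grouped = parse_annotations(sample)
for name, items in grouped.items():
    print(name, len(items))
```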
They need to be converted into xml format:
<annotation verified="yes">
<folder>train_small</folder>
<filename>slipper0-113.jpg</filename>
<path>/home/hesongze/PycharmProjects/keras-yolo2-master/raccoon_dataset-master/train_small/slipper0-113.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>1280</width>
<height>1280</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>slipper</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>79</xmin>
<ymin>351</ymin>
<xmax>1279</xmax>
<ymax>961</ymax>
</bndbox>
</object>
<object>
<name>slipper</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>79</xmin>
<ymin>131</ymin>
<xmax>1279</xmax>
<ymax>518</ymax>
</bndbox>
</object>
</annotation>
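Save this xml as the conversion template, anno.xml; it is used later.
Finally, set the paths in the program below and run it to generate the final labels in xml format, converting the txt format to xml:
This approach handles images that have more than one box, and the key steps are commented. The resulting xml files are written to the Annotations folder.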
import copy
import os

import cv2
from lxml.etree import ElementTree

# Change these to your own paths
template_file = '/home/hesongze/PycharmProjects/keras-yolo2-master/user_data/anno.xml'
target_dir = '/home/hesongze/PycharmProjects/keras-yolo2-master/user_data/Annotations/'
image_dir = '/home/hesongze/PycharmProjects/keras-yolo2-master/user_data/train/'  # image folder
train_file = '/home/hesongze/PycharmProjects/keras-yolo2-master/user_data/data_xxx.txt'  # txt file holding the annotations
path = '/home/hesongze/PycharmProjects/keras-yolo2-master/user_data/train/'  # prefix written into <path>

with open(train_file) as f:
    trainfiles = f.readlines()  # annotation lines: filename label x_min y_min x_max y_max

file_names = []
tree = ElementTree()

for line in trainfiles:
    trainFile = line.split()
    if len(trainFile) < 6:
        continue  # skip blank or malformed lines
    file_name = trainFile[0]
    print(file_name)
    label, xmin, ymin, xmax, ymax = trainFile[1:6]
    # Swap only the extension; replace('jpg', 'xml') would also hit 'jpg' inside the name
    xml_file = os.path.splitext(file_name)[0] + '.xml'

    # First time this image appears: start from the template.
    # In this dataset the boxes of one image are listed on separate lines.
    if file_name not in file_names:
        file_names.append(file_name)
        tree.parse(template_file)
        root = tree.getroot()
        root.find('filename').text = file_name
        # path
        root.find('path').text = path + file_name
        # size
        sz = root.find('size')
        im = cv2.imread(image_dir + file_name)  # read the image to get its size
        if im is None:
            raise FileNotFoundError(image_dir + file_name)
        sz.find('height').text = str(im.shape[0])
        sz.find('width').text = str(im.shape[1])
        sz.find('depth').text = str(im.shape[2])
        # object: keep only the template's first <object> (the template shown
        # above has two) so no stale box ends up in the output
        for extra in root.findall('object')[1:]:
            root.remove(extra)
        obj = root.find('object')
    # Image seen before: append another <object> to its existing xml
    else:
        tree.parse(target_dir + xml_file)
        root = tree.getroot()
        obj = copy.deepcopy(root.find('object'))  # deep copy, so the first box is not overwritten
        root.append(obj)

    obj.find('name').text = label
    bb = obj.find('bndbox')
    bb.find('xmin').text = xmin
    bb.find('ymin').text = ymin
    bb.find('xmax').text = xmax
    bb.find('ymax').text = ymax
    tree.write(target_dir + xml_file, encoding='utf-8')
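To check the generated files, the boxes can be read back and compared with the txt lines. This is a minimal sketch using the standard library's `xml.etree.ElementTree` rather than lxml; `read_boxes` is a hypothetical helper, and the embedded XML is a trimmed example built from two of the sample lines, not real program output.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_text):
    """Return [(label, xmin, ymin, xmax, ymax), ...] from a VOC-style XML string."""
    root = ET.fromstring(xml_text)
    result = []
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        coords = tuple(int(bb.find(tag).text)
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        result.append((obj.find("name").text,) + coords)
    return result


# Trimmed example annotation with two boxes for one image
example = """<annotation>
  <filename>slipper0-112.jpg</filename>
  <object><name>slipper</name><bndbox>
    <xmin>580</xmin><ymin>117</ymin><xmax>1118</xmax><ymax>922</ymax>
  </bndbox></object>
  <object><name>slipper</name><bndbox>
    <xmin>99</xmin><ymin>155</ymin><xmax>618</xmax><ymax>985</ymax>
  </bndbox></object>
</annotation>"""

for box in read_boxes(example):
    print(box)
```

For a file on disk, `ET.parse(path).getroot()` can take the place of `ET.fromstring(xml_text)`.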