The article "How to use the Mask R-CNN model for image instance segmentation?" covered balloon segmentation with Mask R-CNN, and the official repo has matching code. In the spirit of practice, I took this dataset for further exercise; the most troublesome part is still how to obtain the annotation data. The open-source Mask R-CNN code is at Mask R-CNN - Inspect Balloon Training Data:
https://link.zhihu.com/?target=https%3A//github.com/matterport/Mask_RCNN/blob/v2.1/samples/balloon/inspect_balloon_data.ipynb
Since much of this is lifted straight from the Mask R-CNN repo, I have not dug into the details; it works, and that is enough, so some of it may look clumsy…
1 Preparing the training set
Dataset download page: balloon_dataset.zip
This case is more general: in competitions the training set is prepared by the organizers, but in real-world training the masks are usually not given, only the annotation points, e.g.:
The annotation data is stored in a JSON file, for example:
{'10464445726_6f1e3bbe6a_k.jpg712154': {
    'base64_img_data': '',
    'file_attributes': {},
    'filename': '10464445726_6f1e3bbe6a_k.jpg',
    'fileref': '',
    'regions': {'0': {
        'region_attributes': {},
        'shape_attributes': {
            'all_points_x': [1757, 1772, 1787, 1780, 1764],
            'all_points_y': [867, 913, 986, 1104, 1170],
            'name': 'polygon'}}}}}
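The structure above is a VIA-style export: the outer key is the filename plus the file size, and each entry in regions holds one polygon per object instance. A minimal parsing sketch (the dict below just mirrors the JSON excerpt above):

```python
# A VIA-style annotation entry, mirroring the JSON excerpt above
annotation = {
    '10464445726_6f1e3bbe6a_k.jpg712154': {
        'filename': '10464445726_6f1e3bbe6a_k.jpg',
        'regions': {'0': {
            'region_attributes': {},
            'shape_attributes': {
                'all_points_x': [1757, 1772, 1787, 1780, 1764],
                'all_points_y': [867, 913, 986, 1104, 1170],
                'name': 'polygon'}}}}}

# Each region contributes one polygon, i.e. one object instance
for a in annotation.values():
    polygons = [r['shape_attributes'] for r in a['regions'].values()]
    print(a['filename'], len(polygons), polygons[0]['name'])
# prints: 10464445726_6f1e3bbe6a_k.jpg 1 polygon
```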
all_points_x and all_points_y are the (x, y) coordinates of the mask's vertex points; each object's mask is built from many such points:
import os
import numpy as np
import skimage.draw
from skimage import io
from skimage.transform import resize
from mrcnn.utils import extract_bboxes

def get_mask(a, dataset_dir):
    image_path = os.path.join(dataset_dir, a['filename'])
    image = io.imread(image_path)
    height, width = image.shape[:2]
    polygons = [r['shape_attributes'] for r in a['regions'].values()]
    # One binary mask channel per annotated object
    mask = np.zeros([height, width, len(polygons)], dtype=np.uint8)
    for i, p in enumerate(polygons):
        # Get indexes of pixels inside the polygon and set them to 1
        rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
        mask[rr, cc, i] = 1
    # e.g. mask is (685, 1024, 1) here for a single balloon
    # Binarize the mask (np.bool was removed in newer NumPy; use builtin bool),
    # and assign class id 1 (balloon) to every instance
    mask = mask.astype(bool)
    class_ids = np.ones([mask.shape[-1]], dtype=np.int32)
    # Bounding box of each mask, computed on the 128x128 resized mask
    boxes = extract_bboxes(resize(mask, (128, 128), mode='constant',
                                  preserve_range=True))
    unique_class_ids = np.unique(class_ids)
    mask_area = [np.sum(mask[:, :, np.where(class_ids == i)[0]])
                 for i in unique_class_ids]
    top_ids = [v[0] for v in sorted(zip(unique_class_ids, mask_area),
                                    key=lambda r: r[1], reverse=True) if v[1] > 0]
    class_id = top_ids[0]
    # Pull masks of instances belonging to the same class
    m = mask[:, :, np.where(class_ids == class_id)[0]]
    m = np.sum(m * np.arange(1, m.shape[-1] + 1), -1)
    return m, image, height, width, class_ids, boxes
polygon records the (x, y) vertex coordinates of one mask, and skimage.draw.polygon fills the enclosed region;
after mask[rr, cc, i] = 1, mask is a 0/1 array of shape (m, n, x), where x is the number of possible objects;
mask.astype(bool) turns the 0/1 array into a True/False array;
extract_bboxes() deserves emphasis: from the mask positions it finds each mask's overall bounding box coordinates; given 5 objects, it returns 5 boxes, each as (y1, x1, y2, x2);
np.sum() is the dimension-reduction step, collapsing (m, n, 1) down to (m, n).
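The mask-to-box and dimension-collapse steps can be checked on a toy example. extract_bboxes itself lives in mrcnn.utils; the box below is computed with a NumPy-only stand-in for it, following the same (y1, x1, y2, x2) convention:

```python
import numpy as np

# Toy boolean mask: one object occupying rows 2..7, cols 3..8
# (pretend skimage.draw.polygon filled this region)
mask = np.zeros((10, 10, 1), dtype=bool)
mask[2:8, 3:9, 0] = True

# NumPy-only stand-in for mrcnn.utils.extract_bboxes on a single instance,
# using Mask R-CNN's (y1, x1, y2, x2) order
ys, xs = np.where(mask[:, :, 0])
box = (int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1)
print(box)  # (2, 3, 8, 9)

# Collapse (H, W, num_instances) into a labeled (H, W) map, as get_mask does:
# instance i gets pixel value i + 1
labeled = np.sum(mask * np.arange(1, mask.shape[-1] + 1), -1)
print(labeled.shape)  # (10, 10)
```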
So the final Y_train has the same data format as in the previous article:
array([[[[False],
         [False],
         [False],
         ...,
         [False],
         [False],
         [False]],

        [[False],
         [False],
         [False],
         ...,
         [False],
         ...
2 Model prediction
# load_model and the custom mean_iou metric come from the Keras model
# trained in the previous article
from keras.models import load_model

model = load_model(model_name, custom_objects={'mean_iou': mean_iou})
preds_train = model.predict(X_train[:int(X_train.shape[0]*0.9)], verbose=1)
preds_val = model.predict(X_train[int(X_train.shape[0]*0.9):], verbose=1)
preds_test = model.predict(X_test, verbose=1)
Here the training set is split 9:1 into a training subset and a validation subset, with a separate test set on top.
Input shapes:
X_train (670, 128, 128, 3)
Y_train (670, 128, 128, 1)
X_test (65, 128, 128, 3)
Output shapes (each pixel is a probability in [0, 1]):
preds_train (603, 128, 128, 1)
preds_val (67, 128, 128, 1)
preds_test (65, 128, 128, 1)
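The 603/67 counts above follow directly from the 9:1 slice; a quick check using the shapes from this post:

```python
# Train/validation split used above: first 90% / last 10% of X_train
n_total = 670                   # X_train.shape[0] in this post
split = int(n_total * 0.9)      # first 90% goes to training
print(split, n_total - split)   # 603 67
```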
3 Plotting functions
This part is ported from the Mask R-CNN repo:
def display_instances(image, boxes, masks, class_names,
                      scores=None, title="",
                      figsize=(16, 16), ax=None,
                      show_mask=True, show_bbox=True,
                      colors=None, captions=None):
It takes the image array image, boxes holding each instance's bounding box, masks holding the image's masks, and class_names, the label names for each image. The figure below is 128×128 pixels and quite blurry, so bear with it…
The random color generator random_colors:
import colorsys
import random

def random_colors(N, bright=True):
    """
    Generate random colors.
    To get visually distinct colors, generate them in HSV space then
    convert to RGB.
    """
    brightness = 1.0 if bright else 0.7
    hsv = [(i / N, 1, brightness) for i in range(N)]
    colors = list(map(lambda c: colorsys.hsv_to_rgb(*c), hsv))
    random.shuffle(colors)
    return colors
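The HSV trick is why the colors come out distinct: hues are spaced evenly around the color wheel at full saturation and brightness, then converted to RGB. For N = 2 (leaving out the shuffle):

```python
import colorsys

# Two evenly spaced hues (0 and 0.5) at full saturation/value:
# opposite points on the color wheel, i.e. maximally distinct colors
hsv = [(i / 2, 1, 1.0) for i in range(2)]
rgb = [colorsys.hsv_to_rgb(*c) for c in hsv]
print(rgb)  # [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]  (red and cyan)
```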
Also, generally speaking, a mask of shape (m, n) or (m, n, 1) can both be plotted:
from skimage.io import imshow
import matplotlib.pyplot as plt

imshow(mask)
plt.show()
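One caveat: if the plotting call complains about a shape like (m, n, 1), np.squeeze drops the trailing singleton channel first; a small sketch:

```python
import numpy as np

# A (m, n, 1) mask; some imshow implementations reject the trailing channel
mask = np.zeros((128, 128, 1), dtype=bool)
mask[32:96, 32:96, 0] = True

mask2d = np.squeeze(mask)  # drop the singleton channel axis
print(mask.shape, mask2d.shape)  # (128, 128, 1) (128, 128)
```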