Thoroughly Understanding the Sparse R-CNN Architecture from an Engineering Perspective

Main

A small set of learnable proposals (100) replaces the large number of proposals produced by the RPN module.

Method

Input

one image, proposal box (learnable), proposal features (learnable)

Backbone

FPN based on ResNet (ResNet50), Channel = 256

Learnable proposal box

Shape: [N, 4], where N is the number of boxes (100) and 4 is the number of box parameters: normalized center coordinates (ranging from 0 to 1), height, and width.
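To make the box parameterization concrete, here is a minimal sketch in plain Python (lists instead of tensors, so nothing is actually learned here). It assumes the parameter order (cx, cy, w, h) and that every box starts out covering the whole image; both are assumptions for illustration, and the names `init_proposal_boxes` and `to_absolute_xyxy` are hypothetical.

```python
def init_proposal_boxes(num_boxes=100):
    """Return [N, 4] boxes as normalized (cx, cy, w, h).

    Assumption: each box starts as the whole image (center 0.5, size 1.0).
    In a real model these values would be trainable parameters.
    """
    return [[0.5, 0.5, 1.0, 1.0] for _ in range(num_boxes)]


def to_absolute_xyxy(box, image_width, image_height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    x1 = (cx - w / 2) * image_width
    y1 = (cy - h / 2) * image_height
    x2 = (cx + w / 2) * image_width
    y2 = (cy + h / 2) * image_height
    return [x1, y1, x2, y2]


boxes = init_proposal_boxes()  # shape [N, 4] with N = 100
first = to_absolute_xyxy(boxes[0], image_width=640, image_height=480)
```

The conversion to pixel corners is what an RoI pooling step would typically need, since the learnable boxes themselves live in normalized coordinates.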

Learnable proposal feature

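Note that a box only provides localization; details such as object pose and shape are lost. The proposal feature is meant to carry that information.
Shape: [N, d], where d is the feature dimension (256).
The number of proposal boxes equals the number of proposal features (both are N); each box is paired with exactly one feature.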
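The pairing between boxes and features can be sketched as follows. This is only an illustration of the shapes: the Gaussian initialization and the variable names are assumptions, and plain lists stand in for trainable tensors.

```python
import random

N, D = 100, 256  # number of proposals, feature dimension
rng = random.Random(0)

# [N, 4] boxes as normalized (cx, cy, w, h); whole-image start is an assumption.
proposal_boxes = [[0.5, 0.5, 1.0, 1.0] for _ in range(N)]

# [N, D] features; random Gaussian initialization is an assumption.
proposal_features = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]

# The i-th box and the i-th feature describe the same proposal.
proposals = list(zip(proposal_boxes, proposal_features))
```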

Dynamic instance interactive head

Before: RoI features are obtained through RoIAlign; proposal boxes, proposal features, and RoI features are in one-to-one correspondence.
Structure:
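Two 1*1 convolutions (each followed by a ReLU activation): the proposal feature generates the parameters of the two convolutions; the RoI feature is then passed through the two convolutions with these parameters to produce the object feature.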
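A minimal sketch of this interaction for a single proposal, in plain Python. A 1*1 convolution over the S*S locations of an RoI feature is just a per-location matrix product over channels, which is what the code relies on. The generator matrices `w_gen1` and `w_gen2`, the hidden width `d_dyn`, and the tiny dimensions are assumptions for illustration; a real head may also add normalization and a final projection, which are left out here.

```python
import random


def matmul(a, b):
    """Multiply an [m, k] matrix by a [k, n] matrix (nested lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def dynamic_interaction(proposal_feature, roi_feature, w_gen1, w_gen2, d, d_dyn):
    """proposal_feature: [d]; roi_feature: [S*S, d].

    w_gen1: [d, d * d_dyn] and w_gen2: [d, d_dyn * d] are hypothetical linear
    layers that turn the proposal feature into the two kernels.
    """
    # Generate the parameters of both 1*1 convolutions from the proposal feature.
    flat1 = matmul([proposal_feature], w_gen1)[0]
    flat2 = matmul([proposal_feature], w_gen2)[0]
    kernel1 = [flat1[i * d_dyn:(i + 1) * d_dyn] for i in range(d)]  # [d, d_dyn]
    kernel2 = [flat2[i * d:(i + 1) * d] for i in range(d_dyn)]      # [d_dyn, d]
    # Apply the two convolutions to the RoI feature, each followed by ReLU.
    hidden = relu(matmul(roi_feature, kernel1))  # [S*S, d_dyn]
    return relu(matmul(hidden, kernel2))         # [S*S, d]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, d_dyn, locations = 8, 4, 9  # small demo sizes; the text uses d = 256
proposal_feature = random_matrix(rng, 1, d)[0]
roi_feature = random_matrix(rng, locations, d)
object_feature = dynamic_interaction(
    proposal_feature, roi_feature,
    random_matrix(rng, d, d * d_dyn), random_matrix(rng, d, d_dyn * d),
    d, d_dyn,
)
```

Because the kernels are computed from the proposal feature, every proposal filters its own RoI feature with its own weights, rather than sharing one fixed set of weights across all proposals.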
Iteration structure: The newly generated object boxes and object features serve as the proposal boxes and proposal features of the next stage.
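The iteration can be sketched as a simple loop in which each stage consumes the output of the previous one. Here `toy_head` is a hypothetical stand-in for a real interactive head (it only shrinks the boxes), and the choice of six stages is an assumption for the demo.

```python
def run_stages(heads, image_features, proposal_boxes, proposal_features):
    """Feed boxes and features through the heads, one stage after another."""
    boxes, features = proposal_boxes, proposal_features
    outputs = []
    for head in heads:
        # Each stage refines the boxes and features it receives.
        class_logits, boxes, features = head(image_features, boxes, features)
        outputs.append((class_logits, boxes))
    return outputs


def toy_head(image_features, boxes, features):
    """Hypothetical head: shrink each box by 10% and pass features through."""
    new_boxes = [[cx, cy, w * 0.9, h * 0.9] for cx, cy, w, h in boxes]
    class_logits = [[0.0] for _ in boxes]
    return class_logits, new_boxes, features


outputs = run_stages(
    heads=[toy_head] * 6,  # six stages is an assumption for this demo
    image_features=None,   # unused by the toy head
    proposal_boxes=[[0.5, 0.5, 1.0, 1.0]],
    proposal_features=[[0.0] * 4],
)
```

Keeping the per-stage outputs makes it possible to supervise every stage, not only the last one.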

Loss

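The total loss is a weighted sum of three terms: L = λ_cls · L_cls + λ_L1 · L_L1 + λ_giou · L_giou. L_cls is the focal loss; L_L1 and L_giou are the L1 loss and the GIoU loss, respectively.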
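The three terms can be sketched for a single matched prediction as below. This is an illustration, not the training code: the binary form of the focal loss, the (x1, y1, x2, y2) box format, the defaults alpha = 0.25 and gamma = 2, and the weights 2, 5, 2 are all assumptions, and how predictions are matched to ground-truth boxes is not covered.

```python
import math


def focal_loss(prob, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability and a 0/1 target."""
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def l1_loss(pred, gt):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def giou_loss(pred, gt):
    """1 - GIoU for two valid (x1, y1, x2, y2) boxes."""
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = inter_w * inter_h
    union = area_p + area_g - inter
    # Smallest axis-aligned box enclosing both boxes.
    enclose = ((max(pred[2], gt[2]) - min(pred[0], gt[0]))
               * (max(pred[3], gt[3]) - min(pred[1], gt[1])))
    giou = inter / union - (enclose - union) / enclose
    return 1.0 - giou


def total_loss(prob, target, pred_box, gt_box, w_cls=2.0, w_l1=5.0, w_giou=2.0):
    """Weighted sum of the three terms; the default weights are assumptions."""
    return (w_cls * focal_loss(prob, target)
            + w_l1 * l1_loss(pred_box, gt_box)
            + w_giou * giou_loss(pred_box, gt_box))
```

Unlike plain IoU, the GIoU term still gives a useful signal when the two boxes do not overlap, because the enclosing-box penalty grows as they move apart.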

Reposted from blog.csdn.net/qq_43114108/article/details/127173686