DSOD (Deeply Supervised Object Detector): Training on Your Own Data from Scratch, Without ImageNet Pre-trained Weights

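0. Purpose

Even a good memory is no match for written notes.

Environment: Ubuntu 14 + CUDA 8.0 + cuDNN 5.0 + GPU (K40). A GPU with 4 GB of memory cannot run either training or testing (tested on a GTX 1050 Ti).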

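1. Download and install DSOD
1) Install SSD first

DSOD is a modification of SSD, so SSD must be installed first. The original SSD is built on Caffe; for installation details, see my earlier blog post, "Caffe installation on ubuntu14 + CUDA8.0 + CUDNN5.0 (Anaconda-based environment)".

2) Install DSOD

Installing DSOD is fairly simple; see the official installation instructions.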

First, git clone https://github.com/szq0214/DSOD.git in your SSD directory.
Then:
i) Create a subfolder dsod under examples/, and add the files DSOD300_pascal.py, 
   DSOD300_pascal++.py, DSOD300_coco.py, score_DSOD300_pascal.py and 
   DSOD300_detection_demo.py to the folder examples/dsod/.

ii) Create a subfolder grp_dsod under examples/, and add the files GRP_DSOD320_pascal.py and score_GRP_DSOD320_pascal.py to the folder examples/grp_dsod/.

iii) Replace the file model_libs.py in the folder python/caffe/ with the one from the DSOD repository.

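2. Training on your own dataset

Note: for ease of description, assume DSOD is located at /home/XXX/DSOD/caffe.

The data is located at /home/XXX/Data.

1) Prepare your own training set

So that the dataset can be used across different frameworks, it is prepared here in a VOC-like layout. The preparation process is not described in detail.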
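
As one hedged illustration of the step skipped above: a VOC-like dataset keeps its images in JPEGImages/, its XML labels in Annotations/, and the image IDs of each split in ImageSets/Main/. The sketch below writes trainval.txt and test.txt from the annotation file names. The function name write_split_files, the 80/20 split, and the fixed seed are assumptions made for illustration; they are not part of DSOD or SSD.

```python
import os
import random


def write_split_files(data_dir, test_ratio=0.2, seed=0):
    """Write ImageSets/Main/trainval.txt and test.txt for a VOC-like dataset.

    Image IDs are taken from the XML file names in Annotations/.
    test_ratio and seed are illustrative assumptions.
    """
    ann_dir = os.path.join(data_dir, "Annotations")
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(ann_dir)
                 if f.endswith(".xml"))
    random.Random(seed).shuffle(ids)

    num_test = int(len(ids) * test_ratio)
    splits = {"test": sorted(ids[:num_test]),
              "trainval": sorted(ids[num_test:])}

    out_dir = os.path.join(data_dir, "ImageSets", "Main")
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    for name, split_ids in splits.items():
        # One image ID per line, without file extension.
        with open(os.path.join(out_dir, name + ".txt"), "w") as f:
            f.write("".join(i + "\n" for i in split_ids))
    return splits
```

Calling write_split_files("/home/XXX/Data/myData") would then produce the two ID lists that create_list.sh reads in the next step.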

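2) Preparation before training

Copy the VOC0712 folder under data/ in caffe and rename the copy after your own dataset, e.g. myData.

i) Modify create_list.sh to create the data lists used for training and testing. On success, trainval.txt, test.txt, and test_name_size.txt are generated under data/myData. A modified version of the file is provided here.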

#!/bin/bash

root_dir=$HOME/Data
sub_dir=ImageSets/Main
 
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo $bash_dir
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in myData                #### your Data
  do
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt

    img_file=$bash_dir/$dataset"_img.txt"
    # copy dataset_file to img_file and rename 
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.jpg/g" $img_file

    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file

    paste -d' ' $img_file $label_file >> $dst_file

    rm -f $label_file
    rm -f $img_file
  done

  # Generate image name and size information.
  if [ $dataset == "test" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi

  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' > $rand_file
    mv $rand_file $dst_file
  fi
done
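
To make the result of the script concrete, the sketch below reproduces in Python what the sed and paste commands build for each image ID: one line containing the image path and the annotation path, both relative to root_dir. The helper name make_list_line is hypothetical and only mirrors the string manipulation above.

```python
def make_list_line(dataset_name, image_id):
    """Build one line of trainval.txt/test.txt: image path, space, label path."""
    img = "{}/JPEGImages/{}.jpg".format(dataset_name, image_id)
    label = "{}/Annotations/{}.xml".format(dataset_name, image_id)
    return "{} {}".format(img, label)


print(make_list_line("myData", "000520"))
# myData/JPEGImages/000520.jpg myData/Annotations/000520.xml
```

Note that the script appends .jpg unconditionally, so images stored with another extension would need the corresponding sed line changed.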

ii) Modify labelmap_voc.prototxt

Modify the class information in the file. Class 0 is the background class and does not need to be changed.
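
For datasets with many classes, the file can be generated rather than edited by hand. The following is a minimal sketch that assumes the item layout used by the VOC labelmap shipped with SSD (name, label, display_name, with label 0 reserved for background); the function name build_labelmap and the class names are hypothetical.

```python
def build_labelmap(class_names):
    """Return labelmap prototxt text; label 0 is reserved for background."""
    items = [("none_of_the_above", 0, "background")]
    items += [(name, i, name) for i, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display_name in items:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (name, label, display_name))
    return "".join(blocks)


classes = ["cat", "dog"]  # hypothetical class names
print(build_labelmap(classes))
num_classes = len(classes) + 1  # your classes plus background
```

The value len(classes) + 1 is the number that num_classes in the training script has to match.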

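iii) Modify create_data.sh to create the data (in LMDB format) that caffe uses for training and testing

On success, the LMDB files are generated under /home/XXX/Data/myData (in the lmdb folder, which contains myData_trainval_lmdb and myData_test_lmdb). Two symbolic links pointing to myData_trainval_lmdb and myData_test_lmdb are also created under examples/myData in caffe.

iv) Modify the training/testing files

Copy DSOD300_pascal.py, score_DSOD300_pascal.py, and DSOD300_detection_demo.py from caffe/examples/dsod/ to caffe/examples/myData, then modify DSOD300_pascal.py for training.

The main changes to DSOD300_pascal.py are: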

# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "/home/XXX/Data/myData/lmdb/myData_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "/home/XXX/Data/myData/lmdb/myData_test_lmdb"

# The name of the model. Modify it if you want.
# DSOD300_myData_DSOD300_300x300
model_name = "DSOD300_myData_{}".format(job_name)

# Directory which stores the model .prototxt file.
# models/DSOD300/myData/DSOD300_300x300
save_dir = "models/DSOD300/myData/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/DSOD300/myData/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/DSOD300/myData/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/Data/myData/results/{}/Main".format(os.environ['HOME'], job_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/myData/test_name_size.txt"

# Stores LabelMapItem.
label_map_file = "data/myData/labelmap_voc.prototxt"

# MultiBoxLoss parameters.
num_classes = 21               ## class_num + 1(background)

# Solver parameters.
# Defining which GPUs to use.

## choose one mode for your device

####################### for multi-GPU mode #############
#gpus = "0,1,2,3,4,5,6,7"  
#gpulist = gpus.split(",")
#num_gpus = len(gpulist)
########################################################



####################### for single-GPU mode ############

#gpus = "0"
#gpulist = gpus.split(",")
#num_gpus = len(gpulist)

########################################################


####################### for CPU mode ###################

num_gpus = 0          ## this is CPU mode

########################################################


# Divide the mini-batch to different GPUs.
batch_size = 4             ## modify according to your device (GPU or CPU)
accum_batch_size = 16      ## modify according to your device (GPU or CPU)

# Evaluate on whole test set.
num_test_image = 400       ## the number of your test images
test_batch_size = 2
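
The values above interact: a small batch_size that fits in memory is compensated by accumulating gradients over several passes, and the number of test iterations has to cover the whole test set. The sketch below shows the usual arithmetic for these quantities. It is an assumption about how SSD-style training scripts derive iter_size and test_iter, not a copy of the DSOD script, and the function name derive_solver_counts is hypothetical.

```python
import math


def derive_solver_counts(batch_size, accum_batch_size, num_gpus,
                         num_test_image, test_batch_size):
    """Return (batch_size_per_device, iter_size, test_iter)."""
    devices = max(num_gpus, 1)  # CPU mode behaves like a single device
    batch_size_per_device = int(math.ceil(float(batch_size) / devices))
    # Number of forward/backward passes accumulated before one weight update.
    iter_size = int(math.ceil(float(accum_batch_size) /
                              (batch_size_per_device * devices)))
    # Number of test batches needed to cover every test image once.
    test_iter = int(math.ceil(float(num_test_image) / test_batch_size))
    return batch_size_per_device, iter_size, test_iter


print(derive_solver_counts(4, 16, 0, 400, 2))  # (4, 4, 200)
```

With the settings shown here, each weight update would accumulate 4 passes of 4 images, and one evaluation would run 200 test batches.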

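Note: if training prints output saying that no objects can be found, the solver parameters need to be modified as follows (see test_initialization).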

solver_param = {
    # Train parameters
    'base_lr': 10 * base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "multistep",
    'stepvalue': [20000, 40000, 60000, 80000, 100000],
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 100000,
    'snapshot': 2000,
    'display': 20,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 2000,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,               ## modify this to False
    }

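Training can now begin. (Roughly, the training parameters are: learning rate = 0.1, multiplied by 0.1 every 20,000 iterations; a snapshot every 2,000 iterations; a maximum of 100,000 iterations; and a test pass every 2,000 iterations. See the code for details.)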
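
The schedule described above can be sketched as a small function. It assumes the "multistep" policy multiplies the rate by gamma each time the iteration count reaches one of the stepvalue entries; the function name multistep_lr is hypothetical, and the starting rate of 0.1 is taken from the description above.

```python
def multistep_lr(iteration, base_lr, gamma, stepvalues):
    """Learning rate under a multistep policy at a given iteration."""
    # Count how many step boundaries have been reached so far.
    steps_passed = sum(1 for s in stepvalues if iteration >= s)
    return base_lr * gamma ** steps_passed


stepvalues = [20000, 40000, 60000, 80000, 100000]
for it in (0, 20000, 40000, 99999):
    print(it, multistep_lr(it, 0.1, 0.1, stepvalues))
```

Printing the rate at a few iterations like this is a quick way to check a modified stepvalue list before committing to a long training run.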

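3) Testing

i) Modify score_DSOD300_pascal.py

The changes are similar to those made to the training file DSOD300_pascal.py, so they are not covered here.

ii) Testing a single image with DSOD300_detection_demo.py

The parts to modify are: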

# load PASCAL VOC labels
labelmap_file = 'data/myData/labelmap_voc.prototxt'




#Load the net in the test phase for inference, and configure input preprocessing.
model_def = 'models/DSOD300/myData/DSOD300_300x300/deploy.prototxt'
model_weights = 'models/DSOD300/myData/DSOD300_300x300/DSOD300_myData_DSOD300_300x300_iter_68000.caffemodel'

#Load an image.
img = "/home/XXX/Data/myData/test/000520.jpg"
image = caffe.io.load_image(img)
plt.imshow(image)




## optional
# you can adjust the threshold for your data

# Get detections with confidence higher than 0.6.
top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.6]
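
As a self-contained illustration of the thresholding step, the sketch below keeps only the confident detections and scales their boxes to pixel coordinates. It assumes the boxes are (xmin, ymin, xmax, ymax) tuples normalized to [0, 1]; the function name filter_detections and the sample scores and boxes are made up for illustration.

```python
def filter_detections(det_conf, det_boxes, image_width, image_height,
                      threshold=0.6):
    """Keep boxes with confidence >= threshold and scale them to pixels.

    det_boxes holds (xmin, ymin, xmax, ymax) normalized to [0, 1],
    which is an assumption about the detector output.
    """
    top_indices = [i for i, conf in enumerate(det_conf) if conf >= threshold]
    results = []
    for i in top_indices:
        xmin, ymin, xmax, ymax = det_boxes[i]
        results.append((det_conf[i],
                        int(round(xmin * image_width)),
                        int(round(ymin * image_height)),
                        int(round(xmax * image_width)),
                        int(round(ymax * image_height))))
    return results


conf = [0.95, 0.30, 0.61]  # made-up scores for illustration
boxes = [(0.1, 0.2, 0.5, 0.6), (0.0, 0.0, 1.0, 1.0), (0.25, 0.25, 0.75, 0.75)]
print(filter_detections(conf, boxes, 300, 300))
```

Lowering the threshold returns more boxes at the cost of more false positives, so the value is worth tuning on a few images from your own data.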

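iii) Automatically testing all images in a given directory

Make the changes based on DSOD300_detection_demo.py. (OpenCV is used here to read the images.)

Note: OpenCV reads images in BGR channel order, which must be converted to RGB. Also, caffe.io.load_image returns values in [0, 1], whereas OpenCV returns values in [0, 255], so the pixel values must be rescaled.

The main code changes are: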

image_Ori = cv2.imread(img)
image = image_Ori.copy()   ## just for calculating 
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = image.astype("float") / 255.



..........

## draw the rectangle on the original image
cv2.rectangle(image_Ori, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
# parameters: image, top-left point, bottom-right point, color (BGR), line thickness

cv2.putText(image_Ori, display_txt, (xmin, ymin), cv2.FONT_HERSHEY_PLAIN, 2, (255, 0, 0), 2)
# parameters: image, text, text origin, font, font scale, color (BGR), line thickness

...........

cv2.imwrite(tempsaveImageDir, image_Ori)
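
The snippet above handles one image; the loop over a directory is not shown. One possible way to drive it, using only the standard library, is sketched below. The function name list_image_jobs, the accepted extensions, and the example paths are assumptions for illustration.

```python
import os


def list_image_jobs(input_dir, output_dir,
                    extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted (input_path, output_path) pairs for images in input_dir."""
    jobs = []
    for name in sorted(os.listdir(input_dir)):
        # Case-insensitive match on the file extension.
        if name.lower().endswith(extensions):
            jobs.append((os.path.join(input_dir, name),
                         os.path.join(output_dir, name)))
    return jobs


# Hypothetical usage: run the detection code above once per image.
# for img, tempsaveImageDir in list_image_jobs("/path/to/images", "/path/to/results"):
#     image_Ori = cv2.imread(img)
#     ... detect, draw, then cv2.imwrite(tempsaveImageDir, image_Ori)
```

Writing each result under the same file name in a separate output directory keeps the original images untouched.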

3. Understanding the paper

See the next blog post for details.

There may be some mistakes in this blog. So, any suggestions and comments are welcome!

【Reference】

[1] https://github.com/weiliu89/caffe/tree/ssd

[2] https://github.com/szq0214/DSOD

[3] http://openaccess.thecvf.com/content_ICCV_2017/papers/Shen_DSOD_Learning_Deeply_ICCV_2017_paper.pdf