-
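0. Purpose
Even a good memory is no match for written notes.
Environment: Ubuntu 14 + CUDA 8.0 + cuDNN 5.0 + GPU (K40). A GPU with 4 GB of memory can run neither training nor testing (verified on a GTX 1050 Ti).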
-
1. Download and install DSOD
-
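1) Install SSD first
DSOD is a modification of SSD, so SSD has to be installed first. The original SSD is built on Caffe; for installation details, see my blog post "Installing Caffe on Ubuntu 14 + CUDA 8.0 + cuDNN 5.0 (Anaconda-based environment)".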
-
2) Install DSOD
DSOD is fairly easy to install. Follow the official instructions:
First, run git clone https://github.com/szq0214/DSOD.git in your SSD directory.
Then:
i) Create a subfolder dsod under examples/, add files DSOD300_pascal.py,
DSOD300_pascal++.py, DSOD300_coco.py, score_DSOD300_pascal.py and
DSOD300_detection_demo.py to the folder examples/dsod/.
ii) Create a subfolder grp_dsod under examples/, add files GRP_DSOD320_pascal.py and score_GRP_DSOD320_pascal.py to the folder examples/grp_dsod/.
iii) Replace the file model_libs.py in the folder python/caffe/ with the one from the DSOD repository.
-
2. Training on your own dataset
Note: for ease of description, assume that DSOD is located at /home/XXX/DSOD/caffe
and that the data is located at /home/XXX/Data.
-
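1) Build your own training set
So that the data can be used across different frameworks, the dataset is prepared in a VOC-like layout. The preparation steps are not described in detail here.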
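Since the preparation steps are not spelled out, here is a minimal sketch of one part of them: writing ImageSets/Main/trainval.txt and test.txt from the image IDs found in JPEGImages. The function name write_image_sets, the 80/20 split and the fixed seed are assumptions made for illustration; they are not part of the original workflow.

```python
import os
import random


def write_image_sets(data_root, test_ratio=0.2, seed=0):
    """Split the image IDs under JPEGImages into trainval and test ID lists."""
    jpeg_dir = os.path.join(data_root, "JPEGImages")
    # An image ID is the file name without its .jpg extension.
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(jpeg_dir)
                 if f.lower().endswith(".jpg"))
    # Shuffle with a fixed seed so the split is reproducible (assumption).
    random.Random(seed).shuffle(ids)
    num_test = int(len(ids) * test_ratio)
    splits = {"test": sorted(ids[:num_test]),
              "trainval": sorted(ids[num_test:])}

    out_dir = os.path.join(data_root, "ImageSets", "Main")
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    # One ID per line, as in the VOC ImageSets/Main files.
    for name, split_ids in splits.items():
        with open(os.path.join(out_dir, name + ".txt"), "w") as f:
            f.write("".join(i + "\n" for i in split_ids))
    return splits
```

It would be called as write_image_sets("/home/XXX/Data/myData"), assuming the images are already in JPEGImages and the matching XML files in Annotations.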
-
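2) Preparation before training
Make a copy of data/VOC0712 in the caffe directory and rename the copy after your own dataset, e.g. myData.
i) Modify create_list.sh to generate the file lists used for training and testing. On success, trainval.txt, test.txt and test_name_size.txt are generated under data/myData. A modified version of the script is given here: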
#!/bin/bash
root_dir=$HOME/Data
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo $bash_dir
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in myData  #### your dataset name
  do
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt
    img_file=$bash_dir/$dataset"_img.txt"
    # Copy dataset_file to img_file and turn each ID into an image path.
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.jpg/g" $img_file
    # Do the same for the annotation paths.
    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file
    # Join image path and annotation path on one line.
    paste -d' ' $img_file $label_file >> $dst_file
    rm -f $label_file
    rm -f $img_file
  done
  # Generate image name and size information.
  if [ $dataset == "test" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi
  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' > $rand_file
    mv $rand_file $dst_file
  fi
done
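To make the result of the sed and paste steps concrete, the following sketch builds the same kind of list lines in Python. It only illustrates the line format; the helper name make_list_lines is hypothetical and the shell script above remains the thing to run.

```python
def make_list_lines(image_ids, name="myData"):
    """Pair each image path with its annotation path, relative to root_dir."""
    lines = []
    for image_id in image_ids:
        img = "{}/JPEGImages/{}.jpg".format(name, image_id)
        xml = "{}/Annotations/{}.xml".format(name, image_id)
        # One line per image: "<image path> <annotation path>"
        lines.append("{} {}".format(img, xml))
    return lines


print(make_list_lines(["000520"])[0])
# myData/JPEGImages/000520.jpg myData/Annotations/000520.xml
```

Each line of trainval.txt and test.txt under data/myData should have this shape, with paths relative to root_dir.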
ii) Modify labelmap_voc.prototxt
Edit the class information in this file. Class 0 is the background class and does not need to be changed.
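For a dataset with many classes, the label map can be generated rather than edited by hand. The sketch below assumes the usual SSD label map layout (item blocks with name, label and display_name, and label 0 named none_of_the_above for background); the function name make_labelmap and the class names in the usage line are hypothetical.

```python
def make_labelmap(class_names):
    """Return label map text: background as label 0, then one item per class."""
    entries = [("none_of_the_above", "background")]
    entries += [(name, name) for name in class_names]
    items = []
    for label, (name, display_name) in enumerate(entries):
        items.append(
            'item {{\n  name: "{}"\n  label: {}\n  display_name: "{}"\n}}\n'
            .format(name, label, display_name))
    return "".join(items)


text = make_labelmap(["car", "person"])  # hypothetical class names
print(text)
```

The number of item blocks (here 3) is the value to use for num_classes in the training script, since it counts the background class as well.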
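iii) Modify create_data.sh to create the data (in LMDB format) that Caffe uses for training and testing
On success, the LMDB files are generated under /home/XXX/Data/myData (in the lmdb subdirectory, which contains myData_trainval_lmdb and myData_test_lmdb). Two symbolic links pointing to myData_trainval_lmdb and myData_test_lmdb are also created under caffe's examples/myData.
iv) Modify the training/testing scripts
Copy DSOD300_pascal.py, score_DSOD300_pascal.py and DSOD300_detection_demo.py from caffe/examples/dsod/ to caffe/examples/myData, then edit DSOD300_pascal.py for training.
The main changes to DSOD300_pascal.py are: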
-
# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "/home/XXX/Data/myData/lmdb/myData_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "/home/XXX/Data/myData/lmdb/myData_test_lmdb"
-
# The name of the model. Modify it if you want.
# e.g. DSOD300_myData_DSOD300_300x300
model_name = "DSOD300_myData_{}".format(job_name)
# Directory which stores the model .prototxt file.
# e.g. models/DSOD300/myData/DSOD300_300x300
save_dir = "models/DSOD300/myData/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/DSOD300/myData/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/DSOD300/myData/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/Data/myData/results/{}/Main".format(os.environ['HOME'], job_name)
-
# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/myData/test_name_size.txt"
# Stores LabelMapItem.
label_map_file = "data/myData/labelmap_voc.prototxt"
# MultiBoxLoss parameters.
num_classes = 21  ## number of classes + 1 (background)
-
# Solver parameters.
# Defining which GPUs to use.
## Choose one mode for your device.
####################### for multi-GPU mode #############
#gpus = "0,1,2,3,4,5,6,7"
#gpulist = gpus.split(",")
#num_gpus = len(gpulist)
########################################################
####################### for single-GPU mode ############
#gpus = "0"
#gpulist = gpus.split(" ")
#num_gpus = len(gpulist)
########################################################
####################### for CPU mode ###################
num_gpus = 0  ## this is CPU mode
########################################################
# Divide the mini-batch to different GPUs.
batch_size = 4  ## modify according to your device (GPU or CPU)
accum_batch_size = 16  ## modify according to your device (GPU or CPU)
-
# Evaluate on whole test set.
num_test_image = 400  ## the number of your test images
test_batch_size = 2
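The batch settings above feed two derived values in the solver: iter_size (how many mini-batches are accumulated per update) and test_iter (how many test batches cover the test set). The sketch below reproduces that arithmetic as I understand it from the SSD-style training scripts; the function name derive_solver_counts is hypothetical, and the exact rounding used in the real script should be checked there.

```python
import math


def derive_solver_counts(batch_size, accum_batch_size, num_gpus,
                         num_test_image, test_batch_size):
    """Return (batch_size_per_device, iter_size, test_iter)."""
    # CPU mode: the whole mini-batch runs on one device.
    batch_size_per_device = batch_size
    iter_size = accum_batch_size // batch_size
    if num_gpus > 0:
        # GPU mode: split the mini-batch across the GPUs.
        batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
        iter_size = int(math.ceil(
            float(accum_batch_size) / (batch_size_per_device * num_gpus)))
    # Integer division (assumed): a remainder would leave images unevaluated.
    test_iter = num_test_image // test_batch_size
    return batch_size_per_device, iter_size, test_iter


print(derive_solver_counts(4, 16, 0, 400, 2))
# (4, 4, 200)
```

With the values shown above, the gradients of 4 mini-batches of 4 images are accumulated per update, and 200 test iterations of 2 images cover the 400 test images.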
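Note: if training prints a message saying that no objects could be found, the solver parameters need to be modified as follows (test_initialization set to False).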
-
solver_param = {
    # Train parameters
    'base_lr': 10 * base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "multistep",
    'stepvalue': [20000, 40000, 60000, 80000, 100000],
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 100000,
    'snapshot': 2000,
    'display': 20,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 2000,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,  ## modify this to False
}
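Training can now be started. (Roughly, the training parameters are: learning rate = 0.1, multiplied by 0.1 every 20,000 iterations; a snapshot every 2,000 iterations; a maximum of 100,000 iterations; a test pass every 2,000 iterations. See the code for details.)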
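The learning rate schedule described here can be sketched as a small function. This is an illustration of a multistep policy under the stated values (initial rate 0.1, gamma 0.1, a step every 20,000 iterations), not code taken from Caffe; the function name multistep_lr is hypothetical.

```python
def multistep_lr(iteration, base_lr=0.1, gamma=0.1,
                 stepvalue=(20000, 40000, 60000, 80000, 100000)):
    """Learning rate at a given iteration: one factor of gamma per step reached."""
    steps_passed = sum(1 for step in stepvalue if iteration >= step)
    return base_lr * gamma ** steps_passed


for it in (0, 20000, 45000):
    print(it, multistep_lr(it))
```

The rate stays at 0.1 up to iteration 19,999, drops to about 0.01 from iteration 20,000, and to about 0.001 from iteration 40,000.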
-
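3) Testing
i) Modify score_DSOD300_pascal.py
The changes mirror those made to the training script DSOD300_pascal.py and are not repeated here.
ii) Testing a single image with DSOD300_detection_demo.py
The parts to change are: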
# load PASCAL VOC labels
labelmap_file = 'data/myData/labelmap_voc.prototxt'
#Load the net in the test phase for inference, and configure input preprocessing.
model_def = 'models/DSOD300/myData/DSOD300_300x300/deploy.prototxt'
model_weights = 'models/DSOD300/myData/DSOD300_300x300/DSOD300_myData_DSOD300_300x300_iter_68000.caffemodel'
#Load an image.
img = "/home/XXX/Data/myData/test/000520.jpg"
image = caffe.io.load_image(img)
plt.imshow(image)
## Optional: choose a confidence threshold that suits your data.
# Keep detections with confidence of at least 0.6.
top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.6]
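The thresholding step can be combined with the conversion from normalized box coordinates to pixels, which is needed before drawing. The sketch below assumes each detection row has the layout [image_id, label, conf, xmin, ymin, xmax, ymax] with coordinates in [0, 1], as in the SSD detection output; the function name select_boxes is hypothetical.

```python
def select_boxes(detections, width, height, threshold=0.6):
    """Keep rows with conf >= threshold and scale their boxes to pixels.

    Each row is assumed to be
    [image_id, label, conf, xmin, ymin, xmax, ymax] with normalized coordinates.
    """
    boxes = []
    for _, label, conf, xmin, ymin, xmax, ymax in detections:
        if conf >= threshold:
            boxes.append((int(label), conf,
                          int(round(xmin * width)), int(round(ymin * height)),
                          int(round(xmax * width)), int(round(ymax * height))))
    return boxes


rows = [[0, 1, 0.9, 0.1, 0.2, 0.5, 0.6],   # kept
        [0, 2, 0.3, 0.0, 0.0, 1.0, 1.0]]   # dropped: below threshold
print(select_boxes(rows, width=300, height=200))
# [(1, 0.9, 30, 40, 150, 120)]
```

Lowering the threshold keeps more, less certain boxes; raising it keeps fewer.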
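iii) Automatically testing all images in a given directory
Adapt DSOD300_detection_demo.py for this. (OpenCV is used here to read the images.)
Note: OpenCV reads images in BGR channel order, which has to be converted to RGB. In addition, caffe.io.load_image returns values in [0, 1], whereas OpenCV returns values in [0, 255].
The main code changes are: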
image_Ori = cv2.imread(img)
image = image_Ori.copy()  ## working copy used for inference
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = image.astype("float") / 255.
..........
## Draw the bounding box on the original image.
cv2.rectangle(image_Ori, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
# arguments: image, top-left point, bottom-right point, color (BGR), line thickness
cv2.putText(image_Ori, display_txt, (xmin, ymin), cv2.FONT_HERSHEY_PLAIN, 2, (255, 0, 0), 2)
# arguments: image, text, text origin, font, font scale, color (BGR), line thickness
...........
cv2.imwrite(tempsaveImageDir, image_Ori)
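The loop over a directory is elided above, so here is a minimal sketch of the file handling around it: collecting the image paths and deriving an output path for each annotated copy. The helper names list_images and result_path and the accepted extensions are assumptions; the detection code itself is not shown.

```python
import os


def list_images(image_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted full paths of the image files directly inside image_dir."""
    return sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if name.lower().endswith(extensions))


def result_path(image_path, save_dir):
    """Build the output path for the annotated copy of image_path."""
    return os.path.join(save_dir, os.path.basename(image_path))
```

In the demo script, each path from list_images would take the place of img, and result_path(img, save_dir) would give the value passed to cv2.imwrite as tempsaveImageDir.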
-
3. Understanding the paper
See the next blog post for details.
There may be some mistakes in this blog. So, any suggestions and comments are welcome!
【Reference】
[1] https://github.com/weiliu89/caffe/tree/ssd
[2] https://github.com/szq0214/DSOD
[3] http://openaccess.thecvf.com/content_ICCV_2017/papers/Shen_DSOD_Learning_Deeply_ICCV_2017_paper.pdf