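Caffe Learning series:
Caffe Learning (1): Errors and fixes when installing Windows Caffe and Faster RCNN
Caffe Learning (2): Training and testing Caffe MNIST on Windows
Caffe Learning (3): Caffe solver file parameters explained in detail
Caffe Learning (4): Training and testing the Caffe version of DenseNet with Cifar10 on Windows
Caffe Learning (5): Caffe py-Faster-RCNN source code walkthrough (1)
Caffe Learning (7): Adding a custom layer to Caffe (2): Python layer
Caffe Learning (8): Building and debugging the Debug version of Caffe on Windows
Caffe Learning (10): How Solver, Net, Layer, and Blob are constructed in Caffe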
-------------------------------------------------------------------------------------------------------------------------------
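Environment: Windows 10 + CUDA
I installed by following this tutorial:
A step-by-step guide to installing Windows Caffe and py-faster-RCNN from scratch (https://blog.csdn.net/AManFromEarth/article/details/80212554)
During installation I ran into the following problems, which the tutorial does not cover: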
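1. VS Build libcaffe error:
error MSB4062: could not load the task from assembly E:\NugetPackages\OpenCV.2.4.10\......
Fix: in VS, go to Project > Manage NuGet Packages > Updates > OpenCV and choose Update (only OpenCV needs updating; leave the other packages alone). After the update, VS modifies packages.config and reloads the project automatically. Once the project has reloaded, build libcaffe again.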
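2. VS Build libCaffe error:
1>C:\zhh\NugetPackages\OpenCV.2.4.10\build\native\OpenCV.targets(772,5): error : NuGet Error: The operation has timed out
Fix: the company network reaches the internet through a proxy, and the NuGet download did not work over the proxy server or the company network connection. Connect over Wi-Fi to a mobile phone hotspot instead, then build again.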
3. VS Build libCaffe error:
error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -gencode=arch=compute_61,code=\"sm_61,compute_61\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -IC:\zhh\NugetPackages\OpenCV.2.4.11\build\native\../../build/native/include/ -I"C:\zhh\NugetPackages\lmdb-v120-clean.0.9.14.0\build\native\..\..\lib\native\include" -I"C:\zhh\NugetPackages\LevelDB-vc120.1.2.0.0\build\native\../..//build/native/include/" -I"C:\zhh\NugetPackages\protobuf-v120.2.6.1\build\native\../..//build/native/include/" -IC:\zhh\NugetPackages\glog.0.3.3.0\build\native\../..//build/native/include/ -IC:\zhh\NugetPackages\gflags.2.1.2.1\build\native\../..///build/native/include/ -IC:\zhh\NugetPackages\boost.1.59.0.0\build\native\..\..\lib\native\include\ -I"C:\zhh\NugetPackages\hdf5-v120-complete.1.8.15.2\build\native\..\..\lib\native\include" -IC:\zhh\NugetPackages\OpenBLAS.0.2.14.1\build\native\..\..\lib\native\include -I"C:\zhh\caffe-master\windows\libcaffe\\..\..\src\\" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" --keep-dir C:\zhh\caffe-master\windows\..\Build\Int\libcaffe\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -Xcudafe "--diag_suppress=exception_spec_override_incompat --diag_suppress=useless_using_declaration --diag_suppress=field_without_dll_interface" -D_SCL_SECURE_NO_WARNINGS -DGFLAGS_DLL_DECL= -DHAS_OPENCV -DHAS_LMDB -DHAS_HDF5 -DHAS_OPENBLAS -DNDEBUG -D_SCL_SECURE_NO_WARNINGS -DUSE_OPENCV -DUSE_LEVELDB -DUSE_LMDB -DWITH_PYTHON_LAYER -DBOOST_PYTHON_STATIC_LIB -DUSE_CUDNN -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W1 /nologo /Ox /FS /Zi /MD " -o C:\zhh\caffe-master\windows\..\Build\Int\libcaffe\x64\Release\roi_pooling_layer.cu.obj "C:\zhh\caffe-master\src\caffe\layers\roi_pooling_layer.cu"" exited with code 2.
1>C:\zhh\caffe-master\include\caffe/util/cudnn.hpp(114): error : too few arguments in function call
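Fix:
cuDNN 6 added a data-type argument to cudnnSetConvolution2dDescriptor, so the older call has too few arguments. In C:\zhh\caffe-master\include\caffe\util\cudnn.hpp, change line 114 to: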
#if CUDNN_VERSION_MIN(6, 0, 0)
CUDNN_CHECK(cudnnSetConvolution2dDescriptor(*conv,
pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION,
dataType<Dtype>::type));
#else
CUDNN_CHECK(cudnnSetConvolution2dDescriptor(*conv,
pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
#endif
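To make the version guard above easier to follow, here is a minimal Python sketch of the comparison that CUDNN_VERSION_MIN performs. It assumes the version is packed as major * 1000 + minor * 100 + patch, which holds for cuDNN releases before 9.x as far as I know; the function names are made up for this illustration and are not part of Caffe or cuDNN.

```python
# Assumption: cuDNN (before 9.x) packs its version into one integer as
# major * 1000 + minor * 100 + patch, for example 6.0.21 -> 6021.


def cudnn_version_code(major, minor, patch):
    """Pack a version triple the way CUDNN_VERSION is assumed to be packed."""
    return major * 1000 + minor * 100 + patch


def cudnn_version_min(installed_code, major, minor, patch):
    """Python stand-in for the CUDNN_VERSION_MIN(major, minor, patch) test."""
    return installed_code >= cudnn_version_code(major, minor, patch)


def needs_data_type_argument(installed_code):
    """True when the #if branch (the call with dataType<Dtype>::type) is compiled."""
    return cudnn_version_min(installed_code, 6, 0, 0)
```

With this encoding, an installed cuDNN 6.x takes the first branch and passes the extra argument, while a 5.x install falls through to the original call.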
4. VS Build libCaffe error:
1>C:\zhh\NugetPackages\glog.0.3.3.0\build\native\glog.targets(346,5): error MSB4062: The "NuGetPackageOverlay" task could not be loaded from the assembly C:\zhh\NugetPackages\gflags.2.1.2.1\build\native\\private\coapp.NuGetNativeMSBuildTasks.dll. Could not load file or assembly 'file:///C:\zhh\NugetPackages\gflags.2.1.2.1\build\native\private\coapp.NuGetNativeMSBuildTasks.dll' or one of its dependencies. The system cannot find the file specified. Confirm that the <UsingTask> declaration is correct, that the assembly and all its dependencies are available, and that the task contains a public class that implements Microsoft.Build.Framework.ITask.
Fix:
This error comes up often when rebuilding. Delete C:\zhh\NugetPackages\gflags.2.1.2.1, then build again.
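Since this cleanup tends to be repeated on every rebuild, it can be scripted. The following is only a sketch: remove_stale_package is a hypothetical helper name, and the package root and folder name are the ones from the error message above, so adjust them to your own machine.

```python
import shutil
from pathlib import Path


def remove_stale_package(packages_root, package_dir_name):
    """Delete one package folder under the NuGet packages root.

    Returns True if a folder was removed, False if it was not there.
    """
    target = Path(packages_root) / package_dir_name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

For the case above the call would be remove_stale_package(r"C:\zhh\NugetPackages", "gflags.2.1.2.1"), followed by a rebuild in VS.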
5. Running Faster RCNN's train_net.py in PyCharm fails:
[0.1 0.1 0.2 0.2]
[0.1 0.1 0.2 0.2]
[0.1 0.1 0.2 0.2]
[0.1 0.1 0.2 0.2]]
[0.1 0.1 0.2 0.2]
Normalizing targets
done
*** Check failure stack trace: ***
Checking shows that the following statement in train.py is the one that fails:
self.solver = caffe.SGDSolver(solver_prototxt)
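Fix:
My guess is that the solver_prototxt path is the problem. In PyCharm, open Run > Edit Configurations, change the working directory from C:\zhh\py-faster-rcnn\tools to C:\zhh\py-faster-rcnn, and change the paths in the parameters to relative paths, for example: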
--gpu 0 --solver .\models\pascal_voc\VGG16\faster_rcnn_end2end\solver.prototxt --imdb voc_2007_trainval --iters 100000 --weights .\output\faster_rcnn_end2end\voc_2007_trainval\vgg16_faster_rcnn_iter_30000_1.caffemodel --cfg .\experiments\cfgs\faster_rcnn_end2end.yml
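The reason the working directory matters can be shown with a small sketch: the same relative path resolves to different files depending on where the script is started. The two helper names below are hypothetical, and ntpath is used only so that the Windows-style paths behave the same on any platform. require_file is one possible way to get a readable message before the path reaches caffe.SGDSolver, instead of the bare check failure shown above.

```python
import ntpath
import os


def resolve_windows_path(working_dir, relative_path):
    """Resolve relative_path the way Windows would when started in working_dir."""
    return ntpath.normpath(ntpath.join(working_dir, relative_path))


def require_file(path):
    """Raise a readable error if path does not point to an existing file."""
    if not os.path.isfile(path):
        raise IOError("file not found: {} (cwd: {})".format(path, os.getcwd()))
    return path
```

Started from C:\zhh\py-faster-rcnn\tools, a path such as .\models\... points below tools, where no models folder exists; started from C:\zhh\py-faster-rcnn, it points at the real folder.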
6. Running train_net.py fails:
File "C:\zhh\py-faster-rcnn\tools\..\lib\roi_data_layer\layer.py", line 87, in setup
layer_params = yaml.load(self.param_str_)
AttributeError: 'RoIDataLayer' object has no attribute 'param_str_'
Fix:
Use Notepad++ to replace every self.param_str_ with self.param_str in all .py files.
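If Notepad++ is not at hand, the same rename can be done with a short script. This is a sketch under the assumption that the .py files are UTF-8 (or plain ASCII); rename_param_str and rename_in_tree are names invented for this example.

```python
import re
from pathlib import Path

# Match self.param_str_ only as a whole attribute name, so a file that
# already uses self.param_str is left untouched.
OLD_ATTR = re.compile(r"\bself\.param_str_\b")


def rename_param_str(source):
    """Return source with every self.param_str_ replaced by self.param_str."""
    return OLD_ATTR.sub("self.param_str", source)


def rename_in_tree(root):
    """Rewrite all .py files under root in place and return the changed paths."""
    changed = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        updated = rename_param_str(text)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Running rename_in_tree on the py-faster-rcnn checkout a second time should report no changed files, which makes it safe to repeat.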
7. Running train_net.py fails:
Traceback (most recent call last):
File "C:/zhh/py-faster-rcnn/tools/train_net.py", line 112, in <module>
max_iters=args.max_iters)
File "C:\zhh\py-faster-rcnn\tools\..\lib\fast_rcnn\train.py", line 159, in train_net
model_paths = sw.train_model(max_iters)
File "C:\zhh\py-faster-rcnn\tools\..\lib\fast_rcnn\train.py", line 100, in train_model
self.solver.step(1)
File "C:\zhh\py-faster-rcnn\tools\..\lib\rpn\proposal_target_layer.py", line 66, in forward
rois_per_image, self._num_classes)
File "C:\zhh\py-faster-rcnn\tools\..\lib\rpn\proposal_target_layer.py", line 191, in _sample_rois
_get_bbox_regression_labels(bbox_target_data, num_classes)
File "C:\zhh\py-faster-rcnn\tools\..\lib\rpn\proposal_target_layer.py", line 127, in _get_bbox_regression_labels
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
TypeError: slice indices must be integers or None or have an __index__ method
Fix:
In C:\zhh\py-faster-rcnn\lib\rpn\proposal_target_layer.py, change line 60 to:
fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(int)
On lines 125 and 126, change:
start = 4 * cls
end = start + 4
to:
start = int(4 * cls)
end = int(start + 4)
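The error and the fix can be reproduced without Caffe. In the sketch below a plain Python list and a Python float stand in for the NumPy array and the class label, which is presumably stored as a float in bbox_target_data; both function names are hypothetical.

```python
def fill_targets_unfixed(row, cls, values):
    """Original logic: a float class label produces float slice bounds."""
    start = 4 * cls
    end = start + 4
    row[start:end] = values  # raises TypeError when cls is a float
    return row


def fill_targets(row, cls, values):
    """Fixed logic: cast the slice bounds to int before slicing."""
    start = int(4 * cls)
    end = int(start + 4)
    row[start:end] = values
    return row
```

Calling fill_targets_unfixed with cls=2.0 raises the same "slice indices must be integers" TypeError as in the traceback, while fill_targets writes the four values into positions 8 to 11.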
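8. Installing with setup in the lib directory
Run the following in the lib directory:
python setup.py install
python setup.py build_ext --inplace (this command generates the CPU version of the .pyd files)
python setup_cuda.py install
python setup_cuda.py build_ext --inplace (this command generates the GPU version of the .pyd files)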
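After the two build_ext --inplace commands, one way to confirm that the extension modules were actually produced is to list the .pyd files under lib. This is a small sketch with a hypothetical helper name; which files should appear depends on your checkout, so it only reports what it finds.

```python
from pathlib import Path


def find_built_extensions(lib_dir, suffix=".pyd"):
    """Return the extension modules under lib_dir as sorted relative paths."""
    root = Path(lib_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*" + suffix))
```

An empty list after the build suggests that the compile step failed or that the files were written somewhere other than lib.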