Deploying resnet-ssd with TensorRT.md

Deploying mobilenet-ssd with TensorRT

I. Rewrite deploy.prototxt as deploy_plugin.prototxt

1. Remove every param{} block from the convolution layers, and remove weight_filler{} and bias_filler{} from convolution_param.

2. Change the type of each custom layer to IPlugin; the parameters of the custom layer go into the newly written plugin class.

3. Remove detection_output_param from the SSD detection_out layer, then add a second output, top: detection_out2.
In TensorRT this layer has two outputs by default, so with only one top the program fails with:
"Plugin layer output count is not equal to caffe output count".

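II. Write the code for the custom layers

1. First implement a PluginFactory class that inherits from IPluginFactory, and declare the custom layers you need inside the class.

2. Implement bool PluginFactory::isPlugin(const char* name)

3. Implement

nvinfer1::IPlugin* PluginFactory::createPlugin(const char* layerName, const nvinfer1::Weights* weights, int nbWeights)

Use if statements to decide which layer is being requested, pass that layer's parameters in the matching branch, and return the pointer with return mXXXX_layer.get().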
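
The dispatch described in step 3 can be sketched without the TensorRT headers. This is only an illustration under stated assumptions: `FakePlugin`, `SketchFactory`, and both layer names are hypothetical stand-ins (a real factory would derive from IPluginFactory and return nvinfer1::IPlugin*). It shows matching on the layer name, building each layer once, and returning a raw pointer while the factory keeps ownership.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for nvinfer1::IPlugin so the sketch compiles on its own.
struct FakePlugin {
    std::string kind;
    explicit FakePlugin(std::string k) : kind(std::move(k)) {}
};

// Hypothetical factory; the layer names below are assumptions for illustration.
class SketchFactory {
public:
    // Mirrors isPlugin(): true only for layers this factory knows how to build.
    bool isPlugin(const char* name) const {
        return std::strcmp(name, "conv11_mbox_priorbox") == 0 ||
               std::strcmp(name, "detection_out") == 0;
    }

    // Mirrors createPlugin(): branch on the layer name, build once, return a raw pointer.
    FakePlugin* createPlugin(const char* layerName) {
        assert(layerName != nullptr);
        if (std::strcmp(layerName, "conv11_mbox_priorbox") == 0) {
            if (!mPriorBoxLayer) mPriorBoxLayer.reset(new FakePlugin("PriorBox"));
            return mPriorBoxLayer.get();
        }
        if (std::strcmp(layerName, "detection_out") == 0) {
            if (!mDetectionOutLayer) mDetectionOutLayer.reset(new FakePlugin("DetectionOutput"));
            return mDetectionOutLayer.get();
        }
        return nullptr;  // not a custom layer
    }

private:
    // The factory owns the plugins; callers only borrow the pointers.
    std::unique_ptr<FakePlugin> mPriorBoxLayer;
    std::unique_ptr<FakePlugin> mDetectionOutLayer;
};
```

Keeping the layers in std::unique_ptr members is what makes return mXXXX_layer.get() safe: the pointer stays valid for as long as the factory object lives.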

III. Standard TensorRT workflow

1. caffeToTRTModel

void TensorNet::caffeToTRTModel(const std::string& deployFile, const std::string& modelFile, const std::vector<std::string>& outputs,
                                unsigned int maxBatchSize)
{
    // Create the builder and an empty network definition.
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    // Parse the Caffe model; custom layers are created through pluginFactory.
    ICaffeParser* parser = createCaffeParser();
    parser->setPluginFactory(&pluginFactory);

    // FP16 is deliberately disabled here, so the weights are parsed as FP32.
    bool useFp16 = builder->platformHasFastFp16();
    useFp16 = false;
    DataType modelDataType = useFp16 ? DataType::kHALF : DataType::kFLOAT;

    std::cout << deployFile << std::endl;
    std::cout << modelFile << std::endl;
    const IBlobNameToTensor* blobNameToTensor = parser->parse(deployFile.c_str(), modelFile.c_str(), *network, modelDataType);
    assert(blobNameToTensor != nullptr);

    // Mark every requested blob as a network output.
    for (auto& s : outputs)
    {
        ITensor* tensor = blobNameToTensor->find(s.c_str());
        assert(tensor != nullptr);  // the output name must exist in the prototxt
        network->markOutput(*tensor);
    }

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(10 << 20);  // 10 MiB of workspace
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    assert(engine);

    // The network and parser are no longer needed once the engine is built.
    network->destroy();
    parser->destroy();

    // Keep the serialized engine and release everything else.
    gieModelStream = engine->serialize();
    engine->destroy();
    builder->destroy();
    pluginFactory.destroyPlugin();
    shutdownProtobufLibrary();
}

If this function runs to completion without errors, the changes to the network structure are fine.

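Problem 1:
I am currently stuck at const IBlobNameToTensor* blobNameToTensor = parser->parse(). QtCreator reports "The program has unexpectedly finished." It is probably an illegal memory access or something like that, and the debugger is no help either.
2018-10-10: This one was just me being an idiot. I made a mistake when rewriting the prototxt for the plugins: one layer had no input. I did not spot it when checking the file, and TensorRT simply hit a segmentation fault at run time.
To sum up: if you hit a problem and neither Baidu nor Google turns up anything relevant, there are only two possibilities. Either the problem is so advanced that nobody else has found it yet, or it is so dumb that whoever found it did not bother to write it up. For me the odds of hitting the advanced kind are small, so it is usually my own dumb mistake. When hunting this kind of bug, assume you are the idiot and check the places where you are sure you could not have made a mistake.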
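
A layer left without an input, the root cause of Problem 1, can be caught before TensorRT ever sees the file. The following is a minimal sketch, not the Caffe or TensorRT parser: it is plain text scanning that assumes the prototxt opens each block with `layer {` and keeps one field per line. The function name `layersWithoutBottom` is hypothetical. Layers of type "Input" are exempt; any other layer without a bottom: entry is reported for manual review.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Extracts the quoted value from a line such as:  name: "conv1"
static std::string quotedValue(const std::string& line) {
    const std::size_t first = line.find('"');
    const std::size_t last = line.rfind('"');
    if (first == std::string::npos || last <= first) return "";
    return line.substr(first + 1, last - first - 1);
}

// Returns the names of layer blocks that have no "bottom:" field.
std::vector<std::string> layersWithoutBottom(const std::string& prototxt) {
    std::vector<std::string> suspicious;
    std::istringstream in(prototxt);
    std::string line, name, type;
    int depth = 0;          // current brace nesting level
    bool inLayer = false;   // true while inside a top-level layer block
    bool hasBottom = false;

    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;        // blank line
        const std::string text = line.substr(start);
        if (text[0] == '#') continue;                    // comment line

        if (depth == 0 && text.compare(0, 5, "layer") == 0) {
            inLayer = true;
            hasBottom = false;
            name.clear();
            type.clear();
        }
        // Only fields directly inside the layer block count, not nested params.
        if (inLayer && depth == 1) {
            if (text.compare(0, 5, "name:") == 0) name = quotedValue(text);
            else if (text.compare(0, 5, "type:") == 0) type = quotedValue(text);
            else if (text.compare(0, 7, "bottom:") == 0) hasBottom = true;
        }
        for (char c : text) {
            if (c == '{') {
                ++depth;
            } else if (c == '}' && --depth == 0 && inLayer) {
                // The layer block just closed: record it if it has no input.
                if (!hasBottom && type != "Input") suspicious.push_back(name);
                inLayer = false;
            }
        }
    }
    return suspicious;
}
```

Running a check like this over deploy_plugin.prototxt after every manual edit is cheap compared with debugging a segmentation fault inside parser->parse().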

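Problem 2:

ICudaEngine* engine = builder->buildCudaEngine( *network );
assert(engine);

Assertion error: creating the engine failed.
In the end this turned out to be another silly mistake, and again it was in the prototxt: after changing the number of classes, the corresponding num_output values no longer matched. Once they were fixed it worked.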
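
The num_output values that have to change with the class count follow the usual SSD head arithmetic, which can be written down as a small check. This is a sketch under stated assumptions: the function names are hypothetical, numClasses includes the background class, and box locations are shared across classes (share_location: true, the common SSD setting).

```cpp
#include <cassert>

// Expected num_output of an SSD confidence head:
// one score per class for every prior at a feature-map location.
int confNumOutput(int priorsPerLocation, int numClasses) {
    assert(priorsPerLocation > 0 && numClasses > 1);
    return priorsPerLocation * numClasses;
}

// Expected num_output of an SSD localization head:
// 4 box offsets per prior (assumes shared locations).
int locNumOutput(int priorsPerLocation) {
    assert(priorsPerLocation > 0);
    return priorsPerLocation * 4;
}
```

For example, a head with 6 priors per location and 5 classes (4 + background) needs num_output: 30 on its confidence convolution and num_output: 24 on its localization convolution. Only the confidence heads depend on the class count, so those are the values to recheck after retraining on a different label set.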

Problem 3:
virtual void nvinfer1::plugin::DetectionOutput::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion `numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0]' failed.
This one looks like it will be tricky; it is currently being discussed on GitHub:
https://github.com/dusty-nv/jetson-inference/issues/171
https://github.com/chenzhi1992/TensorRT-SSD/issues/26
I tried the approach suggested there: I upgraded to TensorRT 4.0 and added inputOrder = {0,1,2} to the parameters of the detection_out layer, but the result was the same and the error persists.
I printed the dimensions of the three inputs of the detection_out layer: mbox_conf_softmax, mbox_loc and mbox_priorbox.
I have 5 classes here (4 + background).
Dimensions are in C H W order:
mbox_conf_softmax ------> 5 1917 1
mbox_loc ------> 7668 1 1
mbox_priorbox ------> 2 10184 1
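
The failing assertion, numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0], can be replayed by hand on the three printed shapes. The sketch below rests on two assumptions that are not confirmed by the source: that the plugin derives numPriors from the second dimension of the prior box blob divided by 4, and that numLocClasses is 1 because box locations are shared across classes. The function names are hypothetical.

```cpp
#include <cassert>

// Prior count implied by a PriorBox blob of shape 2 x (numPriors * 4) x 1.
// Assumption: this is how the plugin derives numPriors.
int priorsFromPriorBox(int priorBoxH) {
    assert(priorBoxH % 4 == 0);
    return priorBoxH / 4;
}

// Re-statement of the failing assertion using the printed dimensions.
// locC is the channel count of mbox_loc; numLocClasses is assumed to be 1.
bool locMatchesPriors(int locC, int priorBoxH, int numLocClasses) {
    return priorsFromPriorBox(priorBoxH) * numLocClasses * 4 == locC;
}
```

If those assumptions hold, the printed numbers do not agree with each other. mbox_loc has 7668 = 1917 * 4 values, which matches the 1917 priors seen in mbox_conf_softmax, while mbox_priorbox has 10184 = 2546 * 4 values, so locMatchesPriors(7668, 10184, 1) is false. That would point at the PriorBox layers producing a different number of priors than the loc and conf heads, rather than at the input order.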
