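Image Scene Type Recognition (PlaceCNN) in Practice

  Determining the type of place shown in an image is a common image-understanding task. Given enough scene-labeled data, it is essentially an image classification problem, so an existing, mature network architecture such as ResNet can be used directly to recognize the place in an image with fairly high accuracy.

  The data and models used here come from: http://places2.csail.mit.edu/download.html

  The dataset covers 365 scene categories and also provides pre-trained models for several network architectures, mainly the following: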

Pre-trained CNN models on Places365-Standard:

  • AlexNet-places365: deploy weights
  • GoogLeNet-places365: deploy weights
  • VGG16-places365: deploy weights
  • VGG16-hybrid1365: deploy weights
  • ResNet152-places365 fine-tuned from ResNet152-ImageNet: deploy weights
  • ResNet152-hybrid1365: deploy weights
  • ResNet152-places365 trained from scratch using Torch: torch model, converted caffemodel: deploy weights. It is the original ResNet with 152 layers. On the validation set, the top-1 error is 45.26% and the top-5 error is 15.02%.
  • ResNet50-places365 trained from scratch using Torch: torch model. It is a pre-activation (Preact) ResNet with 50 layers. The top-1 error is 44.82% and the top-5 error is 14.71%.
  • To use the AlexNet and VGG16 caffemodels in Torch, use the Torch library loadcaffe, which can load the Caffe model directly (the commands are not reproduced here). Note that the input image scale should be 0-255, which is different from the 0-1 scale used by the ResNet Torch models trained from scratch in fb.resnet.torch.
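The top-1 and top-5 error figures quoted in the list above count a prediction as correct when the true class is among the k highest-scoring classes. A minimal sketch of that calculation in plain Python follows; the function name `top_k_error` and the toy scores are assumptions made for illustration and are not part of the Places365 release.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int],
                k: int) -> float:
    """Fraction of samples whose true label is NOT among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data: 4 samples over 4 hypothetical classes (not real model outputs).
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1 is ranked 1st
    [0.5, 0.3, 0.1, 0.1],  # true class 1 is ranked 2nd
    [0.4, 0.3, 0.2, 0.1],  # true class 3 is ranked 4th
    [0.1, 0.2, 0.3, 0.4],  # true class 3 is ranked 1st
]
toy_labels = [1, 1, 3, 3]

top1 = top_k_error(toy_scores, toy_labels, k=1)
top2 = top_k_error(toy_scores, toy_labels, k=2)
print(f"top-1 error: {top1:.2f}, top-2 error: {top2:.2f}")
```

With a real model, `scores` would hold one row of 365 class scores per validation image and `k` would be 1 or 5.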


2. Experimental results



The place in the image above is classified as a bar, a restaurant, or a coffee shop.



This is a test photo from the dataset, labeled as a conference room.




The recognition of this waiting hall is also very accurate.


See: https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec=1523878667027&di=287398ec5e55869341ba2747794612a3&imgtype=0&src=http%3A%2F%2Fimg.pconline.com.cn%2Fimages%2Fphotoblog%2F8%2F7%2F0%2F0%2F8700542%2F20094%2F30%2F1241086150942.jpg



The basketball court is also within the top three predictions.




The harbor and dock categories are also among the top few predictions.
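Results such as "within the top three" come from ranking the classifier's output scores for a single image. The sketch below shows one way to turn raw scores into a ranked list of labels with softmax probabilities; the class names and score values are hypothetical, chosen only for illustration, and are not outputs of any Places365 model.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits: Sequence[float],
                 names: Sequence[str],
                 k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and scores for one image (assumed values).
names = ["basketball_court", "gymnasium", "stadium", "harbor", "conference_room"]
logits = [2.0, 2.5, 1.0, -1.0, 0.0]

top3 = top_k_labels(logits, names, k=3)
in_top3 = "basketball_court" in [name for name, _ in top3]

for name, prob in top3:
    print(f"{prob:.3f}  {name}")
print("basketball_court in top 3:", in_top3)
```

In this toy case the expected label is ranked second, so it would count as a top-3 hit but a top-1 miss.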


Reposted from blog.csdn.net/sparkexpert/article/details/79962804