1. Basic Information
Title: Integrating Stereo Vision with a CNN Tracker for a Person-Following Robot
Year: 2017
Venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Keywords: CNN tracker, person-following robot, tracking, stereo vision
Links:
homepage: None
arXiv (paper): None
github: None
2. Background
Person-following robots have many applications, such as an autonomous cart in a grocery store [26], a personal guide in a hospital, or an autonomous suitcase in an airport [1]. A person-following robot in a dynamic environment must solve the tracking problem under a variety of challenging conditions: appearance change, illumination change, occlusion, and pose changes such as crouching or changing clothes. The paper tracks a given target under these conditions with a convolutional neural network (CNN) trained online. The tracked target may move around a corner and disappear from the robot's field of view; this is handled by estimating the target's most recent positions and having the robot replicate the target's local path while the target is not visible in the current frame. The robot used is a Pioneer 3AT equipped with a stereo camera, and the method was tested with two stereo cameras: a Point Grey Bumblebee2 and a ZED stereo camera.
3. Contributions
3.1 Overview
(1) A person-following robot application that tracks with a CNN trained online in real time (about 20 fps) using both RGB images and stereo depth images.
(2) A following behaviour in which the robot estimates and replicates the target's local path, so that the person can still be followed even while temporarily outside the camera's field of view.
(3) A novel stereo dataset for the person-following task.
3.2 Method Details
This section describes the proposed CNN models and the learning procedure. The CNN input consists of the RGB channels plus a depth channel computed from the stereo images, referred to as RGBSD (RGB + Stereo Depth). The stereo depth (SD) is computed with the ZED SDK. The CNN tracker outputs the target's depth and centroid, which the robot's navigation module uses to follow the target and, when needed, to replicate its path.
3.2.1 CNN Models with RGBSD Images
- The first model (CNN v1) feeds the RGBSD layers to the ConvNet as a single image. Like a conventional CNN architecture, the network contains convolutional layers, fully connected layers, and an output layer (see Fig. 1).
- The second model (CNN v2) uses two convolutional streams: the RGB channels are the input to one stream, and the stereo depth image is the input to the other (see Fig. 1). The fully connected layers take as input the concatenation of the flattened outputs of the two streams.
- The third ConvNet (CNN v3) is a CNN on regular RGB images only; its structure is similar to that of the first model.

The initialization and update of the CNN tracker are described next.
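As a rough illustration of the two-stream idea in CNN v2, the sketch below runs one toy convolutional layer per stream in plain NumPy and concatenates the flattened activations before a fully connected softmax layer. The layer counts, filter sizes, and weights here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2D convolution: x (C,H,W), w (F,C,k,k) -> (F,H-k+1,W-k+1)."""
    C, H, W = x.shape
    F, _, k, _ = w.shape
    out = np.zeros((F, H - k + 1, W - k + 1))
    for f in range(F):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[f, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[f])
    return out

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cnn_v2_forward(rgb, sd, w_rgb, w_sd, w_fc):
    """Two-stream forward pass: one conv stream for the RGB channels, one for
    the stereo-depth channel; the flattened activations are concatenated
    before the fully connected (softmax) layer."""
    feat_rgb = relu(conv2d(rgb, w_rgb)).ravel()
    feat_sd = relu(conv2d(sd, w_sd)).ravel()
    joint = np.concatenate([feat_rgb, feat_sd])
    return softmax(w_fc @ joint)  # 2-way output: target vs. background

rng = np.random.default_rng(0)
rgb = rng.random((3, 8, 8))                       # RGB patch
sd = rng.random((1, 8, 8))                        # stereo-depth patch
w_rgb = rng.standard_normal((4, 3, 3, 3)) * 0.1   # toy filters, RGB stream
w_sd = rng.standard_normal((4, 1, 3, 3)) * 0.1    # toy filters, SD stream
w_fc = rng.standard_normal((2, 2 * 4 * 6 * 6)) * 0.1
probs = cnn_v2_forward(rgb, sd, w_rgb, w_sd, w_fc)
```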
Initial training set selection: To track a person with a CNN model, the CNN classifier must first be initialized. Initialization starts from scratch with random weights.
(1) A predefined rectangular bounding box is placed at the centre of the first frame. To activate the following behaviour, a person must stand inside the bounding box at a certain distance from the robot; alternatively, the target to follow can be selected manually. (2) Once the CNN is activated, the patch inside the bounding box is labelled class 1, and the patches around the bounding box are labelled class 0.
(3) Since the two classes are highly imbalanced, n patches are selected uniformly from class 0 and the class-1 patch is replicated n times to form the training set (n = 40 in the experiments). This initial training set is used to train the CNN classifier until it reaches high accuracy on the training set, which can strongly overfit the classifier. To cope with this overfitting, it is assumed that the target's pose and appearance do not change significantly during the first 50 frames (about 2-3 seconds).
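The class-balancing step above can be sketched as follows; patches are stand-in objects and the function name is mine, not the paper's:

```python
import random

def build_initial_training_set(target_patch, background_patches, n=40, seed=0):
    """Balance the two classes: uniformly sample n class-0 (background)
    patches and replicate the single class-1 (target) patch n times."""
    rng = random.Random(seed)
    negatives = rng.sample(background_patches, n)  # uniform, without replacement
    positives = [target_patch] * n                 # replicate the one target patch
    return positives + negatives, [1] * n + [0] * n

background = [f"bg_{i}" for i in range(200)]
patches, labels = build_initial_training_set("target_patch", background)
```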
Test set selection: Once the CNN classifier has been initialized or updated, it is used to detect the target in the next frame. When a new frame and its stereo depth layer become available, test patches are searched within a local image region, as shown in Fig. 2(a). The search space is also constrained in depth, as shown in Fig. 2(b): patches whose depth is not within ±α of the previous depth are discarded (Fig. 2(c)), where α is the extent of the search region along the depth axis (α = 0.25 m was used). This filters out most of the background patches before they are passed to the CNN classifier. Only the highest class-1 response is taken as the target in the current frame. If no target is detected for 0.5 s (i.e., the highest class-1 response stays below 0.5), the tracker enters a target-lost mode and the whole image is scanned to create the test set.
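The depth gating can be sketched as a simple filter over candidate patches; the dictionary layout of a patch here is an assumption for illustration:

```python
def select_test_patches(candidates, prev_depth, alpha=0.25):
    """Keep only patches whose stereo depth lies within +/- alpha metres of
    the target's depth in the previous frame; the rest (mostly background)
    are filtered out before being fed to the CNN classifier."""
    return [p for p in candidates
            if p["depth"] is not None and abs(p["depth"] - prev_depth) <= alpha]

candidates = [
    {"box": (10, 10), "depth": 2.10},
    {"box": (40, 10), "depth": 2.55},   # too far behind the previous depth
    {"box": (70, 10), "depth": None},   # no valid stereo depth
    {"box": (10, 40), "depth": 1.95},
]
kept = select_test_patches(candidates, prev_depth=2.0)
```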
Classifier update: To update the classifier, a new training set must be selected. The update step is performed only when the detection step finds the target (class 1) in the test set. For robustness, the most recent 50 class-1 patches from previous frames are kept in a class-1 patch pool implemented as a first-in-first-out queue; the patches around the target form the class-0 patch pool. For the new training set, n patches are again selected uniformly from the class-0 pool. To select n patches from the class-1 pool, patches are sampled according to a Poisson distribution with λ = 1.0 and k = ⌊queue index / 10⌋ (see Eq. 1 and Fig. 3), which gives patches from the recent history a higher probability of being selected than older patches. This training set is used to update the classifier. The Poisson-based sampling of class-1 patches avoids overfitting and provides a chance to recover from false detections in previous frames.
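Under the reading above (queue index 0 = most recent patch, k = ⌊index / 10⌋, λ = 1.0; the indexing direction is my assumption), the Poisson-weighted sampling could look like this sketch:

```python
import math
import random

def poisson_pmf(k, lam=1.0):
    """P(K = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def sample_class1_patches(pool, n=40, lam=1.0, seed=0):
    """Sample n patches (with replacement) from the FIFO class-1 pool,
    assumed ordered newest first. A patch at queue index i gets weight
    Poisson(k; lam) with k = i // 10, so recent patches are drawn with
    higher probability than older ones."""
    weights = [poisson_pmf(i // 10, lam) for i in range(len(pool))]
    rng = random.Random(seed)
    return rng.choices(pool, weights=weights, k=n)

pool = list(range(50))   # the 50 most recent class-1 patches, newest first
sampled = sample_class1_patches(pool)
```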
3.2.2 Navigation of the Robot
There are two cases:
(i) the robot can see the target (person) in the image, in which case a proportional-integral-derivative (PID) controller is used;
(ii) the robot cannot see the target, in which case the target's path is replicated.
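Case (i) can be sketched with a textbook PID controller; the gains, the error definitions, and the split into a linear and an angular controller are illustrative assumptions, not values from the paper:

```python
class PID:
    """Textbook PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None \
            else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Keep the target centred in the image and at a fixed following distance.
angular = PID(kp=1.2, ki=0.0, kd=0.1)   # error: centroid offset from centre
linear = PID(kp=0.8, ki=0.05, kd=0.0)   # error: depth minus desired distance
v = linear.step(2.6 - 1.5, dt=0.05)     # target at 2.6 m, want 1.5 m
w = angular.step(-0.12, dt=0.05)        # centroid slightly left of centre
```

With the target too far away and left of centre, the sketch commands a positive forward velocity and a negative (leftward) turn rate.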
Localization: Localizing the robot requires estimating the robot's pose in a global coordinate frame; in the 2D case this is the robot's x, y position and orientation θ. The robot must maintain an estimate of its own pose even when it encounters dynamic obstacles.
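Maintaining the (x, y, θ) estimate can be sketched as simple dead reckoning for a differential-drive robot; this Euler integration is an illustration, not the paper's actual localization method:

```python
import math

def update_pose(x, y, theta, v, w, dt):
    """Dead-reckoning pose update in the global frame: integrate linear
    velocity v and angular velocity w over a time step dt, wrapping the
    heading back into (-pi, pi]."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta = (theta + w * dt + math.pi) % (2 * math.pi) - math.pi
    return x, y, theta

# Drive straight along +x for one second at 0.5 m/s.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = update_pose(*pose, v=0.5, w=0.0, dt=0.1)
```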
4. Experimental Results
5. Conclusions and Thoughts
5.1 The Authors' Conclusions
5.2 Highlights of the work and possible improvements: the robot can only follow the target; it cannot avoid obstacles.
References
1. Alves-Oliveira, P., Paiva, A.: A study on trust in a robotic suitcase. In: Social Robotics: 8th International Conference, ICSR 2016, Kansas City, MO, USA, November 1-3, 2016, Proceedings. vol. 9979, p. 179. Springer (2016)
2. Awai, M., Shimizu, T., Kaneko, T., Yamashita, A., Asama, H.: HOG-based person following and autonomous returning using generated map by mobile robot equipped with camera and laser range finder. In: Intelligent Autonomous Systems 12, pp. 51–60. Springer (2013)
3. Borenstein, J., Feng, L.: UMBmark: A benchmark test for measuring odometry errors in mobile robots. In: Photonics East '95. pp. 113–124. International Society for Optics and Photonics (1995)
4. Calisi, D., Iocchi, L., Leone, R.: Person following through appearance models and stereo vision using a mobile robot. In: VISAPP (Workshop on Robot Vision). pp. 46–56 (2007)
5. Camplani, M., Hannuna, S.L., Mirmehdi, M., Damen, D., Paiement, A., Tao, L., Burghardt, T.: Real-time RGB-D tracking with depth scaling kernelised correlation filters and occlusion handling. In: British Machine Vision Conference, Swansea, UK, September 7-10, 2015. BMVA Press (2015)
6. Chen, B.X., Sahdev, R., Tsotsos, J.K.: Person following robot using selected online Ada-boosting with stereo camera. In: Computer and Robot Vision (CRV), 2017 14th Conference on. pp. 48–55. IEEE (2017)
7. Chen, Z., Birchfield, S.T.: Person following with a mobile robot using binocular feature-based tracking. In: Intelligent Robots and Systems, 2007. IROS 2007. IEEE/RSJ International Conference on. pp. 815–820. IEEE (2007)
8. Chivilò, G., Mezzaro, F., Sgorbissa, A., Zaccaria, R.: Follow-the-leader behaviour through optical flow minimization. In: Intelligent Robots and Systems, 2004 (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on. vol. 4, pp. 3182–. IEEE (2004)
9. Cosgun, A., Florencio, D.A., Christensen, H.I.: Autonomous person following for telepresence robots. In: Robotics and Automation (ICRA), 2013 IEEE International Conference on. pp. 4335–4342. IEEE (2013)
10. Couprie, C., Farabet, C., Najman, L., LeCun, Y.: Indoor semantic segmentation using depth information. In: International Conference on Learning Representations (ICLR 2013), April 2013 (2013)
11. Danelljan, M., Häger, G., Khan, F., Felsberg, M.: Accurate scale estimation for robust visual tracking. In: British Machine Vision Conference, Nottingham, September 1-5, 2014. BMVA Press (2014)
12. Doisy, G., Jevtic, A., Lucet, E., Edan, Y.: Adaptive person-following algorithm based on depth images and mapping. In: Proc. of the IROS Workshop on Robot Motion Planning (2012)
13. Eitel, A., Springenberg, J.T., Spinello, L., Riedmiller, M., Burgard, W.: Multimodal deep learning for robust RGB-D object recognition. In: Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. pp. 681–687. IEEE (2015)
14. Fan, J., Xu, W., Wu, Y., Gong, Y.: Human tracking using convolutional neural networks. IEEE Transactions on Neural Networks 21(10), 1610–1623 (2010)
15. Fuentes-Pacheco, J., Ruiz-Ascencio, J., Rendón-Mancha, J.M.: Visual simultaneous localization and mapping: a survey. Artificial Intelligence Review 43(1), 55–81 (2015)
16. Gao, C., Chen, F., Yu, J.G., Huang, R., Sang, N.: Robust visual tracking using exemplar-based detectors. IEEE Transactions on Circuits and Systems for Video Technology (2015)
17. Gao, C., Shi, H., Yu, J.G., Sang, N.: Enhancement of ELDA tracker based on CNN features and adaptive model update. Sensors 16(4), 545 (2016)
18. Grabner, H., Grabner, M., Bischof, H.: Real-time tracking via on-line boosting. In: Proceedings of the British Machine Vision Conference 2006, Edinburgh. pp. 47–56 (2006)
19. Gupta, S., Girshick, R., Arbeláez, P., Malik, J.: Learning rich features from RGB-D images for object detection and segmentation. In: European Conference on Computer Vision. pp. 345–360. Springer (2014)
20. Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M.M., Hicks, S.L., Torr, P.H.: Struck: Structured output tracking with kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(10), 2096–2109 (2016)
21. Hong, S., You, T., Kwak, S., Han, B.: Online tracking by learning discriminative saliency map with convolutional neural network. In: ICML. pp. 597–606 (2015)
22. Hua, Y., Alahari, K., Schmid, C.: Online object tracking with proposal selection. In: The IEEE International Conference on Computer Vision (ICCV) (December)
23. Kanbara, M., Okuma, T., Takemura, H., Yokoya, N.: A stereoscopic video see-through augmented reality system based on real-time vision-based registration. In: Virtual Reality, 2000. Proceedings. IEEE. pp. 255–262. IEEE (2000)
24. Kobilarov, M., Sukhatme, G., Hyams, J., Batavia, P.: People tracking and following with mobile robot using an omnidirectional camera and a laser. In: Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on. pp. 557–562. IEEE (2006)
25. Koide, K., Miura, J.: Identification of a specific person using color, height, and gait features for a person following robot. Robotics and Autonomous Systems 84, 76–87 (2016)
26. Nishimura, S., Itou, K., Kikuchi, T., Takemura, H., Mizoguchi, H.: A study of robotizing daily items for an autonomous carrying system - development of person following shopping cart robot. In: Control, Automation, Robotics and Vision, 2006. ICARCV '06. 9th International Conference on. pp. 1–6. IEEE (2006)
27. O'Dwyer, A.: Handbook of PI and PID controller tuning rules. World Scientific (2009)
28. Oron, S., Bar-Hillel, A., Levi, D., Avidan, S.: Locally orderless tracking. International Journal of Computer Vision 111(2), 213–228 (2015)
29. Sardari, F., Moghaddam, M.E.: A hybrid occlusion free object tracking method using particle filter and modified galaxy based search meta-heuristic algorithm. Applied Soft Computing 50, 280–299 (2017)
30. Satake, J., Chiba, M., Miura, J.: A SIFT-based person identification using a distance-dependent appearance model for a person following robot. In: Robotics and Biomimetics (ROBIO), 2012 IEEE International Conference on. pp. 962–967. IEEE (2012)
31. Schlegel, C., Jaberg, H., Schuster, M.: Vision based person tracking with a mobile robot. In: Proc. British Machine Vision Conf. Citeseer (1998)
32. Song, S., Xiao, J.: Tracking revisited using RGBD camera: Unified benchmark and baselines. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 233–240 (2013)
33. Takemura, H., Ito, K., Mizoguchi, H.: Person following mobile robot under varying illumination based on distance and color information. In: Robotics and Biomimetics, 2007. ROBIO 2007. IEEE International Conference on. pp. 1500–1505. IEEE (2007)
34. Tarokh, M., Ferrari, P.: Case study: Robotic person following using fuzzy control and image segmentation. Journal of Field Robotics 20(9), 557–568 (2003)
35. Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(9), 1834–1848 (2015)
36. Yamane, T., Shirai, Y., Miura, J.: Person tracking by integrating optical flow and uniform brightness regions. In: Robotics and Automation, 1998. Proceedings. 1998 IEEE International Conference on. vol. 4, pp. 3267–3272. IEEE (1998)
37. Yoshimi, T., Nishiyama, M., Sonoura, T., Nakamoto, H., Tokura, S., Sato, H., Ozaki, F., Matsuhira, N., Mizoguchi, H.: Development of a person following robot with vision based target detection. In: Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on. pp. 5286–5291. IEEE (2006)
38. Zhai, M., Roshtkhari, M.J., Mori, G.: Deep learning of appearance models for online object tracking. arXiv preprint arXiv:1607.02568 (2016)
39. Zhang, K., Zhang, L., Yang, M.H.: Real-time object tracking via online discriminative feature selection. IEEE Transactions on Image Processing 22(12), 4664–4677 (2013)
40. Zhang, L., Suganthan, P.N.: Visual tracking with convolutional neural network. In: Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on. pp. 2072–2077. IEEE (2015)
41. Zhang, L., van der Maaten, L.: Structure preserving object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1838–1845 (2013)