1. Basic Information
Title: Integrating Stereo Vision with a CNN Tracker for a Person-Following Robot
Year: 2017
Venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Keywords: CNN tracker, person-following robot, tracking, stereo vision
Links:
homepage: None
arXiv (paper): None
github: None
2. Background
Person-following robots have many applications, such as an autonomous cart in a grocery store [26], a personal guide in a hospital, or an autonomous suitcase in an airport [1]. A person-following robot in a dynamic environment must solve the tracking problem under a variety of challenging conditions: appearance change, illumination change, occlusion, and pose changes such as crouching or changing clothes. The paper tracks a given target under these conditions with a convolutional neural network (CNN) trained online. The tracked target may move around a corner and disappear from the robot's field of view; this is handled by estimating the target's most recent positions and having the robot replicate the target's local path while the target is not visible in the current frame. The robot used is a Pioneer 3AT equipped with a stereo camera, and the method was tested with two stereo cameras: a Point Grey Bumblebee2 and a ZED stereo camera.
3. Contributions
3.1 Overview
(1) A person-following robot application that tracks with a CNN trained online in real time (about 20 fps) using both RGB images and stereo depth images.
(2) A following behaviour in which the robot estimates and replicates the target's local path, so that the person can still be followed even while temporarily outside the camera's field of view.
(3) A novel stereo dataset for the person-following task.
3.2 Method Details
This section describes the proposed CNN models and the learning procedure. The CNN input consists of the RGB channels plus a depth channel computed from the stereo images, referred to as RGBSD (RGB + Stereo Depth). The stereo depth (SD) is computed with the ZED SDK. The CNN tracker outputs the target's depth and centroid, which the robot's navigation module uses to follow the target and, when needed, to replicate its path.
3.2.1 CNN Models with RGBSD Images
- The first model (CNN v1) feeds the RGBSD layers to the ConvNet as a single image. Like a conventional CNN architecture, the network contains convolutional layers, fully connected layers, and an output layer (see Fig. 1).
- The second model (CNN v2) uses two convolutional streams: the RGB channels are the input to one stream, and the stereo depth image is the input to the other (see Fig. 1). The fully connected layers take as input the concatenation of the flattened outputs of the two streams.
- The third ConvNet (CNN v3) is a CNN on regular RGB images only; its structure is similar to that of the first model.

The initialization and update of the CNN tracker are described next.
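As a rough illustration of the two-stream idea in CNN v2, the sketch below runs one toy convolutional layer per stream in plain NumPy and concatenates the flattened activations before a fully connected softmax layer. The layer counts, filter sizes, and weights here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2D convolution: x (C,H,W), w (F,C,k,k) -> (F,H-k+1,W-k+1)."""
    C, H, W = x.shape
    F, _, k, _ = w.shape
    out = np.zeros((F, H - k + 1, W - k + 1))
    for f in range(F):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[f, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[f])
    return out

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cnn_v2_forward(rgb, sd, w_rgb, w_sd, w_fc):
    """Two-stream forward pass: one conv stream for the RGB channels, one for
    the stereo-depth channel; the flattened activations are concatenated
    before the fully connected (softmax) layer."""
    feat_rgb = relu(conv2d(rgb, w_rgb)).ravel()
    feat_sd = relu(conv2d(sd, w_sd)).ravel()
    joint = np.concatenate([feat_rgb, feat_sd])
    return softmax(w_fc @ joint)  # 2-way output: target vs. background

rng = np.random.default_rng(0)
rgb = rng.random((3, 8, 8))                       # RGB patch
sd = rng.random((1, 8, 8))                        # stereo-depth patch
w_rgb = rng.standard_normal((4, 3, 3, 3)) * 0.1   # toy filters, RGB stream
w_sd = rng.standard_normal((4, 1, 3, 3)) * 0.1    # toy filters, SD stream
w_fc = rng.standard_normal((2, 2 * 4 * 6 * 6)) * 0.1
probs = cnn_v2_forward(rgb, sd, w_rgb, w_sd, w_fc)
```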
Initial training set selection: To track a person with a CNN model, the CNN classifier must first be initialized. Initialization starts from scratch with random weights.
(1) A predefined rectangular bounding box is placed at the centre of the first frame. To activate the following behaviour, a person must stand inside the bounding box at a certain distance from the robot; alternatively, the target to follow can be selected manually. (2) Once the CNN is activated, the patch inside the bounding box is labelled class 1, and the patches around the bounding box are labelled class 0.
(3) Since the two classes are highly imbalanced, n patches are selected uniformly from class 0 and the class-1 patch is replicated n times to form the training set (n = 40 in the experiments). This initial training set is used to train the CNN classifier until it reaches high accuracy on the training set, which can strongly overfit the classifier. To cope with this overfitting, it is assumed that the target's pose and appearance do not change significantly during the first 50 frames (about 2-3 seconds).
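The class-balancing step above can be sketched as follows; patches are stand-in objects and the function name is mine, not the paper's:

```python
import random

def build_initial_training_set(target_patch, background_patches, n=40, seed=0):
    """Balance the two classes: uniformly sample n class-0 (background)
    patches and replicate the single class-1 (target) patch n times."""
    rng = random.Random(seed)
    negatives = rng.sample(background_patches, n)  # uniform, without replacement
    positives = [target_patch] * n                 # replicate the one target patch
    return positives + negatives, [1] * n + [0] * n

background = [f"bg_{i}" for i in range(200)]
patches, labels = build_initial_training_set("target_patch", background)
```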
Test set selection: Once the CNN classifier has been initialized or updated, it is used to detect the target in the next frame. When a new frame and its stereo depth layer become available, test patches are searched within a local image region, as shown in Fig. 2(a). The search space is also constrained in depth, as shown in Fig. 2(b): patches whose depth is not within ±α of the previous depth are discarded (Fig. 2(c)), where α is the extent of the search region along the depth axis (α = 0.25 m was used). This filters out most of the background patches before they are passed to the CNN classifier. Only the highest class-1 response is taken as the target in the current frame. If no target is detected for 0.5 s (i.e., the highest class-1 response stays below 0.5), the tracker enters a target-lost mode and the whole image is scanned to create the test set.
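The depth gating can be sketched as a simple filter over candidate patches; the dictionary layout of a patch here is an assumption for illustration:

```python
def select_test_patches(candidates, prev_depth, alpha=0.25):
    """Keep only patches whose stereo depth lies within +/- alpha metres of
    the target's depth in the previous frame; the rest (mostly background)
    are filtered out before being fed to the CNN classifier."""
    return [p for p in candidates
            if p["depth"] is not None and abs(p["depth"] - prev_depth) <= alpha]

candidates = [
    {"box": (10, 10), "depth": 2.10},
    {"box": (40, 10), "depth": 2.55},   # too far behind the previous depth
    {"box": (70, 10), "depth": None},   # no valid stereo depth
    {"box": (10, 40), "depth": 1.95},
]
kept = select_test_patches(candidates, prev_depth=2.0)
```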
Classifier update: To update the classifier, a new training set must be selected. The update step is performed only when the detection step finds the target (class 1) in the test set. For robustness, the most recent 50 class-1 patches from previous frames are kept in a class-1 patch pool implemented as a first-in-first-out queue; the patches around the target form the class-0 patch pool. For the new training set, n patches are again selected uniformly from the class-0 pool. To select n patches from the class-1 pool, patches are sampled according to a Poisson distribution with λ = 1.0 and k = ⌊queue index / 10⌋ (see Eq. 1 and Fig. 3), which gives patches from the recent history a higher probability of being selected than older patches. This training set is used to update the classifier. The Poisson-based sampling of class-1 patches avoids overfitting and provides a chance to recover from false detections in previous frames.
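Under the reading above (queue index 0 = most recent patch, k = ⌊index / 10⌋, λ = 1.0; the indexing direction is my assumption), the Poisson-weighted sampling could look like this sketch:

```python
import math
import random

def poisson_pmf(k, lam=1.0):
    """P(K = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def sample_class1_patches(pool, n=40, lam=1.0, seed=0):
    """Sample n patches (with replacement) from the FIFO class-1 pool,
    assumed ordered newest first. A patch at queue index i gets weight
    Poisson(k; lam) with k = i // 10, so recent patches are drawn with
    higher probability than older ones."""
    weights = [poisson_pmf(i // 10, lam) for i in range(len(pool))]
    rng = random.Random(seed)
    return rng.choices(pool, weights=weights, k=n)

pool = list(range(50))   # the 50 most recent class-1 patches, newest first
sampled = sample_class1_patches(pool)
```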
3.2.2 Navigation of the Robot
There are two cases:
(i) the robot can see the target (person) in the image, in which case a proportional-integral-derivative (PID) controller is used;
(ii) the robot cannot see the target, in which case the target's path is replicated.
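Case (i) can be sketched with a textbook PID controller; the gains, the error definitions, and the split into a linear and an angular controller are illustrative assumptions, not values from the paper:

```python
class PID:
    """Textbook PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None \
            else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Keep the target centred in the image and at a fixed following distance.
angular = PID(kp=1.2, ki=0.0, kd=0.1)   # error: centroid offset from centre
linear = PID(kp=0.8, ki=0.05, kd=0.0)   # error: depth minus desired distance
v = linear.step(2.6 - 1.5, dt=0.05)     # target at 2.6 m, want 1.5 m
w = angular.step(-0.12, dt=0.05)        # centroid slightly left of centre
```

With the target too far away and left of centre, the sketch commands a positive forward velocity and a negative (leftward) turn rate.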
Localization: Localizing the robot requires estimating the robot's pose in a global coordinate frame; in the 2D case this is the robot's x, y position and orientation θ. The robot must maintain an estimate of its own pose even when it encounters dynamic obstacles.
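Maintaining the (x, y, θ) estimate can be sketched as simple dead reckoning for a differential-drive robot; this Euler integration is an illustration, not the paper's actual localization method:

```python
import math

def update_pose(x, y, theta, v, w, dt):
    """Dead-reckoning pose update in the global frame: integrate linear
    velocity v and angular velocity w over a time step dt, wrapping the
    heading back into (-pi, pi]."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta = (theta + w * dt + math.pi) % (2 * math.pi) - math.pi
    return x, y, theta

# Drive straight along +x for one second at 0.5 m/s.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = update_pose(*pose, v=0.5, w=0.0, dt=0.1)
```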
4. Experimental Results
5. Conclusions and Thoughts
5.1 The Authors' Conclusions
5.2 Highlights of the work and possible improvements: the robot can only follow the target; it cannot avoid obstacles.
References
1. Alves-Oliveira, P., Paiva, A.: A study on trust in a robotic suitcase. In: Social Robotics: 8th International Conference, ICSR 2016, Kansas City, MO, USA, November 1-3, 2016, Proceedings. vol. 9979, p. 179. Springer (2016)
2. Awai, M., Shimizu, T., Kaneko, T., Yamashita, A., Asama, H.: HOG-based person following and autonomous returning using generated map by mobile robot equipped with camera and laser range finder. In: Intelligent Autonomous Systems 12, pp. 51–60. Springer (2013)
3. Borenstein, J., Feng, L.: UMBmark: A benchmark test for measuring odometry errors in mobile robots. In: Photonics East '95. pp. 113–124. International Society for Optics and Photonics (1995)
4. Calisi, D., Iocchi, L., Leone, R.: Person following through appearance models and stereo vision using a mobile robot. In: VISAPP (Workshop on Robot Vision). pp. 46–56 (2007)
5. Camplani, M., Hannuna, S.L., Mirmehdi, M., Damen, D., Paiement, A., Tao, L., Burghardt, T.: Real-time RGB-D tracking with depth scaling kernelised correlation filters and occlusion handling. In: British Machine Vision Conference, Swansea, UK, September 7-10, 2015. BMVA Press (2015)
6. Chen, B.X., Sahdev, R., Tsotsos, J.K.: Person following robot using selected online Ada-boosting with stereo camera. In: Computer and Robot Vision (CRV), 2017 14th Conference on. pp. 48–55. IEEE (2017)
7. Chen, Z., Birchfield, S.T.: Person following with a mobile robot using binocular feature-based tracking. In: Intelligent Robots and Systems, 2007. IROS 2007. IEEE/RSJ International Conference on. pp. 815–820. IEEE (2007)
8. Chivilò, G., Mezzaro, F., Sgorbissa, A., Zaccaria, R.: Follow-the-leader behaviour through optical flow minimization. In: Intelligent Robots and Systems, 2004 (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on. vol. 4, pp. 3182–. IEEE (2004)
9. Cosgun, A., Florencio, D.A., Christensen, H.I.: Autonomous person following for telepresence robots. In: Robotics and Automation (ICRA), 2013 IEEE International Conference on. pp. 4335–4342. IEEE (2013)
10. Couprie, C., Farabet, C., Najman, L., LeCun, Y.: Indoor semantic segmentation using depth information. In: International Conference on Learning Representations (ICLR 2013), April 2013 (2013)
11. Danelljan, M., Häger, G., Khan, F., Felsberg, M.: Accurate scale estimation for robust visual tracking. In: British Machine Vision Conference, Nottingham, September 1-5, 2014. BMVA Press (2014)
12. Doisy, G., Jevtic, A., Lucet, E., Edan, Y.: Adaptive person-following algorithm based on depth images and mapping. In: Proc. of the IROS Workshop on Robot Motion Planning (2012)
13. Eitel, A., Springenberg, J.T., Spinello, L., Riedmiller, M., Burgard, W.: Multimodal deep learning for robust RGB-D object recognition. In: Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. pp. 681–687. IEEE (2015)
14. Fan, J., Xu, W., Wu, Y., Gong, Y.: Human tracking using convolutional neural networks. IEEE Transactions on Neural Networks 21(10), 1610–1623 (2010)
15. Fuentes-Pacheco, J., Ruiz-Ascencio, J., Rendón-Mancha, J.M.: Visual simultaneous localization and mapping: a survey. Artificial Intelligence Review 43(1), 55–81 (2015)
16. Gao, C., Chen, F., Yu, J.G., Huang, R., Sang, N.: Robust visual tracking using exemplar-based detectors. IEEE Transactions on Circuits and Systems for Video Technology (2015)
17. Gao, C., Shi, H., Yu, J.G., Sang, N.: Enhancement of ELDA tracker based on CNN features and adaptive model update. Sensors 16(4), 545 (2016)
18. Grabner, H., Grabner, M., Bischof, H.: Real-time tracking via on-line boosting. In: Proceedings of the British Machine Vision Conference 2006, Edinburgh. pp. 47–56 (2006)
19. Gupta, S., Girshick, R., Arbeláez, P., Malik, J.: Learning rich features from RGB-D images for object detection and segmentation. In: European Conference on Computer Vision. pp. 345–360. Springer (2014)
20. Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M.M., Hicks, S.L., Torr, P.H.: Struck: Structured output tracking with kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(10), 2096–2109 (2016)
21. Hong, S., You, T., Kwak, S., Han, B.: Online tracking by learning discriminative saliency map with convolutional neural network. In: ICML. pp. 597–606 (2015)
22. Hua, Y., Alahari, K., Schmid, C.: Online object tracking with proposal selection. In: The IEEE International Conference on Computer Vision (ICCV) (December)
23. Kanbara, M., Okuma, T., Takemura, H., Yokoya, N.: A stereoscopic video see-through augmented reality system based on real-time vision-based registration. In: Virtual Reality, 2000. Proceedings. IEEE. pp. 255–262. IEEE (2000)
24. Kobilarov, M., Sukhatme, G., Hyams, J., Batavia, P.: People tracking and following with mobile robot using an omnidirectional camera and a laser. In: Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on. pp. 557–562. IEEE (2006)
25. Koide, K., Miura, J.: Identification of a specific person using color, height, and gait features for a person following robot. Robotics and Autonomous Systems 84, 76–87 (2016)
26. Nishimura, S., Itou, K., Kikuchi, T., Takemura, H., Mizoguchi, H.: A study of robotizing daily items for an autonomous carrying system - development of person following shopping cart robot. In: Control, Automation, Robotics and Vision, 2006. ICARCV '06. 9th International Conference on. pp. 1–6. IEEE (2006)
27. O'Dwyer, A.: Handbook of PI and PID controller tuning rules. World Scientific (2009)
28. Oron, S., Bar-Hillel, A., Levi, D., Avidan, S.: Locally orderless tracking. International Journal of Computer Vision 111(2), 213–228 (2015)
29. Sardari, F., Moghaddam, M.E.: A hybrid occlusion free object tracking method using particle filter and modified galaxy based search meta-heuristic algorithm. Applied Soft Computing 50, 280–299 (2017)
30. Satake, J., Chiba, M., Miura, J.: A SIFT-based person identification using a distance-dependent appearance model for a person following robot. In: Robotics and Biomimetics (ROBIO), 2012 IEEE International Conference on. pp. 962–967. IEEE (2012)
31. Schlegel, C., Jaberg, H., Schuster, M.: Vision based person tracking with a mobile robot. In: Proc. British Machine Vision Conf. Citeseer (1998)
32. Song, S., Xiao, J.: Tracking revisited using RGBD camera: Unified benchmark and baselines. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 233–240 (2013)
33. Takemura, H., Ito, K., Mizoguchi, H.: Person following mobile robot under varying illumination based on distance and color information. In: Robotics and Biomimetics, 2007. ROBIO 2007. IEEE International Conference on. pp. 1500–1505. IEEE (2007)
34. Tarokh, M., Ferrari, P.: Case study: Robotic person following using fuzzy control and image segmentation. Journal of Field Robotics 20(9), 557–568 (2003)
35. Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(9), 1834–1848 (2015)
36. Yamane, T., Shirai, Y., Miura, J.: Person tracking by integrating optical flow and uniform brightness regions. In: Robotics and Automation, 1998. Proceedings. 1998 IEEE International Conference on. vol. 4, pp. 3267–3272. IEEE (1998)
37. Yoshimi, T., Nishiyama, M., Sonoura, T., Nakamoto, H., Tokura, S., Sato, H., Ozaki, F., Matsuhira, N., Mizoguchi, H.: Development of a person following robot with vision based target detection. In: Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on. pp. 5286–5291. IEEE (2006)
38. Zhai, M., Roshtkhari, M.J., Mori, G.: Deep learning of appearance models for online object tracking. arXiv preprint arXiv:1607.02568 (2016)
39. Zhang, K., Zhang, L., Yang, M.H.: Real-time object tracking via online discriminative feature selection. IEEE Transactions on Image Processing 22(12), 4664–4677 (2013)
40. Zhang, L., Suganthan, P.N.: Visual tracking with convolutional neural network. In: Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on. pp. 2072–2077. IEEE (2015)
41. Zhang, L., van der Maaten, L.: Structure preserving object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1838–1845 (2013)