Figure 1. The overall framework of the proposed tracking algorithm.

Abstract

We propose a simple but effective tracking-by-segmentation algorithm using an Absorbing Markov Chain (AMC) on superpixel segmentation, where the target state is estimated by a combination of bottom-up and top-down approaches, and the target segmentation is propagated to subsequent frames in a recursive manner. Our algorithm constructs a graph for the AMC using the superpixels identified in two consecutive frames, where background superpixels in the previous frame correspond to absorbing vertices while all other superpixels create transient ones. The weight of each edge depends on the similarity of the scores of its two end superpixels, which are learned by support vector regression. Once graph construction is completed, the target segmentation is estimated using the absorption time of each superpixel. The proposed tracking algorithm achieves substantially improved performance compared to state-of-the-art segmentation-based tracking techniques on multiple challenging datasets.
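The role of the absorption time can be illustrated with a minimal sketch. It is not the released AMCT code: the function name `absorption_times`, the exponential edge weight with bandwidth `sigma`, the toy scores, and the mean-based threshold are all assumptions made for illustration. It relies only on the standard AMC result that, for the transient-to-transient transition block Q, the expected number of steps before absorption is (I - Q)^-1 1.

```python
import numpy as np


def absorption_times(weights, absorbing):
    """Expected number of steps before absorption for each transient vertex.

    weights   : (n, n) non-negative edge weights, 0 where there is no edge
    absorbing : (n,) boolean mask, True for absorbing vertices
    """
    weights = np.asarray(weights, dtype=float)
    transient = ~np.asarray(absorbing, dtype=bool)
    # Row-normalise the transient rows to obtain transition probabilities.
    rows = weights[transient]
    P = rows / rows.sum(axis=1, keepdims=True)
    Q = P[:, transient]  # transient -> transient block
    # Solve (I - Q) t = 1 instead of forming the inverse explicitly.
    n = Q.shape[0]
    return np.linalg.solve(np.eye(n) - Q, np.ones(n))


# Toy example with hypothetical numbers: five superpixels on a path graph.
# Vertex 0 plays the role of a background superpixel from the previous frame.
scores = np.array([0.0, 0.1, 0.2, 0.9, 1.0])  # stand-in for regressed scores
absorbing = np.array([True, False, False, False, False])
adjacency = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)

sigma = 0.5  # assumed bandwidth
# Assumed weight: similar scores give a strong edge, dissimilar scores a weak one.
weights = adjacency * np.exp(-np.abs(scores[:, None] - scores[None, :]) / sigma)

times = absorption_times(weights, absorbing)
# Illustrative rule only: superpixels that take long to be absorbed are
# treated as target candidates.
foreground = times > times.mean()
print(times, foreground)
```

A random walk that starts in a superpixel resembling the background reaches an absorbing vertex quickly, while one that starts inside the target tends to linger among target superpixels, so a long absorption time suggests foreground.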

Performance

Results on tracking-by-segmentation
Tracking-by-segmentation aims to obtain segmentation masks given an initial bounding box annotation in the first frame. AMCT and AMCT+CNN are our full algorithms, and AMCT-NR and AMCT-NA are variants of them.





Results on online semi-supervised video segmentation
Online semi-supervised video segmentation aims to propagate segmentation masks given an initial segmentation annotation in the first frame. This task differs from our main problem, tracking-by-segmentation, which starts from a bounding box annotation in the first frame.



Qualitative Results

Result File

Segmentation results on five datasets: the non-rigid object tracking dataset (NR) [34], the generalized background subtraction dataset (GBS) [22, 25, 24], the video saliency dataset (VS) [11], the SegTrack v2 dataset (ST2) [23], and the DAVIS dataset [30].

Code

Code for AMCT is now available. AMCT+CNN will be released soon.

Paper Link

References

[9] M. Danelljan, G. Häger, F. Khan, and M. Felsberg. Accurate scale estimation for robust visual tracking. In BMVC, 2014.
[10] S. Duffner and C. Garcia. Pixeltrack: a fast adaptive algorithm for tracking non-rigid objects. In ICCV, pages 2480–2487, 2013.
[11] K. Fukuchi, K. Miyazato, A. Kimura, S. Takagi, and J. Yamato. Saliency-based video segmentation with graph cuts and sequentially updated priors. In ICME, pages 638–641, 2009.
[13] M. Godec, P. M. Roth, and H. Bischof. Hough-based tracking of non-rigid objects. In ICCV, pages 81–88, 2011.
[17] Z. Hong, Z. Chen, C. Wang, X. Mei, D. Prokhorov, and D. Tao. Multi-store tracker (muster): a cognitive psychology inspired approach to object tracking. In CVPR, pages 749–758, 2015.
[22] S. Kwak, T. Lim, W. Nam, B. Han, and J. H. Han. Generalized background subtraction based on hybrid inference by belief propagation and bayesian filtering. In ICCV, pages 2174–2181, 2011.
[23] F. Li, T. Kim, A. Humayun, D. Tsai, and J. M. Rehg. Video segmentation by tracking many figure-ground segments. In ICCV, pages 2192–2199, 2013.
[24] J. Lim and B. Han. Generalized background subtraction using superpixels with label integrated motion estimation. In ECCV, pages 173–187, 2014.
[25] T. Lim, S. Hong, B. Han, and J. H. Han. Joint segmentation and pose tracking of human in natural videos. In ICCV, pages 833–840, 2013.
[30] F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In CVPR, pages 724–732, 2016.
[34] J. Son, I. Jung, K. Park, and B. Han. Tracking-by-segmentation using online gradient boosting decision tree. In ICCV, pages 3056–3064, 2015.
[35] Y.-H. Tsai, M.-H. Yang, and M. J. Black. Video segmentation via object flow. In CVPR, pages 3899–3908, 2016.
[36] S. Wang, H. Lu, F. Yang, and M.-H. Yang. Superpixel tracking. In ICCV, pages 1323–1330, 2011.
[38] L. Wen, D. Du, Z. Lei, S. Z. Li, and M.-H. Yang. Jots: Joint online tracking and segmentation. In CVPR, pages 2226–2234, 2015.
[42] J. Zhang, S. Ma, and S. Sclaroff. Meem: Robust tracking via multiple experts using entropy minimization. In ECCV, pages 188–203, 2014.