Figure 1. Various graphical models for visual tracking. Each node corresponds to a frame, and its color indicates the type of target variation in that frame. The temporal order of the frames is given by the alphabetical order of the nodes. (a) is the conventional chain model, while (b), (c), and (d) represent the proposed graphical models.

Although the visual tracking problem has been studied extensively for decades, the underlying graphical model of most existing probabilistic tracking algorithms is limited to a linear structure, i.e., a first-order Markov chain (Figure 1a). In this model, the video is represented by a chain of temporally ordered frames, under the assumption that the target varies smoothly between two consecutive frames. When this assumption does not hold due to the challenges of real-world videos, e.g., fast motion or shot changes, tracking may fail on a few challenging frames, and the failures are then propagated to the remaining frames.
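
As a rough illustration of the chain model, the sketch below performs one predict/update step of a first-order Markov chain tracker on a small discrete state space. It is not taken from any of the tracking algorithms described here; the function name `chain_step`, the three-state grid, and the transition matrix `SMOOTH_MOTION` are all hypothetical choices made for the example.

```python
import numpy as np


def chain_step(prev_posterior, transition, likelihood):
    """One predict/update step of a first-order Markov chain tracker.

    prev_posterior: shape (S,), posterior over S discrete target states at frame t-1
    transition:     shape (S, S), transition[i, j] = P(x_t = j | x_{t-1} = i)
    likelihood:     shape (S,), observation likelihood P(y_t | x_t = j) at frame t
    """
    prev_posterior = np.asarray(prev_posterior, dtype=float)
    transition = np.asarray(transition, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)

    # Predict: push the previous posterior through the motion model.
    predicted = prev_posterior @ transition
    # Update: reweight the prediction by the current observation.
    unnormalized = likelihood * predicted
    total = unnormalized.sum()
    if total <= 0.0:
        # The observation is impossible under the motion model.
        raise ValueError("observation has zero probability under the motion model")
    return unnormalized / total


# Hypothetical 3-state example: the motion model only allows moves
# to the same state or an adjacent one (smooth variation).
SMOOTH_MOTION = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])
start = np.array([1.0, 0.0, 0.0])  # target known to be in state 0

# Smooth case: the observation favors the neighboring state 1.
smooth = chain_step(start, SMOOTH_MOTION, [0.1, 0.8, 0.1])
```

In this toy setting, an observation that places the target in state 2 (a jump of two states from state 0) receives zero predicted mass, so `chain_step` raises an error. This is a simplified analogue of the failure described above: once the smoothness assumption is violated, the chain has no other frame to recover from.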

To resolve these issues and achieve persistent tracking in real-world videos, we believe that a more general representation of the video is necessary, and we have developed several graphical models that go beyond the chain model, as follows.

  • Bayesian model averaging (Figure 1b): As the first step, we proposed an offline tracking method that tracks easy-to-track frames first and challenging frames later. In this model, the target posterior of a new frame is estimated by propagating posteriors from all tracked frames and marginalizing over them by Bayesian model averaging.

  • Tree-structured graphical model (Figure 1c): The above model exploits density propagation from all tracked frames, which may include noisy propagation from irrelevant frames. To address this issue, we proposed another tracking algorithm based on a tree-structured representation of a video. In this model, the posterior is propagated only between relevant frames, by organizing frames with similar target appearance into the same branch.

  • Weighted model averaging (Figure 1d): To combine the benefits of both model averaging and the tree-structured graphical model, we proposed a model based on a general graph. In this model, the target posterior is propagated from multiple previous frames, where each propagation is weighted by its tracking plausibility. The algorithm runs online by progressively constructing the graph during tracking.
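
The idea shared by the models above, estimating the posterior of a new frame from several tracked frames rather than from a single predecessor, can be sketched on a discrete state space as follows. This is a deliberately simplified illustration and not the proposed algorithms themselves: the function `averaged_posterior`, the fixed weights, and the shared transition matrix are assumptions made for the example, and the weights stand in for whatever tracking plausibility measure is actually used.

```python
import numpy as np


def averaged_posterior(source_posteriors, transitions, likelihood, weights=None):
    """Estimate a new frame's posterior from several already-tracked frames.

    source_posteriors: list of (S,) posteriors, one per tracked source frame
    transitions:       list of (S, S) matrices, one per source frame
    likelihood:        (S,) observation likelihood in the new frame
    weights:           optional per-source weights (hypothetical plausibility
                       scores); uniform averaging is used when omitted
    """
    likelihood = np.asarray(likelihood, dtype=float)
    n = len(source_posteriors)
    if weights is None:
        weights = np.full(n, 1.0 / n)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    # Mix the densities propagated from every source frame.
    mixture = np.zeros_like(likelihood)
    for w, post, trans in zip(weights, source_posteriors, transitions):
        mixture += w * (np.asarray(post, dtype=float) @ np.asarray(trans, dtype=float))

    # Reweight the mixture by the observation in the new frame and normalize.
    unnormalized = likelihood * mixture
    return unnormalized / unnormalized.sum()


# Hypothetical example: two tracked frames disagree about the target state.
T = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])
frame_a = np.array([1.0, 0.0, 0.0])  # less relevant frame, target in state 0
frame_b = np.array([0.0, 0.0, 1.0])  # relevant frame, target in state 2
obs = np.array([0.1, 0.1, 0.8])      # new frame observation favors state 2

uniform = averaged_posterior([frame_a, frame_b], [T, T], obs)
weighted = averaged_posterior([frame_a, frame_b], [T, T], obs, weights=[0.1, 0.9])
```

With uniform weights the sketch corresponds to plain averaging over all source frames. Non-uniform weights down-weight the less relevant frame, and a one-hot weight vector reduces to propagation along a single edge, as in a chain or a tree branch.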



Datasets and Results