Soumitra Samanta

HUMAN POSE TRACKING

Human pose tracking is an important problem in computer vision because of its applications in human action recognition and video surveillance. The visual appearance of any human action is a sequence of human poses. We hypothesize that tracking these poses allows human actions to be classified accurately. In this project, we present a human pose tracking method with a new part descriptor. We formulate human pose tracking as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve it efficiently in a dynamic programming framework. The model tracks human pose by combining single-image human pose estimation with traditional object tracking in video. Our pose tracking objective function consists of three terms: the appearance likelihood of a part within a frame, the temporal displacement of the part from the previous frame to the current frame, and the spatial dependency of a part on its parent in the graph structure. Experimental evaluation on benchmark datasets (VideoPose2, Poses in the Wild, and Outdoor Poses), as well as on our newly built ICDPose dataset, shows the usefulness of the proposed method.
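To make the dynamic programming idea concrete, the following is a minimal sketch for a single body part, not the method from the paper. It keeps only two of the three terms (appearance and temporal displacement) and omits the spatial parent term for brevity. The function name `track_part`, the squared-distance motion penalty, and the `motion_weight` parameter are all illustrative assumptions.

```python
def track_part(candidates, appearance_cost, motion_weight=1.0):
    """Viterbi-style dynamic programming over frames for one part.

    candidates[t]         -- list of (x, y) candidate locations in frame t
    appearance_cost[t][i] -- cost of candidate i in frame t (lower is better)
    motion_weight         -- assumed weight on squared temporal displacement

    Returns (total_cost, path), where path[t] is the index of the
    candidate chosen in frame t.
    """
    n_frames = len(candidates)
    # best[t][i]: minimum cost of any track that ends at candidate i in frame t
    best = [list(appearance_cost[0])]
    # back[t][i]: index of the predecessor candidate in frame t-1
    back = [[-1] * len(candidates[0])]

    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for i, (x, y) in enumerate(candidates[t]):
            # Cost of arriving from each candidate of the previous frame:
            # accumulated cost plus the temporal displacement penalty.
            options = [
                best[t - 1][j] + motion_weight * ((x - px) ** 2 + (y - py) ** 2)
                for j, (px, py) in enumerate(candidates[t - 1])
            ]
            j_best = min(range(len(options)), key=options.__getitem__)
            row_cost.append(appearance_cost[t][i] + options[j_best])
            row_back.append(j_best)
        best.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final candidate to recover the track.
    last = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][last]
    path = [last]
    for t in range(n_frames - 1, 0, -1):
        last = back[t][last]
        path.append(last)
    path.reverse()
    return total, path
```

With K candidates per frame and T frames, this sketch costs O(T K^2) instead of enumerating all K^T tracks. A full spatio-temporal pictorial structure would also couple each part to its parent within a frame, which this simplified version leaves out.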

PUBLICATION

Soumitra Samanta and Bhabatosh Chanda, A Data-driven Approach for Human Pose Tracking Based on Spatio-temporal Pictorial Structure, [arXiv] [dataset]

COMPARATIVE RESULTS

Visual comparison of different methods on the VideoPose2 dataset. The four columns show four consecutive frames of a video clip, and each row corresponds to a different method. The first row shows the ground-truth part annotation. Results from Lara et al. [37], Zhang et al. [48], Sapp et al. [35], Park et al. [25], Cherian et al. [6], and the proposed method are shown in the 2nd, 3rd, 4th, 5th, 6th, and 7th rows, respectively.

Visual comparison of different methods on the Poses in the Wild dataset. The four columns show four consecutive frames of a video clip, and each row corresponds to a different method. The first row shows the ground-truth part annotation. Results from Lara et al. [37], Zhang et al. [48], Sapp et al. [35], Park et al. [25], Cherian et al. [6], and the proposed method are shown in the 2nd, 3rd, 4th, 5th, 6th, and 7th rows, respectively.

Visual comparison of different methods on the ICDPose dataset. The four columns show four consecutive frames of a video clip, and each row corresponds to a different method. The first row shows the ground-truth part annotation. Results from Lara et al. [37], Zhang et al. [48], Sapp et al. [35], Park et al. [25], Cherian et al. [6], and the proposed method are shown in the 2nd, 3rd, 4th, 5th, 6th, and 7th rows, respectively.
