3D Reconstruction of Human Pose in Multi-View Dynamic Scenes

Ershadi Nasab, Sara | 2018

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 50830 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Sanaei, Esmaeil; Kasaei, Shohreh
  7. Abstract:
  8. This thesis addresses 3D pose reconstruction of one or more humans in multi-view dynamic scenes. The inputs are frames from a multi-view camera system, and the outputs are the reconstructed human poses in 3D space, where a pose is the set of 3D locations of 14 body joints. Kinect sensor data, markers, and GPS sensors are not used; only multi-view images are assumed to be available. 3D reconstruction of human body pose can be carried out under different assumptions: for example, the scene may be indoor only, the camera calibration information may be provided in advance, and the cameras may be moving or fixed. Each of these assumptions is considered, and a method is proposed for each case. The thesis also considers 3D pose reconstruction of multiple humans from a single-view frame alone (without any relation to the corresponding frames of other cameras) and without the camera calibration matrices; this method also computes the relative rotation and translation matrices between cameras. The proposed methods are evaluated with the percentage of correct parts (PCP), the percentage of correct keypoints (PCK) in 2D or 3D, the mean per joint position error (MPJPE), and a measure of pixel segmentation correctness. Previous methods for estimating 3D human pose faced many challenges in learning the human body shape from images. They can reconstruct the 3D pose of only one human in the scene and cannot estimate the 3D poses of the other people. Many of them are restricted to video input and cannot estimate the 3D human pose from a single video frame. They also have limited accuracy in terms of 3D PCP, 3D PCK, and MPJPE. In this thesis, the KTH Football II 2D, KTH Football II, HumanEva I, Campus, Shelf, MPII Cooking, and UMPM datasets are used to evaluate the proposed 3D pose estimation method. 
Camera calibration information is provided by these datasets. The proposed method learns the human body shape more efficiently than previous methods and searches for the body joints in a discrete space of points. An efficient method is proposed that uses single-view images to estimate the appearance of the player and then improves this estimate step by step. Finally, the input image is segmented into human body parts and background. The articulated human body pose is modeled with tree, loopy, and fully connected models. Using a conditional random field in 2D and 3D space together with a skeleton model of the human body, inference is performed either exactly or approximately. In the fully connected model on image pixels, the nodes are image pixels and every pixel is connected to all other pixels. The energy function defined by this graphical model contains unary and pairwise terms, and the pairwise terms follow a shifted Gaussian function. The huge number of connections between the graph nodes makes inference very time-consuming, so an efficient method is proposed that performs the inference step rapidly. In the human model with 14 joints and fully connected edges between joints, multiple 3D human poses in the scene are reconstructed by using a conditional random field in 3D space and performing inference with belief propagation. The proposed methods improve the PCP and PCK measures. The improvement is 10.48% on the Shelf dataset, 6.07% on the Campus dataset, 16.02% on the UMPM dataset, and 7.84% on KTH Football II 2D. In terms of MPJPE, the proposed method improves by 1.57% on the Photo sequence and 1.07% on the Discuss sequence of Human3.6M. In terms of 2D PCP, it also improves by 10.4% on camera 1 and 9.6% on camera 2 of the MPII Cooking dataset.
  9. Keywords:
  10. Inference ; Three Dimensional Reconstruction ; Three Dimensional State ; Posture ; Human Body ; Graphic Model ; Calibration Parameters
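
Two of the evaluation measures named in the abstract, MPJPE and PCK, can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names `mpjpe` and `pck`, the absolute distance threshold, and the toy 14-joint data are assumptions made here for illustration. The abstract does not state the thresholds, units, or alignment steps actually used.

```python
import math


def mpjpe(pred, gt):
    """Mean per joint position error: mean Euclidean distance over joints."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def pck(pred, gt, threshold):
    """Fraction of joints predicted within `threshold` of the ground truth.

    Assumption: a plain absolute threshold in the same units as the joints.
    Published protocols often normalise it by a body-size reference instead.
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return hits / len(gt)


# Toy data (hypothetical): 14 joints on a line, one joint predicted 5 units off.
gt_pose = [(float(i), 0.0, 0.0) for i in range(14)]
pred_pose = [(3.0, 4.0, 0.0)] + gt_pose[1:]

error = mpjpe(pred_pose, gt_pose)          # one joint is off by 5, the rest by 0
accuracy = pck(pred_pose, gt_pose, 1.0)    # 13 of 14 joints fall within 1 unit
```

Both functions accept 2D or 3D joint coordinates, since `math.dist` works for points of any matching dimension.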
