Real-Time Hand Pose Estimation Using Camera Vision System

Kiani, Mahmoud | 2022

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 54816 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Hashemi, Matin; Namvar, Mehrzad
  7. Abstract:
  8. Hand pose estimation has applications in many fields, including augmented, virtual, and mixed reality systems. Hand gesture recognition and classification applications, including sign language recognition and contactless (non-handheld) scenarios such as storefront contactless survey systems, which found particular use during the COVID-19 pandemic, further underline its importance. In our work, we estimate the 2D and 3D hand pose at the same time and use an RGB camera as the sensor that records the input data, which is more economical than using RGB-D or depth sensors. The input is a single RGB image. There is no single agreed convention for modeling the hand pose, so we use the 21-keypoint hand model (4 joint keypoints for each finger and one keypoint for the wrist) that is most common in articles and research. To estimate the 2D and 3D hand pose, this thesis introduces our HOP-Net network, which consists of three main parts. First, the hand is separated from the rest of the image using the HandSegNet network, and the resulting hand ROI is fed to the network as input. In the first part, the 2D coordinates of the keypoints are estimated; in the second part, these 2D estimates are refined using hand kinematic information; and in the third part, the 3D estimates are produced by a U-Net-like structure. We use graph convolutional layers in the different parts of our networks, and for the first time we use them adaptively (with a trainable graph adjacency matrix). The U-Net structure is also used in a different way in our work: we add learnable pooling layers to it, which helps the network better learn the task of 3D estimation.
In the next part of our work, we use the 3D hand pose estimates to implement one application of hand pose estimation, namely skeleton-based action recognition, and propose two structures for this problem, one based on fixed and one on adaptive graph convolutions. We also compare our work with state-of-the-art methods in both hand pose estimation and skeleton-based action recognition, and the comparison shows the efficiency and superiority of our method.
  9. Keywords:
  10. Graph Convolutional Networks ; Long Short Term Memory (LSTM) ; Hand Action Recognition ; Adaptive Graph ; Graph Pooling ; Graph Unpooling
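
The abstract describes a 21-keypoint hand model and graph convolutions with a trainable adjacency matrix. The following is a minimal illustrative sketch of that idea, not the thesis's HOP-Net implementation. The keypoint indexing (0 for the wrist, then 4 joints per finger), the additive form `A_fixed + A_learned`, and all function names are assumptions introduced here for illustration.

```python
# Assumed indexing (not from the thesis): 0 = wrist, then 4 joints per finger.
NUM_KEYPOINTS = 21


def hand_edges():
    """Return (parent, child) bone pairs for the 21-keypoint hand model."""
    edges = []
    for finger in range(5):
        base = 1 + 4 * finger
        edges.append((0, base))  # wrist to the first joint of this finger
        for k in range(3):
            edges.append((base + k, base + k + 1))  # along the finger
    return edges


def adjacency(edges, n=NUM_KEYPOINTS):
    """Symmetric adjacency with self-loops, normalized so each row sums to 1."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def adaptive_graph_conv(features, a_fixed, a_learned, weight):
    """One layer: ReLU((A_fixed + A_learned) @ X @ W).

    a_learned stands in for the trainable part of the adjacency matrix;
    in a real network it would be updated by gradient descent.
    """
    n = len(a_fixed)
    a = [[a_fixed[i][j] + a_learned[i][j] for j in range(n)] for i in range(n)]
    out = matmul(matmul(a, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


edges = hand_edges()
a_fixed = adjacency(edges)
# Zero offset: the layer starts out as a fixed-graph convolution.
a_learned = [[0.0] * NUM_KEYPOINTS for _ in range(NUM_KEYPOINTS)]
# Placeholder 2D keypoint features (x, y) and a 2-to-3 channel weight matrix.
features = [[float(i), 1.0] for i in range(NUM_KEYPOINTS)]
weight = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
out = adaptive_graph_conv(features, a_fixed, a_learned, weight)
```

With `a_learned` set to zero the layer reduces to a fixed-graph convolution, which mirrors the fixed versus adaptive distinction mentioned in the abstract; how the thesis actually parameterizes the trainable matrix is not stated in this record.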
