Deep Learning for Action Recognition

Aslan Beigi, Fatemeh | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 52582 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Vosoughi Vahdat, Bijan; Mohammadzadeh, Narjesolhoda
  7. Abstract:
     Computers, laptops, tablets, and even cell phones can record, produce, store, and share videos. As videos become more widely available and easier to access, the need to understand them has grown. Because humans can analyze only a limited amount of video, there is increasing demand for intelligent systems that analyze videos and recognize the actions in them.

     Action recognition is the classification of the action performed by a person in a video; its variants differ in the nature of the data and in how the data is processed. Vision-based human action recognition is challenging because of variation in camera viewpoint, occlusion, variation in execution rate, anthropometry, camera motion, and background clutter. The purpose of this thesis is to present an effective deep learning approach that models the motion and temporal information in a video sequence and thereby improves action recognition accuracy.

     Each video consists of a large number of frames, and processing all of them is computationally expensive. A keyframe is a frame on the timeline that marks the beginning or end of a smooth transition (the same definition used in media production). The sequence of keyframes defines which movement the viewer sees, whereas the positions of the keyframes in the video define the timing of that movement. Keyframes therefore play a much larger role than other frames in conveying the content to the audience, which highlights the need for an efficient sampling scheme for videos. In this thesis, we propose a novel network for extracting keyframes. Summarizing a video by its keyframes not only improves action recognition accuracy but also reduces computational cost. The proposed network achieves 92.1% accuracy on the UCF-101 dataset, which is competitive with previous architectures.
  8. Keywords: Deep Learning ; Computer Vision ; Keyframes Extraction ; Action Recognition
