Temporal Action Localization Using Recurrent Neural Networks
Keshvari Khojasteh, Hassan | 2021
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 53737 (05)
- University: Sharif University of Technology
- Department: Electrical Engineering
- Advisor(s): Behroozi, Hamid; Mohammadzadeh, Narjesolhoda
- Abstract:
- Action recognition is an important computer vision task that assigns an action label to a video containing only one action. In recent years it has attracted much attention, and researchers have approached it in many different ways. On its own, however, action recognition has limited real-world application, because real videos are untrimmed and rarely contain a single action. Temporal Action Localization (TAL), in which the start and end time of each action are predicted in addition to its label, has many more real-world applications and is therefore an active research topic. Because of its complexity, results on TAL still lag behind those on action recognition. The difficulty lies in predicting precise start and end times for the different actions in a video. Most recent work uses Convolutional Neural Networks (CNNs), although Recurrent Neural Networks (RNNs) are well suited to sequential data such as text and video because they maintain a form of memory. In this thesis we therefore build our network on the recurrent network GRU v3 and propose three methods to improve the approach. In the first method, we split the network output at every time step into three parts and process each part separately. In the second, we use linear interpolation to compute the start and end times of each action more precisely. In the third, we borrow an idea from the learning-to-rank task to rank the predicted proposals more effectively than previous methods. To evaluate the proposed network, we compare its results with those of previous methods on three metrics, AR@AN, R@AN=100-tIoU, and mAP, and show that it outperforms them.
Specifically, compared with RecapNet on the AR@AN and R@AN=100-tIoU metrics, with an average of 50 proposals and a tIoU threshold of 0.5, our network improves by 6.62% and 6.91%, respectively. Compared with TAS-Net on the mAP metric at a tIoU threshold of 0.7, it improves by 5.82%.
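The metrics named in the abstract all rest on the temporal Intersection over Union (tIoU) between a predicted segment and a ground-truth segment. The following is a minimal sketch of that quantity and of recall at a single tIoU threshold. It is not taken from the thesis: the function names `temporal_iou` and `recall_at_tiou`, the `(start, end)` tuple representation in seconds, and the example segments are all assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """tIoU of two (start, end) segments; both tuples are assumed to be in seconds."""
    # Overlap length is zero when the segments do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_tiou(proposals, ground_truths, threshold):
    """Fraction of ground-truth segments matched by at least one proposal
    whose tIoU with that segment is at or above the threshold."""
    if not ground_truths:
        return 0.0
    hits = sum(
        1
        for gt in ground_truths
        if any(temporal_iou(p, gt) >= threshold for p in proposals)
    )
    return hits / len(ground_truths)


# Hypothetical example: two proposals, two ground-truth actions.
proposals = [(0.0, 10.0), (20.0, 30.0)]
ground_truths = [(1.0, 10.0), (40.0, 50.0)]
recall = recall_at_tiou(proposals, ground_truths, threshold=0.5)
```

Under these assumptions, the first ground-truth action is matched (tIoU 0.9) and the second is not, so `recall` is 0.5. Benchmark metrics such as AR@AN are typically obtained by averaging this kind of recall over several tIoU thresholds while capping the average number of proposals per video; the exact protocol used in the thesis is not stated in the abstract.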
- Keywords:
- Action Recognition ; Deep Learning ; Recurrent Neural Networks ; Temporal Action Localization ; Markov Chain