
Designing an Automatic Lip-reading System for Persian Words Using Deep Neural Networks and Implementing it on Rasa Social Robot

Gholipour, Amir | 2022

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 54862 (08)
  4. University: Sharif University of Technology
  5. Department: Mechanical Engineering
  6. Advisor(s): Taheri, Alireza; Mohammadzadeh, Hoda
  7. Abstract: In Iranian Sign Language (ISL), lip movement is essential, alongside finger movement, to performing words completely and correctly. The purpose of the current study is to provide an automated lip-reading system using deep neural networks and to implement it on the Rasa social robot, so that the robot can recognize a limited number of specified Persian words. To do this, we propose an automated lip-reading system based on convolutional neural networks and long short-term memories. Convolutional neural networks have achieved good results in extracting features from images, and long short-term memories in modeling temporal dynamics. We have also recorded a Persian-language database in which 50 people each repeat 25 specified words 4 times. The accuracy of the proposed network on this database is 94.4%, which is acceptable for the ultimate goal of the research: implementation on the Rasa social robot. In a practical test of the implemented network with 5 people, the accuracy was 80.6%, which is reasonable. It is worth noting that the proposed network can be trained in a very short time. We have also used the proposed network to recognize the words of the OuluVS2 database, with an accuracy of 91.39%. Although the proposed network did not achieve the highest accuracy reported in the literature for this database, it gave better results than some more complex and even pre-trained networks.
  8. Keywords: Convolutional Neural Network; Long Short Term Memory (LSTM); Social Robotics; Lipreading; Persian Sign Language; Automated Lip-Reading

