Fusion of Audio and Visual Occurrences using Fuzzy Logic for Improving Perception Quality of Events

Faraji, Mohammad Mahdi | 2020

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 52694 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Bagheri Shouraki, Saeed
  7. Abstract: The human ability to analyze the surrounding environment has long inspired research on event analysis. Because human perception of the environment is formed in a multi-modal space, many efforts have been made to fuse information from multiple modalities intelligently. This study uses information fusion to gain a better understanding of the environment. To this end, audio and video signals are fused with a fuzzy method based on the ink drop spread operator to recognize and track targets in several scenarios of the AV16.3 dataset. We then focus on fusing audio data for sound source localization, one of the most important applications for evaluating fusion algorithms. For sound source localization, a fuzzy algorithm based on time difference of arrival (TDOA) is first proposed. In this fuzzy TDOA algorithm, the direction of the sound source is estimated by fusing fuzzy data obtained from the cross-correlation between each pair of microphones. The direction-estimation accuracy of the proposed fuzzy algorithm is similar to that of the well-known beamforming method; however, its computational cost is much lower, and it is more robust under noisy conditions. A fuzzy fusion algorithm that fuses the estimated directions is then introduced to estimate the location of the sound source. Because it incorporates fuzzy concepts, the proposed fusion algorithm is robust against noise, and its reasonable computational cost makes it suitable for hardware implementation. To evaluate the proposed algorithms, sensor-node hardware was designed and built, and the nodes were distributed in the environment to record an appropriate database. Using this database, we localize a flying object as a sound source in a wide-range three-dimensional environment.
  8. Keywords: Sound Source Localization; Fuzzy Logic; Distributed Sensor Networks; Beamforming; Fuzzy Fusion; Audio and Visual Fusion
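
The abstract mentions estimating direction from the cross-correlation between microphone pairs. As a rough illustration only, the sketch below shows the conventional (non-fuzzy) TDOA step for a single pair: it takes the lag of the cross-correlation peak and converts it to a bearing under a far-field assumption. It is not the dissertation's fuzzy algorithm. The function names, the 343 m/s speed of sound, and the sample values in the usage lines are assumptions made for this example.

```python
import numpy as np


def estimate_tdoa(sig_a, sig_b, fs):
    """Delay of sig_b relative to sig_a, in seconds (hypothetical helper).

    Uses the lag at which the plain cross-correlation peaks. A positive
    result means the sound reached microphone B later than microphone A.
    """
    corr = np.correlate(sig_b, sig_a, mode="full")
    # Lags covered by mode="full": -(len(sig_a) - 1) ... len(sig_b) - 1
    lags = np.arange(-(len(sig_a) - 1), len(sig_b))
    return lags[np.argmax(corr)] / fs


def direction_from_tdoa(tdoa, mic_distance, speed_of_sound=343.0):
    """Bearing in degrees from broadside, assuming a far-field source.

    speed_of_sound is an assumed value for air at room temperature.
    """
    ratio = np.clip(speed_of_sound * tdoa / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


# Usage with synthetic data: white noise delayed by 5 samples at 16 kHz.
rng = np.random.default_rng(0)
fs = 16000
sig_a = rng.standard_normal(1024)
sig_b = np.concatenate([np.zeros(5), sig_a[:-5]])

tdoa = estimate_tdoa(sig_a, sig_b, fs)
bearing = direction_from_tdoa(tdoa, mic_distance=0.2)
```

With more than two microphones, each pair yields its own estimate. How the dissertation combines those estimates with fuzzy fusion is not specified in this record, so the sketch stops at the single-pair case.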

