Semantic Segmentation Considering Correlation with RGB and Depth Using Convolutional Neural Networks

Ghelichkhan, Zahra; Kasaei, Shohreh

Ghelichkhan, Zahra | 2021

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 54581 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Kasaei, Shohreh
Abstract:
In the broad field of artificial intelligence, semantic segmentation has been one of the grand challenges of computer vision. The task, which aims to predict a label for each pixel of an image and thereby describe the scene, is more complicated than other computer vision tasks because it requires low-level information. However, as part of scene understanding and a crucial step in many real-world applications such as autonomous driving, human-computer interaction, and robot navigation, it has attracted many researchers. What makes this task more challenging than other computer vision tasks is that information beyond a single pixel, namely its neighbors and the correlation between pixels across the whole scene, plays a significant role in determining the meaning of each pixel. Deep convolutional neural networks, which have improved results across computer vision tasks, are very successful at capturing relations between nearby pixels, but they fail to capture contextual information and the correlation between distant pixels. Image segmentation algorithms, on the other hand, aim to partition an image into coherent regions, and their output contains useful information about object parts and the correlation between pixels across the image. In this study, two efficient fully convolutional neural networks are proposed that use the output of segmentation networks, called the region image, to explore contextual information and pixel correlation after fusing RGB-D images. This is achieved by applying the Attention to Region Mechanism in the first proposed method and Class Region-Aware Semantic Segmentation in the second. In addition, an experiment is designed to assess the usefulness of region images as a complementary modality alongside other data.
Experimental results, evaluated with Pixel Accuracy, mean Pixel Accuracy, and mean Intersection over Union (mIoU) on two popular indoor RGB-D datasets, NYU-V2 and SUN RGB-D, show that the proposed methods perform well. For instance, the first and second proposed methods improve results on the NYU-V2 dataset by 2.9 and 3.5 percent, respectively, in terms of mIoU.
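The three evaluation measures named in the abstract are commonly computed from a class confusion matrix. The following is a minimal sketch of those standard definitions, not the thesis's own evaluation code: the function names, the ignore label of 255, and the choice to average only over classes that actually occur are assumptions made for illustration.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_label=255):
    """Rows are ground-truth classes, columns are predicted classes.

    ignore_label=255 is an assumed convention for unlabeled pixels.
    """
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    valid = gt != ignore_label
    # Encode each (gt, pred) pair as a single index and count occurrences.
    idx = gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean pixel accuracy, mean IoU)."""
    cm = cm.astype(np.float64)
    tp = np.diag(cm)                  # correctly labeled pixels per class
    gt_count = cm.sum(axis=1)         # ground-truth pixels per class
    pred_count = cm.sum(axis=0)       # predicted pixels per class
    union = gt_count + pred_count - tp

    pixel_acc = tp.sum() / cm.sum()
    present = gt_count > 0            # assumed: skip classes absent from the ground truth
    mean_pixel_acc = np.mean(tp[present] / gt_count[present])
    has_union = union > 0             # assumed: skip classes with an empty union
    mean_iou = np.mean(tp[has_union] / union[has_union])
    return pixel_acc, mean_pixel_acc, mean_iou


# Toy 2x3 label maps with three classes (illustrative data only).
gt = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
pa, mpa, miou = segmentation_metrics(confusion_matrix(gt, pred, num_classes=3))
print(pa, mpa, miou)
```

On a real benchmark the confusion matrix would typically be accumulated over all test images before the metrics are computed, so that mIoU reflects dataset-level rather than per-image class statistics.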
Keywords:
Semantic Segmentation; Attention Mechanism; Feature Fusion; Correlation; RGB-D Camera; Class-Aware Semantic Segmentation; Global Contextual Information
