HEVC Compressed Domain Computer Vision

Alizadeh, Mohammad Sadegh; Sharifkhani, Mohammad

Alizadeh, Mohammad Sadegh | 2019

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 52606 (05)
University: Sharif University of Technology
Department: Electrical Engineering
Advisor(s): Sharifkhani, Mohammad
Abstract:
In the first section, a novel No-Reference Video Quality Assessment (NR-VQA) method based on a Convolutional Neural Network (CNN) is presented for High Efficiency Video Coding (HEVC). Deep Compressed-domain Video Quality (DCVQ) measures video quality from compressed-domain features such as motion vectors, bit allocation, partitioning and quantization parameter. Because of the limitations of existing datasets, normalized PSNR is used to train the network. The evaluation shows that the proposed method achieves a 96% correlation with subjective quality scores (MOS). The method can run in parallel with the decoding process and measures the quality of each frame at different resolutions.
The second section presents a novel, accurate moving object detection method based on a Conditional Random Field (CRF) for HEVC/H.265 compressed-domain video sequences. For each block, the number of consumed bits, the motion vectors (MVs) and the partitioning modes are extracted from the compressed bitstream. After outlier MVs are removed, compensating MVs are assigned to the I-blocks based on their neighboring blocks. The MV, partitioning mode and bit consumption are used in the potential functions of a CRF model, which is updated for every frame to detect the objects. A number of standard test video sequences are then used to verify the performance of the model. The results indicate that the model achieves a precision of more than 90% on average over these sequences. The proposed method offers a 1.8x speedup compared to the latest works in the compressed domain, without losing the objects in the I-frames.
In the last section, the key research challenge is developing effective backbone networks that can directly take compressed-domain data as input. Our baseline is to take models developed for action understanding in the decoded domain and adapt them to the same tasks in the compressed domain. Motion cues have been shown to be important for action understanding, but the motion vectors in compressed video are often very noisy and not discriminative enough to support accurate action understanding directly. We develop a new and highly efficient framework that learns to predict from the noisy motion vectors in compressed video streams. On the UCF101 action recognition benchmark, we demonstrate that our network significantly narrows the performance gap relative to state-of-the-art compressed-video-based methods. By addressing the three major challenges mentioned above, we are able to develop more robust models for video understanding and to improve performance in the compressed domain. Our research has contributed significantly to advancing the state of the art in compressed-domain video analysis and semantic understanding of video content.
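The motion-vector preprocessing described for the second section (removing outlier MVs, then assigning MVs to I-blocks from their neighboring blocks) can be illustrated with a minimal sketch. The abstract does not state which filter or threshold the thesis uses, so everything here is an assumption made for illustration: the component-wise median over the 8-neighborhood, the `threshold` value, the grid layout with `None` marking an intra-coded block, and the function names.

```python
from statistics import median


def neighbours(grid, r, c):
    """Return the motion vectors of the up-to-8 neighbours that have one."""
    rows, cols = len(grid), len(grid[0])
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] is not None:
                found.append(grid[rr][cc])
    return found


def median_mv(mvs):
    """Component-wise median of a non-empty list of (dx, dy) vectors."""
    return (median(mv[0] for mv in mvs), median(mv[1] for mv in mvs))


def clean_motion_field(grid, threshold=4.0):
    """Hypothetical two-pass cleanup of a block-level motion field.

    grid[r][c] is a (dx, dy) tuple, or None for an intra-coded block.
    The input grid is not modified.
    """
    # Pass 1 (assumed rule): an MV whose L1 distance to the neighbourhood
    # median exceeds `threshold` is treated as an outlier and replaced.
    filtered = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, mv in enumerate(row):
            if mv is None:
                continue
            near = neighbours(grid, r, c)
            if not near:
                continue
            ref = median_mv(near)
            if abs(mv[0] - ref[0]) + abs(mv[1] - ref[1]) > threshold:
                filtered[r][c] = ref

    # Pass 2 (assumed rule): an intra-coded block inherits the median MV of
    # its neighbours; it stays None if no neighbour carries an MV.
    filled = [row[:] for row in filtered]
    for r, row in enumerate(filtered):
        for c, mv in enumerate(row):
            if mv is None:
                near = neighbours(filtered, r, c)
                if near:
                    filled[r][c] = median_mv(near)
    return filled


# Toy 3x3 field: uniform motion, one intra block, one outlier MV.
grid = [
    [(1, 0), (1, 0), (1, 0)],
    [(1, 0), None, (1, 0)],
    [(1, 0), (1, 0), (20, -15)],
]
cleaned = clean_motion_field(grid)
```

In this toy field the outlier in the bottom-right corner is pulled back to the surrounding motion and the intra block in the centre receives a vector from its neighbours. A real pipeline would first have to extract the per-block MVs from the HEVC bitstream, which this sketch does not attempt.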
Keywords:
Machine Vision ; Action Recognition ; High Efficiency Video Coding (HEVC) ; Video Quality ; Compressed Domain
