Video Super-Resolution Using Machine Learning

Ashoori, Mohammad Hossein; Amini, Arash; Marvasti, Farrokh

Ashoori, Mohammad Hossein | 2022

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 55072 (05)
University: Sharif University of Technology
Department: Electrical Engineering
Advisor(s): Amini, Arash; Marvasti, Farrokh
Abstract:
Super-resolution means increasing resolution so that quality improves, and it is defined for both images and video. Machine-learning-based methods, especially convolutional neural networks, have shown great potential for this task in recent years. The key to solving the super-resolution problem is finding a network structure that delivers both high speed and high accuracy. Despite the many methods available for image super-resolution, less attention has been paid to generalizing them to video. Such a generalization should use adjacent frames to create more detail in the output. This thesis reviews existing methods for image and video super-resolution and then proposes a new structure. By adjusting the hyperparameters of the proposed structure, three networks with different numbers of parameters are trained, and their quantitative and qualitative results are compared with those of existing methods. Although the proposed networks do not outperform existing methods on some quantitative criteria, their performance in terms of output image quality is acceptable.
Keywords:
Super Resolution ; Convolutional Neural Network ; Machine Learning ; Separable Convolutional Layer ; Batch Normalization (BN) Layer Network ; Image Quality

Table of Contents
Introduction
Literature Review
- Introduction
- Image super-resolution network architectures
  - The SRCNN network [dong2014learning]
  - The FSRCNN network [dong2016accelerating]
  - The EDSR network [lim2017enhanced]
  - The RCAN network [zhang2018image]
  - Evaluation of different networks for image super-resolution
- Loss function and desired output
  - Pixel-wise loss functions
    - L2 loss function
    - L1 loss function
  - Perceptual loss functions
    - Feature map reconstruction loss function
    - Generative adversarial networks (GANs)
  - Metrics for comparing different models
    - Peak signal-to-noise ratio (PSNR)
    - Structural similarity index measure (SSIM)
- Video super-resolution network architectures
  - How inputs are fed to the network
    - RNN networks
    - Feed-forward networks
  - Motion compensation between frames
    - Optical flow
    - Deformable convolution
  - Architectures of several important video super-resolution networks
    - The RBPN network [haris2019recurrent]
    - The DUF network [jo2018deep]
    - The BasicVSR and IconVSR networks [chan2021basicvsr] and BasicVSR++ [chan2021basicvsr++]
- Ideas for improving the performance of convolutional networks
  - Separable convolutional layers, grouped and depthwise convolutional layers, and making the network more efficient
  - Channel attention mechanism
  - The Batch Normalization method
Proposed Method
- Introduction
- Overall structure
- Components of the network
  - Initial feature extraction
  - Main body of the network
  - Upsampling module
  - Residual learning
Results
- Introduction
- Dataset
- Color spaces
- Implementation
- Comparison with top-performing methods
- Effect of test-time augmentation
- Ablation study
  - Removing the grouped structure of the blocks
  - Effect of the channel attention mechanism
  - Effect of Batch Normalization
Conclusion and Future Work
- Introduction
- Summary
- Suggestions
References
