[Mikuláš Bankovič] Single Image Super Resolution: SRCNN and ESPCN 22. 10. 2020
Readings and literature
Pick one piece to read in advance; the list is ordered by relevance:
- SRCNN - https://arxiv.org/pdf/1501.00092.pdf
- ESPCN - https://arxiv.org/pdf/1609.05158.pdf
- ESRGAN - https://arxiv.org/pdf/1809.00219.pdf
- SRResNet - https://arxiv.org/pdf/1609.04802.pdf
- RDN - https://arxiv.org/abs/1802.08797
I will not go into detail about GANs and ResNets, so if you are not familiar with them, feel free to check these articles or another useful resource:
Abstract
Many of you have probably heard of dog-versus-cat image classification or semantic segmentation, largely thanks to the rise of deep neural networks and deep convolutional neural networks. Somewhat in their shadow, however, these networks have other applications, such as super-resolution and upscaling, that can have a significant impact on the world.
Super-resolution (SR) aims to recover a high-resolution (HR) image from a low-resolution (LR) image. The traditional algorithm pipeline consisted of multiple stages: cropping patches, encoding them into an LR dictionary, mapping from the LR to the HR dictionary, and aggregating the HR patches. Only some of these steps were explicitly optimized.
As shown by Chao Dong et al. [1], all of these operations can be modeled by a convolutional neural network (CNN). The pipeline then consists of a single CNN that is trained end to end, so all of the partial steps are optimized jointly. Such models can also be fine-tuned to a specific domain, which is much harder with traditional algorithms because it requires designing domain-specific mappings and encodings.
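To make the mapping from pipeline stages to convolution layers more concrete, here is a minimal sketch in plain Python. It assumes the base setting described in the SRCNN paper: three layers with 9x9, 1x1 and 5x5 kernels and 64 and 32 intermediate feature maps, applied to a single-channel image that has already been upscaled to the target size by bicubic interpolation. The names conv2d, init_layer, srcnn_forward and count_weights are hypothetical, the weights are random rather than trained, and the demo uses far fewer filters so that it runs quickly.

```python
import random

# Each layer is (out_channels, in_channels, kernel_size).
# Assumed base setting from the SRCNN paper: 9-1-5 kernels, 64 and 32 filters.
SRCNN_SPEC = [(64, 1, 9), (32, 64, 1), (1, 32, 5)]


def conv2d(x, w, b, relu=True):
    """Multi-channel 2D convolution without padding.

    x: [C_in][H][W], w: [C_out][C_in][k][k], b: [C_out].
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    out_h, out_w = h - k + 1, wd - k + 1
    out = []
    for o in range(len(w)):
        plane = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                s = b[o]
                for c in range(c_in):
                    for di in range(k):
                        x_row, w_row = x[c][i + di], w[o][c][di]
                        for dj in range(k):
                            s += x_row[j + dj] * w_row[dj]
                row.append(max(0.0, s) if relu else s)
            plane.append(row)
        out.append(plane)
    return out


def init_layer(c_out, c_in, k, rng):
    """Random (untrained) weights and zero biases for one layer."""
    w = [[[[rng.gauss(0.0, 0.001) for _ in range(k)] for _ in range(k)]
          for _ in range(c_in)] for _ in range(c_out)]
    return w, [0.0] * c_out


def srcnn_forward(image, layers):
    """Patch extraction -> non-linear mapping -> reconstruction."""
    x = image
    for idx, (w, b) in enumerate(layers):
        # ReLU after every layer except the final reconstruction layer.
        x = conv2d(x, w, b, relu=(idx < len(layers) - 1))
    return x


def count_weights(spec):
    """Number of convolution weights, biases excluded."""
    return sum(c_out * c_in * k * k for c_out, c_in, k in spec)


# Demo with a reduced, hypothetical filter count to keep pure Python fast.
rng = random.Random(0)
small_spec = [(4, 1, 9), (2, 4, 1), (1, 2, 5)]
layers = [init_layer(c_out, c_in, k, rng) for c_out, c_in, k in small_spec]
patch = [[[rng.random() for _ in range(15)] for _ in range(15)]]
restored = srcnn_forward(patch, layers)
print(len(restored), len(restored[0]), len(restored[0][0]))
print(count_weights(SRCNN_SPEC))
```

Because no padding is used, the 9x9 and 5x5 kernels shrink each spatial dimension by 8 + 4 = 12 pixels, so the 15x15 demo patch yields a 3x3 output. With the full-size specification the three layers hold 5184 + 2048 + 800 = 8032 weights.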
We will look at the first CNN-based algorithm for super-resolution and its evolution into modern algorithms through the addition of ResNet and GAN components. We will also discuss the speed of these algorithms and their real-time usage.
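As a small preview of the speed discussion: ESPCN, named in the talk title, keeps all convolutions at LR size and only rearranges the final feature maps into an HR image with a sub-pixel (pixel shuffle) step. The sketch below shows that rearrangement in plain Python. The function name pixel_shuffle is hypothetical here, and the channel ordering is an assumption that follows the convention used by PyTorch's PixelShuffle layer rather than the exact indexing in the paper.

```python
def pixel_shuffle(x, r):
    """Rearrange [C * r * r][H][W] feature maps into [C][H * r][W * r].

    Assumed channel ordering (PyTorch convention):
    out[c][y * r + i][x * r + j] = in[c * r * r + i * r + j][y][x]
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 feature maps become one 2x2 image for upscaling factor r = 2.
lr_features = [[[0.0]], [[1.0]], [[2.0]], [[3.0]]]
hr_image = pixel_shuffle(lr_features, 2)
print(hr_image)  # [[[0.0, 1.0], [2.0, 3.0]]]
```

The rearrangement itself involves no arithmetic, so the cost of the network is dominated by convolutions that run on the small LR grid instead of the upscaled one.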