[Paper Notes] PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

Title

PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

Information

Paper: https://arxiv.org/abs/2003.03808

GitHub: https://github.com/adamian98/pulse

Summary

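The authors constrain the latent code with a Gaussian prior at initialization and optimize it with a downscaling loss, so that the generated high-resolution image, once downsampled, matches the low-resolution input. The advantages: the generated images look realistic, and they stay consistent with the input image.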

Research Objective

Turn a blurry or low-resolution picture into a realistic, sharp high-resolution image (super-resolution).

Problem Statement

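Previous methods

Supervised learning:
(Trend 1) Models are usually trained on a pixel-wise average distance between the SR and HR images (e.g. MSE). This makes the generated SR images overly smooth: perceptually relevant details such as texture are ignored, and fine detail comes out blurred.
(Trend 2) Step outside the pixel-wise framework and train with a perceptual distance. The most common choice is a GAN that judges real versus fake (that is, a realism term is added to the Trend 1 loss).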

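The research community long treated super-resolution and image generation as two separate fields, with super-resolution seen as a correction task. In practice, though, super-resolution has to bring in generation in order to produce sharp detail.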

Method(s)

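The authors argue that the key to super-resolution is finding images that are both realistic and downscale correctly to the LR input.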

Loss

The paper proposes a downscaling loss: the distance between the generated HR image after downsampling and the original LR image.
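
To make the downscaling loss concrete, here is a minimal NumPy sketch. It rests on assumptions that are not taken from these notes: downscaling is done by average pooling (a stand-in for whatever downscaling operator the paper uses), the distance is a mean squared error, and the function names `downscale` and `downscaling_loss` are made up for illustration.

```python
import numpy as np

def downscale(img: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool an (H, W, C) image by an integer factor.

    Assumption: average pooling stands in for the real downscaling operator.
    """
    h, w, c = img.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by the factor")
    blocks = img.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def downscaling_loss(sr: np.ndarray, lr_img: np.ndarray, factor: int) -> float:
    """Mean squared distance between the downscaled SR candidate and the LR input."""
    return float(np.mean((downscale(sr, factor) - lr_img) ** 2))

rng = np.random.default_rng(0)
hr_a = rng.random((64, 64, 3))      # one HR candidate
lr_input = downscale(hr_a, 8)       # the LR image it downscales to
hr_b = rng.random((64, 64, 3))      # an unrelated HR candidate

print(downscaling_loss(hr_a, lr_input, 8))  # consistent candidate: zero loss
print(downscaling_loss(hr_b, lr_input, 8))  # inconsistent candidate: positive loss
```

The loss only checks consistency with the LR input; many different HR images can reach a loss near zero, which is why a second criterion (realism) is needed to choose among them.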

Latent

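A generative network's latent space is biased by its training set: attributes that are common in the training data are likely and generate well, while rare attributes are unlikely and may lead to artifacts or unrealistic results. For this reason the authors add a constraint on the latent code, a Gaussian prior, so that the distribution of the latent code matches what the generator expects.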
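
The link between a Gaussian prior and the spherical gradient descent mentioned under Evaluation can be checked numerically: in high dimensions, samples from a standard Gaussian concentrate near the sphere of radius sqrt(d). The sketch below is only an illustration; the dimensionality `d = 512` and the helper name `project_to_sphere` are assumptions, not values taken from these notes.

```python
import numpy as np

d = 512  # assumed latent dimensionality, for illustration only
rng = np.random.default_rng(0)

# Draw many latent codes from a standard Gaussian and look at their norms.
z = rng.standard_normal((10_000, d))
norms = np.linalg.norm(z, axis=1)

print("mean norm / sqrt(d):", norms.mean() / np.sqrt(d))  # close to 1
print("std of norms:", norms.std())                       # small next to sqrt(d)

def project_to_sphere(v: np.ndarray) -> np.ndarray:
    """Rescale each vector to norm sqrt(d), i.e. onto the sphere where Gaussian mass sits."""
    radius = np.sqrt(v.shape[-1])
    return v * radius / np.linalg.norm(v, axis=-1, keepdims=True)

z_on_sphere = project_to_sphere(z)
print("norm after projection:", np.linalg.norm(z_on_sphere[0]))
```

Because almost all Gaussian samples already sit close to this sphere, keeping the latent code on it is a cheap way to stay in the region the generator saw during training.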

Evaluation

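StyleGAN: the pretrained FFHQ network open-sourced by Karras et al.
For each image: 100 steps of spherical gradient descent with learning rate 0.4, starting from a randomly initialized latent
Data: CelebA HQ (because CelebA's resolution is too low)
SOTA baselines for comparison: trained on CelebA HQ, because the ones trained on FFHQ gave very poor results
Qualitative comparison: bicubic upsampling, FSRNet, FSRGAN, PULSE (×8), PULSE (×64)
Quantitative comparison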

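This part deserves close attention. Previous SOTA methods could only do 8× reconstruction (output resolution 128×128), so the quantitative comparison is carried out at 128×128. 40 raters scored 240 images on a 1-5 scale, and PULSE received the highest score, 3.6.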

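Naturalness Image Quality Evaluator (NIQE): the metric is only meaningful at high resolution, so it cannot be used to compare against FSRNet and FSRGAN. Compared against nearest-neighbor and bicubic upsampling, PULSE has the lowest NIQE (lower is better).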
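
The per-image optimization listed above (100 steps of spherical gradient descent, learning rate 0.4, random initialization) can be sketched on a toy problem. Everything besides those three settings is an assumption: the random linear map `A` stands in for "generator followed by downscaling", the loss is half the squared error, and the gradient is written by hand, whereas the real method uses a pretrained StyleGAN and automatic differentiation.

```python
import numpy as np

def project_to_sphere(z: np.ndarray) -> np.ndarray:
    """Rescale z onto the sphere of radius sqrt(d), where d = z.size."""
    return z * np.sqrt(z.size) / np.linalg.norm(z)

def spherical_gradient_descent(loss_and_grad, z0, steps=100, lr=0.4):
    """Projected gradient descent that keeps z on the sphere of radius sqrt(d)."""
    z = project_to_sphere(z0)
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(z)
        history.append(loss)
        # Drop the radial component so the step is tangent to the sphere.
        grad_tangent = grad - (grad @ z) / (z @ z) * z
        # Take the step, then pull the result back onto the sphere.
        z = project_to_sphere(z - lr * grad_tangent)
    history.append(loss_and_grad(z)[0])
    return z, history

# Toy stand-in for "downscale(generator(z))": a fixed random linear map (assumption).
rng = np.random.default_rng(0)
d, m = 64, 16
A = rng.standard_normal((m, d)) / np.sqrt(d)
z_true = project_to_sphere(rng.standard_normal(d))
lr_image = A @ z_true  # plays the role of the LR input

def toy_loss_and_grad(z):
    residual = A @ z - lr_image
    return 0.5 * float(residual @ residual), A.T @ residual

z_hat, history = spherical_gradient_descent(toy_loss_and_grad, rng.standard_normal(d))
print("loss at start:", history[0], "loss at end:", history[-1])
print("norm of z_hat:", np.linalg.norm(z_hat), "sqrt(d):", np.sqrt(d))
```

The recovered `z_hat` generally differs from `z_true`: the toy map has more latent dimensions than outputs, so many latent codes explain the same LR input, mirroring the one-to-many nature of super-resolution.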

Conclusion

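Although the paper focuses on faces, the method could be applied to medical, astronomical, microscopy, and satellite imagery. With this technique, limits imposed by hardware, memory, and so on can be relaxed.
The method also denoises implicitly.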

Notes

Common evaluation metrics for super-resolution:
PSNR (pushing PSNR too high can sometimes produce blurry images)
Naturalness Image Quality Evaluator (NIQE)
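
For reference, PSNR follows directly from the mean squared error. The sketch below uses the standard definition, 10 * log10(MAX^2 / MSE); the function name `psnr` and the default pixel range of [0, 1] are choices made here for illustration.

```python
import numpy as np

def psnr(reference, estimate, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images with pixel values in [0, max_value]."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

reference = np.zeros((8, 8))
estimate = np.full((8, 8), 0.1)   # uniform error of 0.1, so the MSE is 0.01
print(psnr(reference, estimate))  # about 20 dB
```

Since PSNR is a monotone function of MSE, maximizing it rewards the same pixel-wise averaging that the Trend 1 methods suffer from.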

Reference

CNN LR-HR:

[8] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Learning a deep convolutional network for image super-resolution.

[20] Wenzhe Shi, Jose Caballero, Ferenc Huszar, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.

[15] SRGAN: Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photorealistic single image super-resolution using a generative adversarial network

[7] SOTA for face super-resolution:

Gaussian prior on the StyleGAN latent:

[5] Compressed sensing using generative models

Mean-opinion-score (MOS) test (perceptual super-resolution literature):

[15] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photorealistic single image super-resolution using a generative adversarial network

[13] Deokyun Kim, Minseon Kim, Gihyun Kwon, and Dae-Shik Kim. Progressive face super-resolution via attention to facial landmark

Naturalness Image Quality Evaluator (NIQE)

[17] A. Mittal, R. Soundararajan, and A. C. Bovik. Making a completely blind image quality analyzer