Paper Reading @Summer2021: Restoration & Creation from Correspondence
Published:
为组里CVPR论文阅读活动准备的报告,从响应关系的角度讨论了不同抽象程度的视觉响应在不同任务中的引入好处和应用方式。
Visual Correspondence: Sparse or Dense
- Cross-view: sparse correspondence
- Adjacent frame: dense optical flow
- Cross-domain semantic correspondence
(Neural Best-buddies, Aberman et al. in ACM SIGGRAPH 2018)
Why to incorporate additional inputs/ use correspondence
Make the Task Easier
(Robust flash deblurring, Zhuo et al. in CVPR 2010)
(Scene Completion Using Millions of Photographs, Hays et al. in ACM SIGGRAPH 2007)
Control the Results
(Visual Attribute Transfer through Deep Image Analogy, Liao et al. in ACM SIGGRAPH 2017)
(Cross-domain Correspondence Learning for Exemplar-based Image Translation, Zhang et al. in CVPR 2020)
Naturally Available/Inevitable
(Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning, Zuckerman et al. in ECCV 2020)
(Light Field Super-Resolution with Zero-Shot Learning, Cheng et al. in CVPR 2021)
How to incorporate correspondence?
- Implicit Usage(latent code)
- Explicit Usage(warp field)
- Multimodality
Paper #1: Swapping Autoencoder for Deep Image Manipulation – NeurIPS 2020
Target: Controllable Image Manipulation Conditional Generation needs additional prior(edge/layout) Finding semantically meaningful latent code is non-trivial
Key idea: image swapping as a pretext task
Approach
- Structure-texture disentangled embedding space
- Co-occurrence patch discriminator
- Style Latent code & Modulation
- Co-occurrence Patch discriminator
Background of Modulation: How to interact between two latent space/integrate conditional signal?
Origin:
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization, Huang et al. ICCV 2017
FiLM: Visual Reasoning with a General Conditioning Layer, Perez et al. AAAI 2018
Application:
Recovering Realistic Texture in Image Super-resolution byDeep Spatial Feature Transform, Wang et al. CVPR 2018
Guided Image-to-Image Translation with Bi-Directional Feature Transformation, AlBahar and Huang, ICCV 2019
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization, Zhu et al. CVPR 2020
Analyzing and Improving the Image Quality of StyleGAN, Karras et al. CVPR 2020
Adaptive Convolutions for Structure-Aware Style Transfer, Chandran et al. CVPR 2021
Results
Summary
- New application with an old technique
- Cooperation with other designs
Paper #2: Adaptive Convolutions for Structure-Aware Style Transfer, Chandran et al. CVPR 2021
Target: Few-shot Synthesis(Hard to train or adapt) Key idea: Transfer relationships/similarities
Approach
Pairwise similarity Constraint
(1) Cross-domain consistency loss L_dist aims to preserve the relative pairwise distances between source and target generations. In this case, the relative similarities between synthesized images from z_0 and other latent codes are encouraged to be similar. (2) Relaxed realism is implemented by using two discriminators, D_img for noise sampled from the anchor region (z_anch) and D_patch otherwise.
Takeaways
- What? Sparse vs. Dense; RGB to Semantics to Cross-Modality
- Why? Easier/Controllable/Available/Inevitable
- How? Explicit to Implicit
- Implicit Usage
- Latent code + Modulation/Normalization
- Constraints on “2-order” relationships
- Weighting/Attention Mechanism and more
More papers
StyleGAN2 Distillation for Feed-forward Image Manipulation, Viazovetskyi et al. ECCV20
Controlling Style and Semantics in Weakly-Supervised Image Generation, Pavllo et al. ECCV 2020
COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder, Saito et al. ECCV 2020
Example-Guided Image Synthesis using Masked Spatial-Channel Attention and Self-Supervision, Zheng et la. ECCV 2020
Online Exemplar Fine-Tuning for Image-to-Image Translation, Kang et al. ArXiv 2020
Swapping Autoencoder for Deep Image Manipulation, Park et al. NeurIPS 2020
Conditional Generative Modeling via Learning the Latent Space, Ramasinghe et al. ICLR 2021
Semantic Layout Manipulation with High-Resolution Sparse Attention, Zheng et al. CVPR 2021
Spatially-Invariant Style-Codes Controlled Makeup Transfer, Deng et al. CVPR 2021
Adaptive Convolutions for Structure-Aware Style Transfer, Chandran et al. CVPR 2021
ReMix: Towards Image-to-Image Translation with Limited Data, Cao et al. CVPR 2021
Learning Semantic Person Image Generation by Region-Adaptive Normalization, Lv et al. CVPR 2021 Spatially-Adaptive Pixelwise Networks for Fast Image Translation, Shaham et al. CVPR 2021
Bi-level Feature Alignment for Versatile Image Translation and Manipulation, Zhan et al. ArXiv 2021
Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization, Zhang et al. ArXiv 2021
Sketch Your Own GAN, Wang et al. ICCV 2021