Learning Images Across Scales
Using Adversarial Training

ACM Transactions on Graphics (SIGGRAPH), 2024

Krzysztof Wolski1     Adarsh Djeacoumar1     Alireza Javanmardi1     Hans-Peter Seidel1     Christian Theobalt1    
Guillaume Cordonnier2     Karol Myszkowski1     George Drettakis2     Xingang Pan3     Thomas Leimkühler1    

1 MPI Informatik    2 Inria, Université Côte d'Azur    3 Nanyang Technological University


The real world exhibits rich structure and detail across many scales of observation. It is difficult, however, to capture and represent a broad spectrum of scales using ordinary images. We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images. We treat this collection as a distribution of scale-space slices to be learned using adversarial training, and additionally enforce coherency across slices. Our approach relies on a multiscale generator with carefully injected procedural frequency content, which allows interactive exploration of the emerging continuous scale space. Training across vastly different scales poses stability challenges, which we tackle using a supervision scheme that involves careful sampling of scales. We show that our generator can be used as a multiscale generative model, and for reconstructing scale spaces from unstructured patches. Significantly outperforming the state of the art, we demonstrate zoom-in factors of up to 256x at high quality and scale consistency.




@article{wolski2024learning,
title = {Learning Images Across Scales Using Adversarial Training},
author = {Wolski, Krzysztof and Djeacoumar, Adarsh and Javanmardi, Alireza and Seidel, Hans-Peter and Theobalt, Christian and Cordonnier, Guillaume and Myszkowski, Karol and Drettakis, George and Pan, Xingang and Leimk{\"u}hler, Thomas},
journal = {ACM Transactions on Graphics},
year = {2024},
volume = {43},
number = {4}
}


The authors thank Bartosz Wojczyński for providing the Milkyway data, as well as Joachim Weickert and Pascal Peter for early discussions. This research was partially funded by the ERC Advanced Grant FUNGRAPH (https://fungraph.inria.fr), No 788065 and an academic gift from Meta.