Abstract
Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F : Y → X and introduce a cycle consistency loss to enforce F(G(X)) ≈ X (and vice versa).
1. Introduction
What this paper does:
In this paper, we present a method that can learn to do the same: capturing special characteristics of one image collection and figuring out how these characteristics could be translated into the other image collection, all in the absence of any paired training examples.
- Covers the drawbacks of obtaining paired training data and the advantage of this paper's algorithm: we therefore seek an algorithm that can learn to translate between domains without paired input-output examples.
- Drawbacks of unpaired methods: even when the optimal G translates the domain X to a domain Ŷ distributed identically to Y,
(1) such a translation does not guarantee that an individual input x and output y are paired up in a meaningful way, since infinitely many mappings G can induce the same distribution over ŷ;
(2) it is difficult to optimize the adversarial objective in isolation; standard procedures often lead to mode collapse.
- Solution: add more structure to the objective.
If we have a translator G : X → Y and another translator F : Y → X, then G and F should be inverses of each other, and both mappings should be bijections. The paper adds a cycle consistency loss [64] that encourages F(G(x)) ≈ x and G(F(y)) ≈ y.
2. Related work
(1) GAN: We adopt an adversarial loss to learn the mapping such that the translated images cannot be distinguished from images in the target domain.
(2) Image-to-Image Translation: Our approach builds on the "pix2pix" framework of Isola et al. [22], which uses a conditional generative adversarial network [16] to learn a mapping from input to output images. The difference from these earlier methods is that we learn the mapping without paired training examples (a point the paper stresses repeatedly).
(3) Unpaired Image-to-Image Translation: Several prior methods also tackle the unpaired setting, and they also use adversarial networks, with additional terms to enforce the output to be close to the input in a predefined metric space. By contrast, our formulation does not rely on any task-specific, predefined similarity function between the input and output, nor do we assume that the input and output have to lie in the same low-dimensional embedding space.
(4) Cycle Consistency: In this work, we introduce a similar loss to push G and F to be consistent with each other.
(5) Neural Style Transfer
3. Formulation
In addition, we introduce two adversarial discriminators DX and DY , where DX aims to distinguish between images {x} and translated images {F(y)}; in the same way, DY aims to discriminate between {y} and {G(x)}. Our objective contains two types of terms: adversarial losses [16] for matching the distribution of generated images to the data distribution in the target domain; and cycle consistency losses to prevent the learned mappings G and F from contradicting each other.
3.1. Adversarial Loss
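The adversarial term for the mapping G : X → Y and its discriminator DY is the standard GAN objective, L_GAN(G, DY, X, Y) = E_y[log DY(y)] + E_x[log(1 − DY(G(x)))]; a symmetric term is applied to F and DX. A minimal numpy sketch of this objective, estimated from discriminator scores over a batch (illustrative only; in practice the paper swaps the negative log likelihood for a least-squares loss for training stability):

```python
import numpy as np

def gan_loss(d_real, d_fake, eps=1e-8):
    """Batch estimate of L_GAN(G, D_Y, X, Y):
    mean(log D_Y(y)) + mean(log(1 - D_Y(G(x)))).

    d_real: D_Y's scores on real samples y, each in (0, 1).
    d_fake: D_Y's scores on translated samples G(x), each in (0, 1).
    D_Y maximizes this value; G minimizes it (i.e. tries to make
    d_fake look like d_real). eps guards against log(0)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# A discriminator that is completely fooled outputs 0.5 everywhere,
# giving log(0.5) + log(0.5) ≈ -1.386; a perfect one drives the value
# toward 0 from below.
```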
3.2. Cycle Consistency Loss
Adversarial losses alone cannot guarantee that the learned function can map an individual input x_i to a desired output y_i.
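The cycle consistency loss the paper uses is an L1 reconstruction penalty in both directions, L_cyc(G, F) = E_x[||F(G(x)) − x||₁] + E_y[||G(F(y)) − y||₁]. A toy numpy sketch, with G and F standing in as arbitrary callables rather than the paper's networks:

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L_cyc(G, F) = E_x[||F(G(x)) - x||_1] + E_y[||G(F(y)) - y||_1].
    Penalizes round trips that fail to reconstruct the original input."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    forward = np.mean(np.abs(F(G(x)) - x))   # forward cycle: x -> G(x) -> F(G(x)) ~ x
    backward = np.mean(np.abs(G(F(y)) - y))  # backward cycle: y -> F(y) -> G(F(y)) ~ y
    return forward + backward

# Toy check: when G and F are exact inverses, the loss is zero.
G = lambda a: a + 1.0
F = lambda a: a - 1.0
print(cycle_consistency_loss(np.zeros(4), np.ones(4), G, F))  # → 0.0
```

This is the extra structure that pins down which individual x maps to which y: a G that merely matches the target distribution but scrambles inputs cannot be inverted by any single F, so it pays a large cycle penalty.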
3.3. Full Objective
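The full objective simply sums the three terms, with a weight λ on cycle consistency: L(G, F, DX, DY) = L_GAN(G, DY, X, Y) + L_GAN(F, DX, Y, X) + λ·L_cyc(G, F). A sketch of the combination, where the arguments stand in for batch estimates of each expectation (the paper uses λ = 10 in its experiments):

```python
def full_objective(l_gan_G, l_gan_F, l_cyc, lam=10.0):
    """L(G, F, D_X, D_Y) = L_GAN(G, D_Y) + L_GAN(F, D_X) + lam * L_cyc(G, F).

    lam controls the relative importance of cycle consistency versus
    the two adversarial terms. Training solves the minimax problem
    G*, F* = argmin_{G,F} max_{D_X,D_Y} L(G, F, D_X, D_Y)."""
    return l_gan_G + l_gan_F + lam * l_cyc

# e.g. two adversarial terms of 1.0 and 2.0 plus a cycle term of 0.5:
print(full_objective(1.0, 2.0, 0.5))  # → 8.0
```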