ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation
Source: CVPR 2019
Authors: Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez
Implementation: PyTorch. All experiments are done on a single NVIDIA 1080TI GPU with 11 GB memory.
Datasets: SYNTHIA and GTA5 as the source domains, Cityscapes as the target domain.
To our knowledge, we are first to successfully apply entropy based UDA training to obtain competitive performance on semantic segmentation task.
The paper proposes two approaches for entropy minimization, using:
(i) an unsupervised entropy loss
(ii) adversarial training.
1 Motivation
Direct approach: minimize an entropy loss on the target predictions.
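The direct approach can be illustrated with a small sketch. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not the authors' code. The function names, the division by log(C) to scale the entropy into [0, 1], and the averaging over pixels (instead of summing) are assumptions made for this illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixel_entropy(probs, eps=1e-30):
    """Shannon entropy of one pixel's class distribution, scaled to [0, 1].

    Assumption: the scaling by log(C) is chosen here for readability.
    """
    num_classes = len(probs)
    h = -sum(p * math.log(p + eps) for p in probs)
    return h / math.log(num_classes)

def entropy_loss(logit_map):
    """Mean normalized entropy over all pixels of an H x W x C logit map."""
    entropies = [pixel_entropy(softmax(px)) for row in logit_map for px in row]
    return sum(entropies) / len(entropies)

# Toy 1 x 2 "image" with 3 classes: one confident pixel, one uncertain pixel.
confident = [10.0, 0.0, 0.0]
uncertain = [0.0, 0.0, 0.0]
loss = entropy_loss([[confident, uncertain]])
print(loss)  # close to 0.5: about 0 for the confident pixel, 1 for the uncertain one
```

Minimizing this quantity on unlabeled target images pushes every pixel towards a confident prediction, which is the idea behind the unsupervised entropy loss.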
2 Main body
we introduce a unified adversarial training framework which indirectly minimizes the entropy by having target’s entropy distribution similar to the source.
To this end, we formulate the UDA task as minimizing distribution distance between source and target on the weighted self-information space.
Our adversarial approach is motivated by the fact that the trained model naturally produces low-entropy predictions on source-like images. By aligning weighted self-information distributions of target and source domains, we indirectly minimize the entropy of target predictions.
In short, the adversarial approach makes the distribution of target-domain samples as close as possible to that of source-domain samples in the weighted self-information space. The rationale is that the model produces low-entropy predictions for inputs that resemble source-domain samples, so this alignment reduces the entropy indirectly.
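A minimal sketch of the adversarial variant follows, again in plain Python and not taken from the authors' code. It assumes the weighted self-information of a pixel is the per-class vector -p log p (whose sum over classes is the entropy), and that the discriminator is trained with binary cross-entropy using label 1 for source and 0 for target. `toy_discriminator` with its weights `w` and `b` is a hypothetical stand-in for the real network D.

```python
import math

def weighted_self_information(probs, eps=1e-30):
    """Per-class vector I = -p * log(p) for one pixel's softmax output."""
    return [-p * math.log(p + eps) for p in probs]

def bce(d_out, label, eps=1e-12):
    """Binary cross-entropy for a discriminator output d_out in (0, 1)."""
    return -(label * math.log(d_out + eps)
             + (1 - label) * math.log(1 - d_out + eps))

def toy_discriminator(info_vec, w=-4.0, b=2.0):
    """Hypothetical stand-in for D: a logistic unit on the summed vector.

    Returns the estimated probability that the input comes from the source.
    """
    return 1.0 / (1.0 + math.exp(-(w * sum(info_vec) + b)))

SOURCE, TARGET = 1, 0  # assumed label convention

i_src = weighted_self_information([0.97, 0.02, 0.01])  # confident, source-like
i_tgt = weighted_self_information([0.40, 0.35, 0.25])  # uncertain, target-like

# Discriminator step: tell source maps (label 1) from target maps (label 0).
loss_d = (bce(toy_discriminator(i_src), SOURCE)
          + bce(toy_discriminator(i_tgt), TARGET))

# Segmentation step: make target maps look like source to D (flipped label).
loss_adv = bce(toy_discriminator(i_tgt), SOURCE)

print(loss_d, loss_adv)
```

In a real training loop the two losses would be minimized alternately, with `loss_adv` added to the supervised segmentation loss on source images with some weighting factor.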
The paper uses Deeplab-V2 [2] as the base semantic segmentation architecture F.
To better capture the scene context, Atrous Spatial Pyramid Pooling (ASPP) is applied on the last layer’s feature output.
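To make the ASPP idea concrete, here is a 1-D toy sketch in plain Python. It only illustrates the mechanism: the same small kernel is applied at several dilation rates in parallel, and the branch outputs are fused by summation. The rates, the kernel, and the function names are assumptions for illustration; the real module operates on 2-D feature maps with learned kernels and much larger rates.

```python
def dilated_conv1d(signal, kernel, rate):
    """1-D atrous convolution (cross-correlation form, as in deep learning
    libraries) with zero padding; the output has the input's length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

def aspp1d(signal, kernels_by_rate):
    """Sum the outputs of parallel atrous branches, one per dilation rate."""
    branches = [dilated_conv1d(signal, k, r) for r, k in kernels_by_rate.items()]
    return [sum(vals) for vals in zip(*branches)]

signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
box = [1.0, 1.0, 1.0]

print(dilated_conv1d(signal, box, 1))    # [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
print(dilated_conv1d(signal, box, 2))    # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
print(aspp1d(signal, {1: box, 2: box}))  # [0.0, 1.0, 1.0, 2.0, 1.0, 1.0, 0.0]
```

A larger rate widens the receptive field without adding parameters, which is how the parallel branches capture context at several scales.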
We experiment on the two different base deep CNN architectures: VGG-16 and ResNet-101.
The adversarial network D introduced in Section 3.2 has the same architecture as the one used in DCGAN.
Adversarial training for UDA is the most explored approach for semantic segmentation. It involves two networks. One network predicts the segmentation maps for the input image, which could be from source or target domain, while another network acts as a discriminator which takes the feature maps from the segmentation network and tries to predict domain of the input. The segmentation network tries to fool the discriminator, thus making the features from the two domains have a similar distribution.
Some methods build on generative networks to generate target images conditioned on the source. Hoffman et al. [14] propose Cycle-Consistent Adversarial Domain Adaptation (CyCADA), in which they adapt at both pixel-level and feature-level representation. For pixel-level adaptation they use Cycle-GAN [48] to generate target images conditioned on the source images.
2. CL approach
The authors in [50] use generative adversarial networks (GAN) [11] to align the source and target embeddings. Also, they replace the cross-entropy loss by a conservative loss (CL) that penalizes the easy and hard cases of source examples.
The CL approach is orthogonal to most of the UDA methods, including ours: it could benefit any method that uses cross-entropy for source.
[50] Penalizing top performers: Conservative loss for semantic segmentation adaptation. In ECCV, September 2018.
Another approach for UDA is self-training. The idea is to use the prediction from an ensembled model or a previous state of model as pseudo-labels for the unlabeled data to train the current model. Many semi-supervised methods [20, 39] use self-training. In [51], self-training is employed for UDA on semantic segmentation which is further extended with class balancing and spatial prior.
Pseudo-labeling is a simple yet efficient approach for semi-supervised learning [21]. Recently, the approach has been applied to UDA in semantic segmentation task with an iterative self-training (ST) procedure [51].
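The pseudo-labeling step of self-training can be sketched as follows. This is a generic illustration in plain Python, not the procedure of [51]: the fixed confidence `threshold` of 0.9 and the `ignore_index` value of 255 are assumptions, and class balancing and spatial priors are left out.

```python
def pseudo_labels(prob_map, threshold=0.9, ignore_index=255):
    """Turn per-pixel class probabilities into hard labels.

    Pixels whose top probability is below `threshold` get `ignore_index`
    so that they can be excluded from the training loss.
    """
    labels = []
    for row in prob_map:
        labels.append([
            max(range(len(p)), key=p.__getitem__) if max(p) >= threshold
            else ignore_index
            for p in row
        ])
    return labels

# Toy 2 x 2 prediction with 3 classes per pixel.
probs = [[[0.95, 0.03, 0.02], [0.40, 0.35, 0.25]],
         [[0.05, 0.92, 0.03], [0.10, 0.10, 0.80]]]

labels = pseudo_labels(probs)
print(labels)  # [[0, 255], [1, 255]]
```

In an iterative procedure, the model would be retrained on target images using these labels with a standard cross-entropy loss that skips the ignored pixels, and the labels would then be regenerated from the updated model.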
