Abstract

This paper studies domain generalization via domain-invariant representation learning. Existing methods in this direction assume that a domain can be characterized by the styles of its images, and train a network on style-augmented data so that it is not biased toward particular style distributions. However, these methods are restricted to a finite set of styles, since they obtain styles for augmentation from a fixed set of external images or by interpolating those of the training data. To address this limitation and maximize the benefit of style augmentation, we propose a new method that constantly synthesizes novel styles during training. Our method manages multiple queues that store the styles observed so far, and synthesizes novel styles whose distribution is distinct from that of the styles in the queues. The style synthesis process is formulated as a monotone submodular optimization and thus can be conducted efficiently by a greedy algorithm. Extensive experiments on four public benchmarks demonstrate that the proposed method achieves state-of-the-art domain generalization performance.
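For context, methods in this line (e.g., AdaIN and MixStyle) commonly represent the style of an image by the channel-wise mean and standard deviation of its intermediate feature maps. Below is a minimal PyTorch-style sketch of extracting such a style vector; the function name and the exact statistics are assumptions for illustration, not the paper's implementation.

```python
import torch

def extract_style(feat, eps=1e-6):
    """Sketch: style vector as channel-wise feature statistics.

    Assumes the common convention (as in AdaIN / MixStyle) that a style
    is the channel-wise mean and standard deviation of a feature map.
    feat: tensor of shape (B, C, H, W).
    Returns: tensor of shape (B, 2 * C) -> [mean, std] per channel.
    """
    mu = feat.mean(dim=(2, 3))                   # (B, C) channel-wise means
    sigma = (feat.var(dim=(2, 3)) + eps).sqrt()  # (B, C) channel-wise stds
    return torch.cat([mu, sigma], dim=1)         # (B, 2C) style vector
```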

Figure 1. The motivation of our method. We improve the generalization ability of a model by adaptively synthesizing novel styles that are distinct not only from the source domain styles but also from those synthesized previously, and injecting them into the training data to learn style-invariant representations.

Overall pipeline of Style Neophile

Figure 2. (1) At each training iteration, source styles are computed from the source domain images by the network and enqueued into the source style queues (when the number of stored styles exceeds the limit, the oldest styles are dequeued). (2) Source style prototypes that well represent the style distributions of the source style queues are selected. (3) Candidate novel styles are generated by jittering the source styles with random noise. (4) Novel styles that are not well represented by either the source style prototypes or the previously synthesized novel styles are selected. (5) The selected novel styles are enqueued into the novel style queues. Steps (2)-(5) are executed every predefined number of iterations to constantly seek novel styles.
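A minimal, self-contained sketch of this loop is given below. It treats styles as plain vectors and uses a Gaussian-kernel, facility-location-style objective as one plausible instantiation of the monotone submodular selection; the class and function names (StyleQueue, greedy_select) and all hyperparameters are illustrative, not the authors' implementation.

```python
import torch
from collections import deque

class StyleQueue:
    """Fixed-capacity FIFO queue of style vectors (illustrative helper)."""
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)   # oldest styles are dropped automatically

    def enqueue(self, styles):              # styles: (N, D) tensor
        for s in styles:
            self.buf.append(s)

    def tensor(self):
        return torch.stack(list(self.buf))  # (len(buf), D)

def greedy_select(candidates, k, references=None, sigma=1.0):
    """Greedily pick k candidates that are poorly covered by `references`.

    One plausible facility-location-style reading of the paper's monotone
    submodular objective: each step adds the candidate that most increases
    Gaussian-kernel coverage of the candidate set beyond what the references
    (and previously picked candidates) already cover.
    """
    kern = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    sim = kern(candidates, candidates)                        # (N, N)
    covered = (kern(candidates, references).max(dim=1).values
               if references is not None
               else torch.zeros(candidates.size(0)))
    picked = []
    for _ in range(k):
        gains = (sim - covered[:, None]).clamp(min=0).sum(dim=0)  # marginal gains
        if picked:
            gains[picked] = float('-inf')                          # do not re-pick
        j = int(torch.argmax(gains))
        picked.append(j)
        covered = torch.maximum(covered, sim[:, j])
    return candidates[picked]

# Usage sketch (names are hypothetical):
# cand = source_styles + noise_std * torch.randn_like(source_styles)  # step (3): jitter
# refs = torch.cat([source_prototypes, novel_queue.tensor()])         # prototypes + old novel styles
# new_novel = greedy_select(cand, k=8, references=refs)               # step (4)
# novel_queue.enqueue(new_novel)                                      # step (5)
```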

Experimental results

1. Performance comparison for image classification

Table 1. Leave-one-domain-out generalization results for image classification in top-1 accuracy (%) on each dataset. (a) Evaluation on PACS. (b) Evaluation on OfficeHome. (c) Evaluation on DomainNet. The best results are in bold.

2. Performance comparison for instance retrieval

Table 2. Generalization results on cross-domain person re-ID in mean average precision (mAP) and ranking accuracy. Results are evaluated on the Market1501 and DukeMTMC-reID (Duke) datasets.

3. Ablation studies of our method

Table 3. Ablation studies on each component of our method on PACS with ResNet18. * denotes both s2o and o2s.

4. Analysis on the diversity of novel styles

Figure 3. Empirical analysis of the diversity of styles synthesized by MixStyle and by our method, with ResNet18 on PACS. (a) Channel-wise deviations of synthetic styles. (b) Squared MMD between source and synthetic styles. (c) t-SNE visualization of source and synthetic styles.
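For reference, the squared MMD in (b) can be estimated between two sets of style vectors as sketched below. The RBF kernel and its bandwidth are assumptions, since the caption does not specify them.

```python
import torch

def squared_mmd(x, y, sigma=1.0):
    """Biased estimate of squared MMD between two style sets with an RBF kernel.

    x: (N, D) source styles, y: (M, D) synthetic styles.
    Kernel choice and bandwidth are illustrative.
    """
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```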

5. Qualitative results

Figure 4. Qualitative analysis of novel styles. (a) t-SNE visualization of source and novel styles, computed from the feature maps of the 1st and 2nd residual blocks of ResNet18 while it is trained on PACS. (b) Examples of PACS images corresponding to the source style prototypes. (c) Examples of ImageNet images whose styles are closest to the source style prototypes of (b) in the style space. (d) Examples of ImageNet images whose styles are closest to the novel styles in the style space.
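The retrieval in (c) and (d) amounts to a nearest-neighbor search in the style space. A hypothetical sketch follows; the function and variable names are illustrative and the distance is assumed to be Euclidean.

```python
import torch

def nearest_images_by_style(query_styles, gallery_styles, gallery_paths, topk=1):
    """Retrieve gallery images whose styles are closest to each query style.

    query_styles: (Q, D) style vectors (e.g., prototypes or novel styles).
    gallery_styles: (G, D) styles pre-extracted from an external image set.
    gallery_paths: list of G image identifiers.  All names are illustrative.
    """
    dist = torch.cdist(query_styles, gallery_styles)          # (Q, G) pairwise distances
    idx = dist.topk(topk, dim=1, largest=False).indices       # nearest neighbors per query
    return [[gallery_paths[j] for j in row.tolist()] for row in idx]
```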

Acknowledgements

This work was supported by the Samsung Research Funding & Incubation Center of Samsung Electronics under Project Number SRFC-IT1801-05 and by Samsung Electronics Co., Ltd. (IO201210-07948-01).

Paper

Style Neophile: Constantly Seeking Novel Styles for Domain Generalization
Juwon Kang, Sohyun Lee, Namyup Kim, and Suha Kwak
CVPR, 2022
[Paper] [Bibtex]