Embedding Transfer with Label Relaxation for Improved Metric Learning

Abstract

This paper presents a novel method for embedding transfer, the task of transferring the knowledge of a learned embedding model to another. Our method exploits pairwise similarities between samples in the source embedding space as the knowledge, and transfers them through a loss used for learning target embedding models. To this end, we design a new loss called relaxed contrastive loss, which employs the pairwise similarities as relaxed labels for inter-sample relations. Our loss provides a rich supervisory signal beyond class equivalence, enables more important pairs to contribute more to training, and imposes no restriction on manifolds of target embedding spaces. Experiments on metric learning benchmarks demonstrate that our method substantially improves performance, or effectively reduces the sizes and output dimensions of target models. We further show that it can also be used to enhance the quality of self-supervised representations and the performance of classification models. In all the experiments, our method clearly outperforms existing embedding transfer techniques.

Relaxed Contrastive Loss

Figure 1. Relaxed contrastive loss exploits pairwise similarities between samples in the source embedding space as relaxed labels, and transfers them through a contrastive loss used for learning target embedding models.
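
The idea in Figure 1 can be illustrated with a minimal sketch. It is not the authors' implementation: this page does not state the exact formula, so the Gaussian kernel used to turn source distances into relaxed labels, the parameter names sigma and margin, and the function names are all assumptions made for illustration. Pairs that are close in the source space get a weight near 1 and are pulled together in the target space; pairs that are far apart get a weight near 0 and are pushed apart up to a margin.

```python
import math


def pairwise_sq_dists(embs):
    """Squared Euclidean distance between every pair of embedding vectors."""
    n = len(embs)
    return [
        [sum((a - b) ** 2 for a, b in zip(embs[i], embs[j])) for j in range(n)]
        for i in range(n)
    ]


def relaxed_contrastive_loss(source, target, sigma=1.0, margin=1.0):
    """Sketch of a contrastive loss with relaxed (soft) pair labels.

    source, target: lists of embedding vectors for the same n samples.
    sigma, margin: hypothetical hyperparameters (assumed, not from the page).
    """
    n = len(source)
    src_d2 = pairwise_sq_dists(source)
    tgt_d2 = pairwise_sq_dists(target)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Relaxed label in (0, 1]: source-space similarity (assumed Gaussian kernel).
            w = math.exp(-src_d2[i][j] / sigma)
            d = math.sqrt(tgt_d2[i][j])
            attract = w * d ** 2                             # pull similar pairs together
            repel = (1.0 - w) * max(margin - d, 0.0) ** 2    # push dissimilar pairs apart
            total += attract + repel
    return total / n


source = [[0.0, 0.0], [0.0, 0.1], [3.0, 0.0], [3.0, 0.1]]
target_good = [list(v) for v in source]                       # preserves source neighborhoods
target_bad = [[0.0, 0.0], [3.0, 0.0], [0.0, 0.1], [3.0, 0.1]]  # breaks them
print(relaxed_contrastive_loss(source, target_good))
print(relaxed_contrastive_loss(source, target_bad))
```

In this toy example the target that preserves the source neighborhoods receives a much smaller loss than the one that separates source neighbors. The sketch uses raw target distances and omits any per-anchor distance normalization a full method might apply.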

Comparison with the Existing Embedding Transfer Methods

Figure 2. Accuracy in Recall@1 on the three standard benchmarks for deep metric learning. All embedding transfer methods adopt Proxy-Anchor (PA) with 512 dimensions as the source model. Our method achieves the state of the art when the embedding dimension is 512, and is competitive with recent metric learning models even with a substantially smaller embedding dimension. In all experiments, it is superior to other embedding transfer techniques.
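
For readers unfamiliar with the metric, the following is a small sketch of how Recall@K is commonly computed for image retrieval: a query counts as a hit if at least one of its K nearest neighbors (excluding itself) shares its class label. The function name and the use of Euclidean distance are assumptions for illustration; the evaluation protocol behind Figure 2 may differ in details.

```python
def recall_at_k(embeddings, labels, k=1):
    """Fraction of queries whose k nearest neighbors contain a same-class sample."""
    n = len(embeddings)
    hits = 0
    for i in range(n):
        # Squared Euclidean distance from query i to every other sample.
        dists = [
            (sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j])), j)
            for j in range(n)
            if j != i
        ]
        dists.sort()
        if any(labels[j] == labels[i] for _, j in dists[:k]):
            hits += 1
    return hits / n


embeddings = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
print(recall_at_k(embeddings, [0, 0, 1, 1], k=1))  # neighbors share labels
print(recall_at_k(embeddings, [0, 1, 0, 1], k=1))  # neighbors have different labels
```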

Quantitative Results

1. Comparisons with embedding transfer techniques

Table 1. Image retrieval performance of embedding transfer and knowledge distillation methods in three different settings: (a) self-transfer, (b) dimensionality reduction, and (c) model compression. Embedding networks of the methods are denoted by abbreviations: BN–Inception with BatchNorm, R50–ResNet50, R18–ResNet18. Superscripts indicate embedding dimensions of the networks.

2. Comparisons with SOTA metric learning methods

Table 2. Image retrieval performance of the proposed method and the state-of-the-art metric learning models. Embedding networks of all methods are fixed to Inception with BatchNorm (BN) for fair comparison, and superscripts indicate embedding dimensions.

3. Improving the quality of self-supervised and supervised models

Table 3. (a) Performance of linear classifiers trained on representations obtained by embedding transfer techniques incorporated with self-supervised learning frameworks. (b) Test accuracy of target models on the CIFAR100 dataset.

Qualitative Results

Figure 3. Top 5 image retrievals of the Proxy-Anchor loss (PA) before and after the proposed method is applied. (a) CUB-200-2011. (b) Cars-196. (c) SOP. Images with green borders are success cases and those with red borders are false positives. More qualitative results can be found in the supplementary material.

Sungyeon Kim¹	Dongwon Kim¹	Minsu Cho¹,²	Suha Kwak¹,²
¹Department of CSE, POSTECH		²Graduate School of AI, POSTECH