Abstract

Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class. This challenging task requires understanding diverse levels of visual cues and analyzing fine-grained correspondence relations between the query and the support images. To address the problem, we propose Hypercorrelation Squeeze Networks (HSNet), which leverage multi-level feature correlation and efficient 4D convolutions. The network extracts diverse features from different levels of intermediate convolutional layers and constructs a collection of 4D correlation tensors, i.e., hypercorrelations. Using efficient center-pivot 4D convolutions in a pyramidal architecture, the method gradually squeezes high-level semantic and low-level geometric cues of the hypercorrelation into precise segmentation masks in a coarse-to-fine manner. Significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-5i, COCO-20i, and FSS-1000 verify the efficacy of the proposed method.
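
As a rough illustration of the hypercorrelation construction described above, the sketch below (in PyTorch) builds one 4D cosine-similarity tensor per intermediate backbone layer after masking the support features with the support mask. Function and variable names are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def build_hypercorrelation(query_feats, support_feats, support_mask):
    """Sketch: one 4D cosine-similarity tensor per intermediate backbone layer.
    query_feats/support_feats: lists of (B, C, H, W) feature maps.
    support_mask: (B, 1, H, W) float mask of the target object in the support image."""
    corrs = []
    for q_feat, s_feat in zip(query_feats, support_feats):
        # Mask support features with the (resized) support mask to suppress background.
        mask = F.interpolate(support_mask, size=s_feat.shape[-2:],
                             mode='bilinear', align_corners=True)
        s_feat = s_feat * mask

        bsz, ch, hq, wq = q_feat.shape
        _, _, hs, ws = s_feat.shape

        # L2-normalize channels so the dot product becomes cosine similarity.
        q = F.normalize(q_feat.view(bsz, ch, -1), dim=1)   # (B, C, Hq*Wq)
        s = F.normalize(s_feat.view(bsz, ch, -1), dim=1)   # (B, C, Hs*Ws)

        corr = torch.bmm(q.transpose(1, 2), s)             # (B, Hq*Wq, Hs*Ws)
        corr = corr.clamp(min=0)                           # discard negative correlations
        corrs.append(corr.view(bsz, hq, wq, hs, ws))       # 4D correlation tensor
    return corrs
```

Grouping the resulting tensors by spatial resolution yields the pyramidal hypercorrelation that the 4D-convolutional encoder then squeezes into a segmentation mask.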

Overall architecture

Figure 1. Overall architecture of the proposed network, which consists of three main parts: hypercorrelation construction, a 4D-convolutional pyramid encoder, and a 2D-convolutional context decoder.

Proposed center-pivot 4D convolution

Figure 2. 4D convolution (left) and the weights of a standard 4D kernel [55, 77] (middle) and the proposed center-pivot 4D kernel (right). Each black wire connecting two different pixel locations represents a single weight of the 4D kernel. The kernel size used in this example is (3, 3, 3, 3).
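
The idea conveyed by the figure is that the center-pivot kernel keeps only those 4D weights whose query-side or support-side offset sits at the kernel center, so a k x k x k x k kernel effectively decomposes into two 2D kernels. The sketch below (PyTorch; class and variable names are illustrative, stride handling omitted) shows one way such a layer could be realized with two nn.Conv2d modules whose outputs are summed; it is a sketch of the concept, not the authors' exact module.

```python
import torch
import torch.nn as nn

class CenterPivot4dConv(nn.Module):
    """Sketch of a center-pivot 4D convolution on a correlation tensor of
    shape (B, C, Hq, Wq, Hs, Ws): one 2D kernel acts on the query dimensions
    and one on the support dimensions, pivoted on the kernel center."""
    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.conv_query = nn.Conv2d(in_ch, out_ch, k, padding=padding)    # over (Hq, Wq)
        self.conv_support = nn.Conv2d(in_ch, out_ch, k, padding=padding)  # over (Hs, Ws)

    def forward(self, corr):
        b, c, hq, wq, hs, ws = corr.shape

        # Branch 1: fold the support dims into the batch, convolve over query dims.
        x1 = corr.permute(0, 4, 5, 1, 2, 3).reshape(b * hs * ws, c, hq, wq)
        x1 = self.conv_query(x1)
        x1 = x1.view(b, hs, ws, -1, hq, wq).permute(0, 3, 4, 5, 1, 2)

        # Branch 2: fold the query dims into the batch, convolve over support dims.
        x2 = corr.permute(0, 2, 3, 1, 4, 5).reshape(b * hq * wq, c, hs, ws)
        x2 = self.conv_support(x2)
        x2 = x2.view(b, hq, wq, -1, hs, ws).permute(0, 3, 1, 2, 4, 5)

        # Summing the two branches corresponds to the sparse center-pivot 4D kernel.
        return x1 + x2
```

Compared with a full 4D kernel, this reduces the number of weights from k^4 to 2k^2 and lets the 4D convolution be computed with ordinary 2D convolutions over folded batch dimensions.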

Experimental results

1. Results on the PASCAL-5i dataset.

Table 1. Performance on PASCAL-5i in mIoU and FB-IoU. Some results are from [4, 35, 67, 71, 76]. Superscript † denotes our model without support feature masking (Eqn. 1). Numbers in bold indicate the best performance and underlined ones are the second best.

2. Results on the COCO-20i and FSS-1000 datasets.

Table 2. Performance on COCO-20i (left) and FSS-1000 (right) in mIoU and FB-IoU. Some results are from [2, 4, 35, 67, 71, 76].

3. Qualitative results

Figure 4. Qualitative (1-shot) results in the presence of large differences in object scale, including extremely small objects.

Acknowledgements

This work was supported by Samsung Advanced Institute of Technology (SAIT), the NRF grant (NRF-2017R1E1A1A01077999), and the IITP grant (No. 2019-0-01906, AI Graduate School Program - POSTECH), funded by the Ministry of Science and ICT, Korea.

Papers

Hypercorrelation Squeeze for Few-Shot Segmentation
Juhong Min, Dahyun Kang, and Minsu Cho
ICCV, 2021
[arXiv] [Bibtex]

Code

Check out our GitHub repository: [github]