『Paper Notes』A Benchmark on Tricks for Large-scale Image Retrieval

2023-10-26 23:34:10

2. Pre-processing Tricks

2.1. Dataset Cleaning for Training

TR1

We noticed by visual inspection that the training set of GLD v1 is clean and reliable, so we used the dataset as is for training. To obtain a semi-supervised learning effect [4, 23], we added virtual classes to the training set. These virtual classes are the clusters from the test and index sets of GLD v1. First, we trained a baseline model using GLD v1 and extracted features of the test and index sets of GLD v1. Then, DBSCAN was applied to generate clusters, where each cluster was assigned as a new virtual class. We call the result TR1 for clarity.
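To make the "cluster, then treat each cluster as a virtual class" step concrete, here is a minimal pure-Python sketch. It is not the authors' code: the 2-D toy features, `eps`, `min_samples`, the helper name `add_virtual_classes`, and the choice to leave DBSCAN noise points unlabeled are all assumptions made for illustration. In practice the features would come from the baseline model and a library DBSCAN would likely be used.

```python
import math

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    labels = [None] * n  # None = not visited yet

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

def add_virtual_classes(cluster_labels, num_labeled_classes):
    """Hypothetical helper: give each cluster a class id after the labeled ones.

    Noise points (-1) get None, i.e. they are not used for training (assumption).
    """
    return [None if c == -1 else num_labeled_classes + c for c in cluster_labels]

# Toy "features" of unlabeled test/index images (assumed values).
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
            (9.0, 0.0)]
cluster_labels = dbscan(features, eps=0.3, min_samples=3)
virtual_labels = add_virtual_classes(cluster_labels, num_labeled_classes=100)
print(cluster_labels)
print(virtual_labels)
```

The two tight groups become two virtual classes appended after the (assumed) 100 labeled classes, and the isolated point is treated as noise.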

TR2

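The index set of GLD v2 contains many non-landmark images. To train the model to better distinguish landmarks from non-landmarks, the authors designed TR2.

As with TR1, features are clustered after training a model; the authors then "picked several distractor clusters as virtual classes". These distractor classes are merged with the virtual classes of TR1 to form TR2.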

TR3

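The training set of GLD v2 has more classes and images than GLD v1 does, but it is also noisier. A binary classifier is first used to remove natural-scene images. The images within each class are then clustered, and when a class contains multiple clusters, only the largest cluster is kept.

(Were classes that overlap with TR2 also excluded?)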
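The "keep only the largest cluster per class" rule for TR3 can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the tuple layout, the function name `keep_largest_cluster`, the example class names, and the decision to drop images marked as clustering noise (`-1`) are all hypothetical.

```python
from collections import Counter, defaultdict

def keep_largest_cluster(samples):
    """Keep images that fall in the largest cluster of their class.

    samples: list of (image_id, class_id, cluster_id), where cluster_id == -1
    marks clustering noise. Noise images are dropped (assumption).
    """
    sizes = defaultdict(Counter)
    for _, class_id, cluster_id in samples:
        if cluster_id != -1:
            sizes[class_id][cluster_id] += 1
    # For every class, remember the id of its most populated cluster.
    largest = {c: counts.most_common(1)[0][0] for c, counts in sizes.items()}
    return [img for img, c, k in samples if largest.get(c) == k]

# Toy per-class clustering result (assumed values).
samples = [
    ("a1", "eiffel", 0), ("a2", "eiffel", 0),  # main cluster of the class
    ("a3", "eiffel", 1),                       # smaller cluster, removed
    ("a4", "eiffel", -1),                      # noise, removed
    ("b1", "louvre", 0),                       # single cluster, kept
]
kept = keep_largest_cluster(samples)
print(kept)
```

A class whose images are all noise disappears entirely under this sketch; whether the paper handles that case the same way is not stated in these notes.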

2.2. Small-scale Validation

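About 2% of the training set is sampled to form the test set and the index set of the validation data. A virtual class from a noise cluster is also included ("We included a virtual class from a noise cluster"): because GLD v2 itself is noisy, this makes the validation distribution match the dataset more closely.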
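A rough sketch of carving out such a small validation split is shown below. Only the 2% figure comes from the notes. The per-image random sampling, the `query_share` ratio between test and index images, the fixed seed, and the function name are assumptions, and the extra noise-cluster class is not modeled here.

```python
import random

def sample_validation(image_ids, fraction=0.02, query_share=0.5, seed=0):
    """Sample a fraction of images and split them into test (query) and index sets.

    query_share is a hypothetical parameter: the notes do not say how the
    sampled images are divided between the two sets.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    k = max(1, round(len(image_ids) * fraction))
    picked = rng.sample(image_ids, k)  # sampling without replacement
    n_query = max(1, round(k * query_share))
    return picked[:n_query], picked[n_query:]

test_ids, index_ids = sample_validation(list(range(1000)))
print(len(test_ids), len(index_ids))
```

A real retrieval validation set would also need every test image to have at least one relevant image in the index set, so sampling per class rather than per image may be the safer design.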

2.3. Experimental Results

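The authors' aim is to filter out partial-view photos of landmarks, such as those in the paper's example figure.

Intra-class problem of the raw dataset: the raw dataset includes images taken from inside, outside, and even partial viewpoints from within the same landmark. These kinds of datasets with large intra-class variation may interfere with learning proper representations in the model, especially when a pair-wise ranking loss is used.

Inter-class problem of the raw dataset: images of nature scenes also make the training process hard as they have little inter-class variation.

GLD v2 is too noisy to be used directly for training. "Valid" is the authors' own validation set; the other two metrics are presumably the scores returned by the dataset's website after the authors submitted their model.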

Training with TR1, which contains virtual classes from the test and index sets, improves the model performance by using unlabeled data when the original training set is not helpful anymore.

The model trained with TR2 gives performance similar to the model trained with TR1 because the number of data and classes is not noticeably different.

Using TR3, which includes a cleaned training set from GLD v2, further improved the performance.

3. Learning Representations

3.1. Pooling

3.2. Objectives

Xent + Triplet

Triplet loss combined with a classification loss such as cross-entropy (Xent) loss.
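As a reminder of what the combined objective looks like, here is a minimal pure-Python sketch for a single sample. The Euclidean-distance hinge form of the triplet loss, the `margin` and `weight` values, and the simple weighted sum are assumptions; the notes do not record how the paper weights the two terms.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge on Euclidean distances: max(0, d(a, p) - d(a, n) + margin)."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)

def combined_loss(logits, target, anchor, positive, negative,
                  weight=1.0, margin=0.1):
    """Xent + weight * Triplet (the weighting scheme is an assumption)."""
    return (cross_entropy(logits, target)
            + weight * triplet_loss(anchor, positive, negative, margin))

# Positive is close and negative is far: the triplet term vanishes.
easy = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0), margin=0.1)
# Positive is far and negative is close: the triplet term is active.
hard = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0), margin=0.5)
total = combined_loss([0.0, 0.0], 0, (0.0, 0.0), (3.0, 4.0), (0.0, 1.0),
                      weight=1.0, margin=0.5)
print(easy, hard, total)
```

The classification term pulls embeddings toward their class, while the triplet term only contributes when a negative is not yet farther from the anchor than the positive by at least the margin.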

N-pair + Angular

See: https://blog.csdn.net/update7/article/details/112391276
