【arXiv 2021】Cluster Contrast for Unsupervised Person Re-Identification

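Method overview

1. Proposes cluster contrast, which stores feature vectors and computes the contrastive loss at the cluster level.
2. Shows that a cluster-level memory dictionary resolves the inconsistency of cluster feature representations.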

Summary

Paper title: Cluster Contrast for Unsupervised Person Re-Identification
Abbreviation: CCU
Venue: arXiv
Year: 2021
Baseline: SpCL [11] (Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, and Hongsheng Li. Self-paced contrastive learning with hybrid memory for domain adaptive object re-id.)
Backbone: ImageNet-pretrained [5] ResNet-50 [15], with DBSCAN [7] for clustering
Datasets: Market-1501 [52], DukeMTMC-reID [32], MSMT17 [42], PersonX [35], and VeRi-776 [26]

Paper: https://arxiv.org/pdf/2103.11568.pdf
Code: https://github.com/alibaba/cluster-contrast-reid

Contributions (as stated in the paper)

1. We present Cluster Contrast, which stores feature vectors and computes the contrastive loss at the cluster level.
2. We demonstrate that the inconsistency problem for cluster feature representation can be solved by the cluster-level memory dictionary.

Results summary

By straightforwardly applying Cluster Contrast to a standard unsupervised re-ID pipeline, it achieves considerable mAP improvements of 9.5%, 7.5%, and 6.6% over state-of-the-art purely unsupervised re-ID methods, and of 5.1%, 4.0%, and 6.5% over state-of-the-art unsupervised domain adaptation re-ID methods, on the Market, Duke, and MSMT17 datasets respectively.

Method details

Framework

Figure 2: The unsupervised person re-ID pipeline. Feature vectors with the same color belong to the same cluster. The upper part is the memory initialization stage: training data features are assigned pseudo labels by a clustering algorithm. The lower part is the model training stage: hard example selection picks the hard query instance used to update the memory feature. The ClusterNCE loss computes the contrastive loss between query features and all cluster features.
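
To make the memory initialization stage in Figure 2 concrete, here is a minimal pure-Python sketch; it is not the authors' code. It assumes pseudo labels are already available from a clustering step, and that unclustered outliers carry the label -1 and are left out of the memory (a convention common in DBSCAN implementations). The function name `init_cluster_memory` and its arguments are hypothetical.

```python
import random


def init_cluster_memory(features, pseudo_labels, seed=0):
    """Pick one random instance feature per cluster as its memory entry.

    features: list of feature vectors (lists of floats).
    pseudo_labels: one cluster id per feature; -1 marks an unclustered
    outlier (assumed convention, not taken from the paper).
    """
    rng = random.Random(seed)
    members = {}
    for feat, label in zip(features, pseudo_labels):
        if label == -1:  # assumption: outliers are kept out of the memory
            continue
        members.setdefault(label, []).append(feat)
    # One entry per cluster, so the memory size equals the cluster count.
    return {label: list(rng.choice(feats)) for label, feats in members.items()}


# Toy data: two clusters of two instances each, plus one outlier.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]]
pseudo_labels = [0, 0, 1, 1, -1]
memory = init_cluster_memory(features, pseudo_labels)
print(sorted(memory))  # cluster ids that received a memory entry
```

Because the memory holds one vector per cluster rather than one per instance, every cluster is represented by a feature from the same encoder state, which is the consistency property the notes emphasize.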

Figure 3: The comparison to existing memory based non-parametric classification loss.

Algorithm

Implementation

1. Backbone and clustering: an ImageNet-pretrained ResNet-50, with DBSCAN as the clustering method.
2. Memory initialization: within each cluster, one sample feature is picked at random to serve as the cluster feature.
3. Memory update: the sample feature that differs most from the current cluster feature is selected and combined with the current cluster feature by a weighted sum.
4. Loss function: the loss is computed between the query sample and the cluster features.
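
The memory update and loss steps listed above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the authors' implementation: similarity is taken to be the dot product of L2-normalized features, the weighted sum is assumed to have the momentum form `m * c + (1 - m) * q`, the loss is assumed to be a temperature-scaled softmax over all cluster features, and the default values for `momentum` and `temperature` are placeholders. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def hardest_query(queries, cluster_feature):
    """Return the query that is least similar to the current cluster feature."""
    return min(queries, key=lambda q: dot(q, cluster_feature))


def update_memory(cluster_feature, queries, momentum=0.1):
    """Blend the hardest query into the cluster feature (assumed momentum form)."""
    q_hard = hardest_query(queries, cluster_feature)
    mixed = [momentum * c + (1.0 - momentum) * q
             for c, q in zip(cluster_feature, q_hard)]
    return l2_normalize(mixed)  # re-normalizing is an assumption


def cluster_nce_loss(query, memory, positive, temperature=0.05):
    """Negative log-softmax of the query's similarity to its own cluster."""
    logits = {k: dot(query, c) / temperature for k, c in memory.items()}
    max_logit = max(logits.values())  # subtract the max for numerical stability
    log_sum = max_logit + math.log(
        sum(math.exp(v - max_logit) for v in logits.values()))
    return log_sum - logits[positive]


# Toy example: two clusters; both queries are assumed to belong to cluster 0.
memory = {0: [1.0, 0.0], 1: [0.0, 1.0]}
queries = [[0.8, 0.6], [0.6, 0.8]]

loss_right = cluster_nce_loss([1.0, 0.0], memory, positive=0)
loss_wrong = cluster_nce_loss([1.0, 0.0], memory, positive=1)
memory[0] = update_memory(memory[0], queries)
print(loss_right < loss_wrong, memory[0])
```

In the toy example the query `[0.6, 0.8]` is selected as the hardest one because it has the lower dot product with the cluster feature `[1.0, 0.0]`, and the loss is smaller when the query is paired with the cluster it actually resembles.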

Experimental results

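Overall assessment

1. Building on SpCL, the method moves from UDA to USL, so the overall training loss has no source-domain term. The memory is also no longer a hybrid memory; it stores only cluster features.
2. The most novel part is how cluster features are represented in memory: random sampling at initialization and the hardest sample during updates. The paper's discussion section also explains why both mechanisms work.
3. Overall the method is clear and simple. It has only three equations, yet the final experiments show that it works well.
4. That said, the writing does not bring out the paper's contributions as clearly as it could.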

Citation (BibTeX)

@article{DBLP:journals/corr/abs-2103-11568,
author = {Zuozhuo Dai and
Guangyuan Wang and
Siyu Zhu and
Weihao Yuan and
Ping Tan},
title = {Cluster Contrast for Unsupervised Person Re-Identification},
journal = {CoRR},
volume = {abs/2103.11568},
year = {2021}
}

References

[1] Philip Bachman, R Devon Hjelm, and William Buchwalter. Learning representations by maximizing mutual information across views. arXiv preprint arXiv:1906.00910, 2019.
[2] Guangyi Chen, Chunze Lin, Liangliang Ren, Jiwen Lu, and Jie Zhou. Self-critical attention learning for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9637–9646, 2019.
[3] Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, and Ping Tan. Batch dropblock network for person re-identification and beyond. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3691–3701, 2019.
[4] Zuozhuo Dai, Mingqiang Chen, Siyu Zhu, and Ping Tan. Batch feature erasing for person re-identification and beyond. arXiv preprint arXiv:1811.07130, 1(2):3, 2018.
[5] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
[6] Weijian Deng, Liang Zheng, Qixiang Ye, Guoliang Kang, Yi Yang, and Jianbin Jiao. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 994–1003, 2018.
[7] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, volume 96, pages 226–231, 1996.
[8] Hehe Fan, Liang Zheng, Chenggang Yan, and Yi Yang. Unsupervised person re-identification: Clustering and fine-tuning. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 14(4):1–18, 2018.
[9] Yang Fu, Yunchao Wei, Guanshuo Wang, Yuqian Zhou, Honghui Shi, and Thomas S Huang. Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6112–6121, 2019.
[10] Yixiao Ge, Dapeng Chen, and Hongsheng Li. Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification. arXiv preprint arXiv:2001.01526, 2020.
[11] Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, and Hongsheng Li. Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 11309–11321. Curran Associates, Inc., 2020.
[12] Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, Jin-Ge Yao, and Kai Han. Beyond human parts: Dual part-aligned representations for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3642–3651, 2019.
[13] Raia Hadsell, Sumit Chopra, and Yann LeCun. Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), volume 2, pages 1735–1742. IEEE, 2006.
[14] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9729–9738, 2020.
[15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[16] Olivier Henaff. Data-efficient image recognition with contrastive predictive coding. In International Conference on Machine Learning, pages 4182–4192. PMLR, 2020.
[17] Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017.
[18] R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, and Yoshua Bengio. Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670, 2018.
[19] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456. PMLR, 2015.
[20] Minxian Li, Xiatian Zhu, and Shaogang Gong. Unsupervised person re-identification by deep learning tracklet association. In Proceedings of the European Conference on Computer Vision (ECCV), pages 737–753, 2018.
[21] Minxian Li, Xiatian Zhu, and Shaogang Gong. Unsupervised tracklet person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(7):1770–1782, 2019.
[22] Shan Lin, Haoliang Li, Chang-Tsun Li, and Alex Chichung Kot. Multi-task mid-level feature alignment network for unsupervised cross-dataset person re-identification. arXiv preprint arXiv:1807.01440, 2018.
[23] Yutian Lin, Xuanyi Dong, Liang Zheng, Yan Yan, and Yi Yang. A bottom-up clustering approach to unsupervised person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 8738–8745, 2019.
[24] Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, and Qi Tian. Unsupervised person re-identification via softened similarity learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3390–3399, 2020.
[25] Jinxian Liu, Bingbing Ni, Yichao Yan, Peng Zhou, Shuo Cheng, and Jianguo Hu. Pose transferrable person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4099–4108, 2018.
[26] Xinchen Liu, Wu Liu, Huadong Ma, and Huiyuan Fu. Large-scale vehicle re-identification in urban surveillance videos. In 2016 IEEE International Conference on Multimedia and Expo (ICME), pages 1–6. IEEE, 2016.
[27] Hao Luo, Youzhi Gu, Xingyu Liao, Shenqi Lai, and Wei Jiang. Bag of tricks and a strong baseline for deep person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 0–0, 2019.
[28] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.
[29] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
[30] Xingang Pan, Ping Luo, Jianping Shi, and Xiaoou Tang. Two at once: Enhancing learning and generalization capacities via ibn-net. In Proceedings of the European Conference on Computer Vision (ECCV), pages 464–479, 2018.
[31] John Riccitiello. John riccitiello sets out to identify the engine of growth for unity technologies (interview). VentureBeat. Interview with Dean Takahashi. Retrieved January, 18(3), 2015.
[32] Ergys Ristani, Francesco Solera, Roger Zou, Rita Cucchiara, and Carlo Tomasi. Performance measures and a data set for multi-target, multi-camera tracking. In European Conference on Computer Vision, pages 17–35. Springer, 2016.
[33] Florian Schroff, Dmitry Kalenichenko, and James Philbin. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.
[34] Jifei Song, Yongxin Yang, Yi-Zhe Song, Tao Xiang, and Timothy M Hospedales. Generalizable person re-identification by domain-invariant mapping network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 719–728, 2019.
[35] Xiaoxiao Sun and Liang Zheng. Dissecting person re-identification from the viewpoint of viewpoint. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 608–617, 2019.
[36] Yonglong Tian, Dilip Krishnan, and Phillip Isola. Contrastive multiview coding. arXiv preprint arXiv:1906.05849, 2019.
[37] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6924–6932, 2017.
[38] Cheng Wang, Qian Zhang, Chang Huang, Wenyu Liu, and Xinggang Wang. Mancs: A multi-task attentional network with curriculum sampling for person re-identification. In Proceedings of the European Conference on Computer Vision (ECCV), pages 365–381, 2018.
[39] Dongkai Wang and Shiliang Zhang. Unsupervised person re-identification via multi-label classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10981–10990, 2020.
[40] Jingya Wang, Xiatian Zhu, Shaogang Gong, and Wei Li. Transferable joint attribute-identity deep learning for unsupervised person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2275–2284, 2018.
[41] Zhongdao Wang, Jingwei Zhang, Liang Zheng, Yixuan Liu, Yifan Sun, Yali Li, and Shengjin Wang. Cycas: Self-supervised cycle association for learning re-identifiable descriptions. arXiv preprint arXiv:2007.07577, 2020.
[42] Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer gan to bridge domain gap for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 79–88, 2018.
[43] Chao-Yuan Wu, R Manmatha, Alexander J Smola, and Philipp Krahenbuhl. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2840–2848, 2017.
[44] Jinlin Wu, Yang Yang, Hao Liu, Shengcai Liao, Zhen Lei, and Stan Z Li. Unsupervised graph association for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8321–8330, 2019.
[45] Zhirong Wu, Yuanjun Xiong, Stella X Yu, and Dahua Lin. Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3733–3742, 2018.
[46] Tong Xiao, Shuang Li, Bochao Wang, Liang Lin, and Xiaogang Wang. Joint detection and identification feature learning for person search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[47] Tong Xiao, Shuang Li, Bochao Wang, Liang Lin, and Xiaogang Wang. Joint detection and identification feature learning for person search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3415–3424, 2017.
[48] Hong-Xing Yu, Wei-Shi Zheng, Ancong Wu, Xiaowei Guo, Shaogang Gong, and Jian-Huang Lai. Unsupervised person re-identification by soft multilabel learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2148–2157, 2019.
[49] Kaiwei Zeng, Munan Ning, Yaohua Wang, and Yang Guo. Hierarchical clustering with hard-batch triplet loss for person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13657–13665, 2020.
[50] Yunpeng Zhai, Shijian Lu, Qixiang Ye, Xuebo Shan, Jie Chen, Rongrong Ji, and Yonghong Tian. Ad-cluster: Augmented discriminative clustering for domain adaptive person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9021–9030, 2020.
[51] Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, and Zhibo Chen. Densely semantically aligned person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 667–676, 2019.
[52] Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, pages 1116–1124, 2015.
[53] Zhun Zhong, Liang Zheng, Donglin Cao, and Shaozi Li. Re-ranking person re-identification with k-reciprocal encoding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1318–1327, 2017.
[54] Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. Random erasing data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 13001–13008, 2020.
[55] Zhun Zhong, Liang Zheng, Shaozi Li, and Yi Yang. Generalizing a person retrieval model hetero- and homogeneously. In Proceedings of the European Conference on Computer Vision (ECCV), pages 172–188, 2018.
[56] Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, and Yi Yang. Invariance matters: Exemplar memory for domain adaptive person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 598–607, 2019.
[57] Sanping Zhou, Fei Wang, Zeyi Huang, and Jinjun Wang. Discriminative feature learning with consistent attention regularization for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8040–8049, 2019.
[58] Chengxu Zhuang, Alex Lin Zhai, and Daniel Yamins. Local aggregation for unsupervised learning of visual embeddings. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6002–6012, 2019.
