|
[1]M. Gou, S. Karanam, W. Liu, O. Camps, and R. J. Radke, "DukeMTMC4ReID: A large-scale multi-camera person re-identification dataset," in Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2017, pp. 10-19. [2]L. Zheng et al., "Mars: A video benchmark for large-scale person re-identification," in European conference on computer vision, 2016, pp. 868-884: Springer. [3]L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, and Q. Tian, "Scalable person re-identification: A benchmark," in Proceedings of the IEEE international conference on computer vision, 2015, pp. 1116-1124. [4]W. Li, R. Zhao, T. Xiao, and X. Wang, "Deepreid: Deep filter pairing neural network for person re-identification," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 152-159. [5]L. Wei, S. Zhang, W. Gao, and Q. Tian, "Person transfer gan to bridge domain gap for person re-identification," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 79-88. [6]D. Fu et al., "Unsupervised pre-training for person re-identification," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 14750-14759. [7]T. He, X. Jin, X. Shen, J. Huang, Z. Chen, and X.-S. Hua, "Dense interaction learning for video-based person re-identification," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 1490-1501. [8]P. Pathak, A. E. Eshratifar, and M. Gormish, "Video person re-id: Fantastic techniques and where to find them (student abstract)," in Proceedings of the AAAI Conference on Artificial Intelligence, 2020, vol. 34, no. 10, pp. 13893-13894. [9]H. Zhao et al., "Spindle net: Person re-identification with human body region guided feature decomposition and fusion," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1077-1085. [10]H. Liu et al., "Video-based person re-identification with accumulative motion context," vol. 28, no. 10, pp. 2788-2802, 2017. [11]X. Qian et al., "Pose-normalized image generation for person re-identification," in Proceedings of the European conference on computer vision (ECCV), 2018, pp. 650-667. [12]H. Zhang et al., "Dino: Detr with improved denoising anchor boxes for end-to-end object detection," 2022. [13]Z. Liu et al., "Swin transformer v2: Scaling up capacity and resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12009-12019. [14]L. Yuan et al., "Florence: A new foundation model for computer vision," 2021. [15]T.-Y. Lin et al., "Microsoft coco: Common objects in context," in European conference on computer vision, 2014, pp. 740-755: Springer. [16]G. Ghiasi et al., "Simple copy-paste is a strong data augmentation method for instance segmentation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 2918-2928. [17]M. Maaz, H. Rasheed, S. Khan, F. S. Khan, R. M. Anwer, and M.-H. J. a. p. a. Yang, "Class-agnostic Object Detection with Multi-modal Transformer," 2021. [18]Y. Zhu, C. Zhao, J. Wang, X. Zhao, Y. Wu, and H. Lu, "Couplenet: Coupling global structure with local parts for object detection," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 4126-4134. [19]M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. J. I. j. o. c. v. Zisserman, "The pascal visual object classes (voc) challenge," vol. 88, no. 2, pp. 303-338, 2010. [20]R. Girshick, "Fast r-cnn," in Proceedings of the IEEE international conference on computer vision, 2015, pp. 1440-1448. [21]S. Ren, K. He, R. Girshick, and J. J. A. i. n. i. p. s. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," vol. 28, 2015. [22]K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask r-cnn," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961-2969. [23]A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. J. a. p. a. Liao, "Yolov4: Optimal speed and accuracy of object detection," 2020. [24]J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779-788. [25]J. Redmon and A. Farhadi, "YOLO9000: better, faster, stronger," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 7263-7271. [26]J. Redmon and A. J. a. p. a. Farhadi, "Yolov3: An incremental improvement," 2018. [27]G. Jocher et al. (2021). ultralytics/yolov5: v5. 0-YOLOv5-P6 1280 models AWS Supervise. ly and YouTube integrations. Available: https://doi.org/10.5281/zenodo.6222936 [28]Y. He, X. Zhang, and J. Sun, "Channel pruning for accelerating very deep neural networks," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 1389-1397. [29]R. M. Gray and D. L. J. I. t. o. i. t. Neuhoff, "Quantization," vol. 44, no. 6, pp. 2325-2383, 1998. [30]G. Hinton, O. Vinyals, and J. J. a. p. a. Dean, "Distilling the knowledge in a neural network," vol. 2, no. 7, 2015. [31]A. Porrello, L. Bergamini, and S. Calderara, "Robust re-identification by multiple views knowledge distillation," in European Conference on Computer Vision, 2020, pp. 93-110: Springer. [32]J. Park, S. Woo, J.-Y. Lee, and I. S. J. a. p. a. Kweon, "Bam: Bottleneck attention module," 2018. [33]C.-Y. Wang, I.-H. Yeh, and H.-Y. M. J. a. p. a. Liao, "You only learn one representation: Unified network for multiple tasks," 2021. [34]Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie, "A convnet for the 2020s," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11976-11986. [35]M. Wieczorek, B. Rychalska, and J. Dąbrowski, "On the unreasonable effectiveness of centroids in image retrieval," in International Conference on Neural Information Processing, 2021, pp. 212-223: Springer. [36]Y. Wen, K. Zhang, Z. Li, and Y. Qiao, "A discriminative feature learning approach for deep face recognition," in European conference on computer vision, 2016, pp. 499-515: Springer. [37]K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778. [38]A. Hermans, L. Beyer, and B. J. a. p. a. Leibe, "In defense of the triplet loss for person re-identification," 2017. [39]F. Schroff, D. Kalenichenko, and J. Philbin, "Facenet: A unified embedding for face recognition and clustering," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 815-823. [40]M. Tan and Q. V. J. a. p. a. Le, "Mixconv: Mixed depthwise convolutional kernels," 2019. [41]A. Krizhevsky, I. Sutskever, and G. E. J. A. i. n. i. p. s. Hinton, "Imagenet classification with deep convolutional neural networks," vol. 25, 2012. [42]A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," 2017. [43]M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "Mobilenetv2: Inverted residuals and linear bottlenecks," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 4510-4520. [44]V. Nair and G. E. Hinton, "Rectified linear units improve restricted boltzmann machines," in Icml, 2010. [45]A. F. J. a. p. a. Agarap, "Deep learning using rectified linear units (relu)," 2018. [46]D.-A. Clevert, T. Unterthiner, and S. J. a. p. a. Hochreiter, "Fast and accurate deep network learning by exponential linear units (elus)," 2015. [47]D. Hendrycks and K. J. a. p. a. Gimpel, "Gaussian error linear units (gelus)," 2016. [48]S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International conference on machine learning, 2015, pp. 448-456: PMLR. [49]S. J. A. i. n. i. p. s. Ioffe, "Batch renormalization: Towards reducing minibatch dependence in batch-normalized models," vol. 30, 2017. [50]J. L. Ba, J. R. Kiros, and G. E. J. a. p. a. Hinton, "Layer normalization," 2016. [51]F. Tung and G. Mori, "Similarity-preserving knowledge distillation," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1365-1374. [52]D. P. Kingma and J. J. a. p. a. Ba, "Adam: A method for stochastic optimization," 2014. [53]M. Tan, R. Pang, and Q. V. Le, "Efficientdet: Scalable and efficient object detection," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 10781-10790.
|