[1] S. Liu, J. Jia, S. Fidler, and R. Urtasun, “SGN: Sequential Grouping Networks for Instance Segmentation,” ICCV, pp. 3496-3504, 2017.
[2] B. De Brabandere, D. Neven, and L. Van Gool, “Semantic Instance Segmentation with a Discriminative Loss Function,” arXiv:1708.02551, 2017.
[3] J. Dai, K. He, Y. Li, S. Ren, and J. Sun, “Instance-sensitive Fully Convolutional Networks,” arXiv:1603.08678, 2016.
[4] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” ICCV, pp. 2961-2969, 2017.
[5] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT: Real-time Instance Segmentation,” ICCV, pp. 9157-9166, 2019.
[6] X. Wang, T. Kong, C. Shen, Y. Jiang, and L. Li, “SOLO: Segmenting Objects by Locations,” arXiv:1912.04488, 2020.
[7] OpenCV Official documentation, https://docs.opencv.org/
[8] B. D. Lucas and T. Kanade, “An Iterative Image Registration Technique with an Application to Stereo Vision,” IJCAI, 1981.
[9] G. Farnebäck, “Two-Frame Motion Estimation Based on Polynomial Expansion,” LNCS, 2003.
[10] National Taiwan University Ph.D. Hung-yi Lee Official website, https://speech.ee.ntu.edu.tw/~hylee/
[11] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, 1997.
[12] J. Carreira and A. Zisserman, “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset,” arXiv:1705.07750v3, 2018.
[13] J. Y.-H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici, “Beyond Short Snippets: Deep Networks for Video Classification,” arXiv:1503.08909, 2015.
[14] J. Donahue, L. A. Hendricks, M. Rohrbach, S. Venugopalan, S. Guadarrama, K. Saenko, and T. Darrell, “Long-term Recurrent Convolutional Networks for Visual Recognition and Description,” arXiv:1411.4389, 2016.
[15] G. Thung and H. Jiang, “A torch library for action recognition and detection using CNNs and LSTMs,” 2016.
[16] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning Spatiotemporal Features with 3D Convolutional Networks,” arXiv:1412.0767v4, 2015.
[17] D. Tran, H. Wang, L. Torresani, J. Ray, Y. LeCun, and M. Paluri, “A Closer Look at Spatiotemporal Convolutions for Action Recognition,” CVPR, pp. 6450-6459, 2018.
[18] K. Simonyan and A. Zisserman, “Two-Stream Convolutional Networks for Action Recognition in Videos,” NIPS, 2014.
[19] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox, “FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks,” CVPR, pp. 2462-2470, 2017.
[20] PyTorch Official website, https://pytorch.org/
[21] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” CVPR, pp. 770-778, 2016.
[22] S.-J. Zhu, “A Distilled 2D CNN-LSTM Framework with Temporal Attention Mechanism for Action Recognition,” 2021.
[23] MediaPipe Official website, https://google.github.io/mediapipe/
[24] L. Fan, Z. Xia, X. Zhang, and X. Feng, “Lung nodule detection based on 3D convolutional neural networks,” 2017.
[25] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556v6, 2014.
[26] S. Crabtree and P. Beudert, “Scenic Art for the Theatre: History, Tools, and Techniques,” 1998.