[1] A. Al-Habaibeh, M. Watkins, K. Waried, and M. B. Javareshk, "Challenges and opportunities of remotely working from home during Covid-19 pandemic," Global Transitions, vol. 3, pp. 99-108, 2021.
[2] L. Li et al., "Write-a-speaker: Text-based emotional and rhythmic talking-head generation," in Proceedings of the AAAI Conference on Artificial Intelligence, 2021, vol. 35, no. 3, pp. 1911-1920.
[3] Z. Guo, Z. Wang, and X. Jin, "'Avatar to Person' (ATP) virtual human social ability enhanced system for disabled people," Wireless Communications and Mobile Computing, vol. 2021, pp. 1-10, 2021.
[4] E. Kimani, D. Parmar, P. Murali, and T. Bickmore, "Sharing the load online: Virtual presentations with virtual co-presenter agents," in Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 2021, pp. 1-7.
[5] A. Aliev, "avatarify: Avatars for Zoom, Skype and other video-conferencing apps." [Online]. Available: https://github.com/alievk/avatarify-python. [Accessed: Apr. 26, 2023].
[6] iperov, "DeepFaceLive: Real-time face swap for PC streaming or video calls." [Online]. Available: https://github.com/iperov/DeepFaceLive. [Accessed: May 4, 2023].
[7] I. Goodfellow et al., "Generative adversarial nets," in Advances in Neural Information Processing Systems, vol. 27, 2014.
[8] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4401-4410.
[9] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125-1134.
[10] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of StyleGAN," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8110-8119.
[11] M. Dang and T. N. Nguyen, "Digital face manipulation creation and detection: A systematic review," Electronics, vol. 12, no. 16, p. 3407, 2023.
[12] S. Suwajanakorn, S. M. Seitz, and I. Kemelmacher-Shlizerman, "Synthesizing Obama: Learning lip sync from audio," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 1-13, 2017.
[13] A. Kammoun, R. Slama, H. Tabia, T. Ouni, and M. Abid, "Generative adversarial networks for face generation: A survey," ACM Computing Surveys, vol. 55, no. 5, pp. 1-37, 2022.
[14] R. Zhen, W. Song, Q. He, J. Cao, L. Shi, and J. Luo, "Human-computer interaction system: A survey of talking-head generation," Electronics, vol. 12, no. 1, p. 218, 2023.
[15] W. Zhang et al., "SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 8652-8661.
[16] K. Cheng et al., "VideoReTalking: Audio-based lip synchronization for talking head video editing in the wild," in SIGGRAPH Asia 2022 Conference Papers, 2022, pp. 1-9.
[17] S. Shen et al., "DiffTalk: Crafting diffusion models for generalized audio-driven portraits animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 1982-1991.
[18] A. Siarohin, S. Lathuilière, S. Tulyakov, E. Ricci, and N. Sebe, "First order motion model for image animation," in Advances in Neural Information Processing Systems, vol. 32, 2019.
[19] J. Zhao and H. Zhang, "Thin-plate spline motion model for image animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 3657-3666.
[20] F.-T. Hong, L. Shen, and D. Xu, "DaGAN++: Depth-aware generative adversarial network for talking head video generation," arXiv preprint arXiv:2305.06225, 2023.
[21] Y. Wang, D. Yang, F. Bremond, and A. Dantcheva, "Latent image animator: Learning to animate images via latent space navigation," arXiv preprint arXiv:2203.09043, 2022.
[22] F. Yin et al., "StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN," in European Conference on Computer Vision, Springer, 2022, pp. 85-101.
[23] Z. Ma, X. Zhu, G.-J. Qi, Z. Lei, and L. Zhang, "OTAvatar: One-shot talking face avatar with controllable tri-plane rendering," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 16901-16910.
[24] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, "NeRF: Representing scenes as neural radiance fields for view synthesis," Communications of the ACM, vol. 65, no. 1, pp. 99-106, 2021.
[25] Y. Hong, B. Peng, H. Xiao, L. Liu, and J. Zhang, "HeadNeRF: A real-time NeRF-based parametric head model," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 20374-20384.
[26] F.-A. Croitoru, V. Hondru, R. T. Ionescu, and M. Shah, "Diffusion models in vision: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
[27] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," in Advances in Neural Information Processing Systems, vol. 33, pp. 6840-6851, 2020.
[28] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684-10695.
[29] M. S. Seyfioglu, K. Bouyarmane, S. Kumar, A. Tavanaei, and I. B. Tutar, "DreamPaint: Few-shot inpainting of e-commerce items for virtual try-on without 3D modeling," arXiv preprint arXiv:2305.01257, 2023.
[30] T. Lüddecke and A. Ecker, "Image segmentation using text and image prompts," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 7086-7096.
[31] A. Radford et al., "Learning transferable visual models from natural language supervision," in International Conference on Machine Learning, PMLR, 2021, pp. 8748-8763.
[32] X. Wang, Y. Li, H. Zhang, and Y. Shan, "Towards real-world blind face restoration with generative facial prior," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 9168-9178.
[33] T. Yang, P. Ren, X. Xie, and L. Zhang, "GAN prior embedded network for blind face restoration in the wild," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 672-681.
[34] X. Wang, L. Xie, C. Dong, and Y. Shan, "Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 1905-1914.
[35] R. Chen, X. Chen, B. Ni, and Y. Ge, "SimSwap: An efficient framework for high fidelity face swapping," in Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 2003-2011.
[36] J. Guo, "InsightFace: 2D and 3D face analysis project." [Online]. Available: https://github.com/deepinsight/insightface. [Accessed: Jun. 30, 2023].
[37] C. Xu et al., "Designing one unified framework for high-fidelity face reenactment and swapping," in European Conference on Computer Vision, Springer, 2022, pp. 54-71.
[38] Z. Ke, J. Sun, K. Li, Q. Yan, and R. W. Lau, "MODNet: Real-time trimap-free portrait matting via objective decomposition," in Proceedings of the AAAI Conference on Artificial Intelligence, 2022, vol. 36, no. 1, pp. 1140-1147.
[39] danielgatis, "Rembg: Tool to remove images background." [Online]. Available: https://github.com/danielgatis/rembg. [Accessed: Jun. 2, 2023].
[40] W.-C. Hu, J.-J. Jhu, and C.-P. Lin, "Unsupervised and reliable image matting based on modified spectral matting," Journal of Visual Communication and Image Representation, vol. 23, no. 4, pp. 665-676, 2012.
[41] S. Zhang, X. Zhu, Z. Lei, H. Shi, X. Wang, and S. Z. Li, "S3FD: Single shot scale-invariant face detector," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 192-201.
[42] J. Deng, J. Guo, Y. Zhou, J. Yu, I. Kotsia, and S. Zafeiriou, "RetinaFace: Single-stage dense face localisation in the wild," arXiv preprint arXiv:1905.00641, 2019.
[43] V. Albiero, X. Chen, X. Yin, G. Pang, and T. Hassner, "img2pose: Face alignment and detection via 6DoF, face pose estimation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 7617-7627.
[44] S. Sangwan, "roop: One-click face swap." [Online]. Available: https://github.com/s0md3v/roop. [Accessed: Jun. 30, 2023].
[45] B. Chen, L. Dang, N. Zheng, and J. C. Principe, "Kalman filtering," in Kalman Filtering Under Information Theoretic Criteria. Springer, 2023, pp. 11-51.
[46] J. van Driel, C. N. Olivers, and J. J. Fahrenfort, "High-pass filtering artifacts in multivariate classification of neural time series data," Journal of Neuroscience Methods, vol. 352, p. 109080, 2021.
[47] dlib, "dlib C++ library: High quality face recognition." [Online]. Available: http://dlib.net/. [Accessed: May 20, 2023].
[48] N. Gourier, D. Hall, and J. L. Crowley, "Estimating face orientation from robust detection of salient facial structures," in FG Net Workshop on Visual Observation of Deictic Gestures, 2004, vol. 6, p. 7.
[49] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017.