[1] A. Al-Habaibeh, M. Watkins, K. Waried, and M. B. Javareshk, "Challenges and opportunities of remotely working from home during Covid-19 pandemic," Global Transitions, vol. 3, pp. 99-108, 2021.
[2] L. Li et al., "Write-a-speaker: Text-based emotional and rhythmic talking-head generation," in Proceedings of the AAAI Conference on Artificial Intelligence, 2021, vol. 35, no. 3, pp. 1911-1920.
[3] Z. Guo, Z. Wang, and X. Jin, "'Avatar to Person' (ATP) virtual human social ability enhanced system for disabled people," Wireless Communications and Mobile Computing, vol. 2021, pp. 1-10, 2021.
[4] E. Kimani, D. Parmar, P. Murali, and T. Bickmore, "Sharing the load online: Virtual presentations with virtual co-presenter agents," in Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 2021, pp. 1-7.
[5] A. Aliev, "avatarify: Avatars for Zoom, Skype and other video-conferencing apps." [Online]. Available: https://github.com/alievk/avatarify-python. [Accessed: Apr. 26, 2023].
[6] iperov, "DeepFaceLive: Real-time face swap for PC streaming or video calls." [Online]. Available: https://github.com/iperov/DeepFaceLive. [Accessed: May 4, 2023].
[7] I. Goodfellow et al., "Generative adversarial nets," in Advances in Neural Information Processing Systems, vol. 27, 2014.
[8] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4401-4410.
[9] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125-1134.
[10] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of StyleGAN," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8110-8119.
[11] M. Dang and T. N. Nguyen, "Digital face manipulation creation and detection: A systematic review," Electronics, vol. 12, no. 16, p. 3407, 2023.
[12] S. Suwajanakorn, S. M. Seitz, and I. Kemelmacher-Shlizerman, "Synthesizing Obama: Learning lip sync from audio," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 1-13, 2017.
[13] A. Kammoun, R. Slama, H. Tabia, T. Ouni, and M. Abid, "Generative adversarial networks for face generation: A survey," ACM Computing Surveys, vol. 55, no. 5, pp. 1-37, 2022.
[14] R. Zhen, W. Song, Q. He, J. Cao, L. Shi, and J. Luo, "Human-computer interaction system: A survey of talking-head generation," Electronics, vol. 12, no. 1, p. 218, 2023.
[15] W. Zhang et al., "SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 8652-8661.
[16] K. Cheng et al., "VideoReTalking: Audio-based lip synchronization for talking head video editing in the wild," in SIGGRAPH Asia 2022 Conference Papers, 2022, pp. 1-9.
[17] S. Shen et al., "DiffTalk: Crafting diffusion models for generalized audio-driven portraits animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 1982-1991.
[18] A. Siarohin, S. Lathuilière, S. Tulyakov, E. Ricci, and N. Sebe, "First order motion model for image animation," in Advances in Neural Information Processing Systems, vol. 32, 2019.
[19] J. Zhao and H. Zhang, "Thin-plate spline motion model for image animation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 3657-3666.
[20] F.-T. Hong, L. Shen, and D. Xu, "DaGAN++: Depth-aware generative adversarial network for talking head video generation," arXiv preprint arXiv:2305.06225, 2023.
[21] Y. Wang, D. Yang, F. Bremond, and A. Dantcheva, "Latent image animator: Learning to animate images via latent space navigation," arXiv preprint arXiv:2203.09043, 2022.
[22] F. Yin et al., "StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN," in European Conference on Computer Vision, Springer, 2022, pp. 85-101.
[23] Z. Ma, X. Zhu, G.-J. Qi, Z. Lei, and L. Zhang, "OTAvatar: One-shot talking face avatar with controllable tri-plane rendering," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 16901-16910.
[24] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, "NeRF: Representing scenes as neural radiance fields for view synthesis," Communications of the ACM, vol. 65, no. 1, pp. 99-106, 2021.
[25] Y. Hong, B. Peng, H. Xiao, L. Liu, and J. Zhang, "HeadNeRF: A real-time NeRF-based parametric head model," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 20374-20384.
[26] F.-A. Croitoru, V. Hondru, R. T. Ionescu, and M. Shah, "Diffusion models in vision: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
[27] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," in Advances in Neural Information Processing Systems, vol. 33, pp. 6840-6851, 2020.
[28] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684-10695.
[29] M. S. Seyfioglu, K. Bouyarmane, S. Kumar, A. Tavanaei, and I. B. Tutar, "DreamPaint: Few-shot inpainting of e-commerce items for virtual try-on without 3D modeling," arXiv preprint arXiv:2305.01257, 2023.
[30] T. Lüddecke and A. Ecker, "Image segmentation using text and image prompts," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 7086-7096.
[31] A. Radford et al., "Learning transferable visual models from natural language supervision," in International Conference on Machine Learning, PMLR, 2021, pp. 8748-8763.
[32] X. Wang, Y. Li, H. Zhang, and Y. Shan, "Towards real-world blind face restoration with generative facial prior," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 9168-9178.
[33] T. Yang, P. Ren, X. Xie, and L. Zhang, "GAN prior embedded network for blind face restoration in the wild," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 672-681.
[34] X. Wang, L. Xie, C. Dong, and Y. Shan, "Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 1905-1914.
[35] R. Chen, X. Chen, B. Ni, and Y. Ge, "SimSwap: An efficient framework for high fidelity face swapping," in Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 2003-2011.
[36] J. Guo, "InsightFace: 2D and 3D face analysis project." [Online]. Available: https://github.com/deepinsight/insightface. [Accessed: Jun. 30, 2023].
[37] C. Xu et al., "Designing one unified framework for high-fidelity face reenactment and swapping," in European Conference on Computer Vision, Springer, 2022, pp. 54-71.
[38] Z. Ke, J. Sun, K. Li, Q. Yan, and R. W. Lau, "MODNet: Real-time trimap-free portrait matting via objective decomposition," in Proceedings of the AAAI Conference on Artificial Intelligence, 2022, vol. 36, no. 1, pp. 1140-1147.
[39] danielgatis, "Rembg: Tool to remove images background." [Online]. Available: https://github.com/danielgatis/rembg. [Accessed: Jun. 2, 2023].
[40] W.-C. Hu, J.-J. Jhu, and C.-P. Lin, "Unsupervised and reliable image matting based on modified spectral matting," Journal of Visual Communication and Image Representation, vol. 23, no. 4, pp. 665-676, 2012.
[41] S. Zhang, X. Zhu, Z. Lei, H. Shi, X. Wang, and S. Z. Li, "S3FD: Single shot scale-invariant face detector," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 192-201.
[42] J. Deng, J. Guo, Y. Zhou, J. Yu, I. Kotsia, and S. Zafeiriou, "RetinaFace: Single-stage dense face localisation in the wild," arXiv preprint arXiv:1905.00641, 2019.
[43] V. Albiero, X. Chen, X. Yin, G. Pang, and T. Hassner, "img2pose: Face alignment and detection via 6DoF, face pose estimation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 7617-7627.
[44] S. Sangwan, "roop: One-click face swap." [Online]. Available: https://github.com/s0md3v/roop. [Accessed: Jun. 30, 2023].
[45] B. Chen, L. Dang, N. Zheng, and J. C. Principe, "Kalman filtering," in Kalman Filtering Under Information Theoretic Criteria. Springer, 2023, pp. 11-51.
[46] J. van Driel, C. N. Olivers, and J. J. Fahrenfort, "High-pass filtering artifacts in multivariate classification of neural time series data," Journal of Neuroscience Methods, vol. 352, p. 109080, 2021.
[47] dlib, "dlib C++ library: High quality face recognition." [Online]. Available: http://dlib.net/. [Accessed: May 20, 2023].
[48] N. Gourier, D. Hall, and J. L. Crowley, "Estimating face orientation from robust detection of salient facial structures," in FG Net Workshop on Visual Observation of Deictic Gestures, 2004, vol. 6, p. 7.
[49] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017.