[1] R. Bommasani et al., "On the opportunities and risks of foundation models," arXiv preprint arXiv:2108.07258, 2021.
[2] OpenAI, "GPT-4 technical report," arXiv preprint arXiv:2303.08774, 2023.
[3] H. Touvron et al., "LLaMA: Open and efficient foundation language models," arXiv preprint arXiv:2302.13971, 2023.
[4] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
[5] X. Jia et al., "Highly scalable deep learning training system with mixed-precision: Training ImageNet in four minutes," arXiv preprint arXiv:1807.11205, 2018.
[6] H. Mikami, H. Suganuma, Y. Tanaka, and Y. Kageyama, "ImageNet/ResNet-50 training in 224 seconds," arXiv preprint arXiv:1811.05233, 2018.
[7] T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.
[8] J. R. Quinlan, "Induction of decision trees," Machine Learning, vol. 1, pp. 81-106, 1986.
[9] J. R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[10] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273-297, 1995.
[11] J. A. Xu and K. Araki, "A SVM-based personal recommendation system for TV programs," in 12th International Multi-Media Modelling Conference, IEEE, 2006, 4 pp.
[12] R. Choudhry and K. Garg, "A hybrid machine learning system for stock market forecasting," International Journal of Computer and Information Engineering, vol. 2, no. 3, pp. 689-692, 2008.
[13] J. A. Anderson, An Introduction to Neural Networks. MIT Press, 1995.
[14] B. Mahesh, "Machine learning algorithms - a review," International Journal of Science and Research (IJSR), vol. 9, no. 1, pp. 381-386, 2020.
[15] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[16] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[17] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[18] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Part III, Springer, 2015, pp. 234-241.
[19] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 7464-7475.
[20] W. Yin, K. Kann, M. Yu, and H. Schütze, "Comparative study of CNN and RNN for natural language processing," arXiv preprint arXiv:1702.01923, 2017.
[21] D. Weimer, B. Scholz-Reiter, and M. Shpitalni, "Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection," CIRP Annals, vol. 65, no. 1, pp. 417-420, 2016.
[22] G. Sperlí, "A deep learning based chatbot for cultural heritage," in Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020, pp. 935-937.
[23] A. Brutzkus and A. Globerson, "Why do larger models generalize better? A theoretical perspective via the XOR problem," in International Conference on Machine Learning, PMLR, 2019, pp. 822-830.
[24] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[25] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[26] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1-9.
[27] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8697-8710.
[28] A. Vaswani et al., "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
[29] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," 2018.
[30] Y. Zhang, A. Warstadt, H.-S. Li, and S. R. Bowman, "When do you need billions of words of pretraining data?," arXiv preprint arXiv:2011.04946, 2020.
[31] A. Srivastava et al., "Beyond the imitation game: Quantifying and extrapolating the capabilities of language models," arXiv preprint arXiv:2206.04615, 2022.
[32] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017.
[33] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684-10695.
[34] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," arXiv preprint arXiv:1312.6114, 2013.
[35] K. Frans, L. Soros, and O. Witkowski, "CLIPDraw: Exploring text-to-drawing synthesis through language-image encoders," Advances in Neural Information Processing Systems, vol. 35, pp. 5207-5218, 2022.
[36] N. Ruiz, Y. Li, V. Jampani, Y. Pritch, M. Rubinstein, and K. Aberman, "DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22500-22510.
[37] E. J. Hu et al., "LoRA: Low-rank adaptation of large language models," arXiv preprint arXiv:2106.09685, 2021.
[38] Y. Wang, J. Wang, and X. Zhang, "YNU-HPCC at WASSA-2023 Shared Task 1: Large-scale language model with LoRA fine-tuning for empathy detection and emotion classification," in Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis, 2023, pp. 526-530.
[39] Y. Shi, C. Xue, J. Pan, W. Zhang, V. Y. Tan, and S. Bai, "DragDiffusion: Harnessing diffusion models for interactive point-based image editing," arXiv preprint arXiv:2306.14435, 2023.
[40] S. Li et al., "PyTorch distributed: Experiences on accelerating data parallel training," arXiv preprint arXiv:2006.15704, 2020.
[41] M. Abadi et al., "TensorFlow: Large-scale machine learning on heterogeneous distributed systems," arXiv preprint arXiv:1603.04467, 2016.
[42] A. Vishnu, C. Siegel, and J. Daily, "Distributed TensorFlow with MPI," arXiv preprint arXiv:1603.02339, 2016.
[43] Y. Huang et al., "GPipe: Efficient training of giant neural networks using pipeline parallelism," Advances in Neural Information Processing Systems, vol. 32, 2019.
[44] M. Shoeybi, M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catanzaro, "Megatron-LM: Training multi-billion parameter language models using model parallelism," arXiv preprint arXiv:1909.08053, 2019.
[45] S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He, "ZeRO: Memory optimizations toward training trillion parameter models," in SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE, 2020, pp. 1-16.
[46] A. Sergeev and M. Del Balso, "Horovod: Fast and easy distributed deep learning in TensorFlow," arXiv preprint arXiv:1802.05799, 2018.
[47] S. Gan et al., "BAGUA: Scaling up distributed learning with system relaxations," arXiv preprint arXiv:2107.01499, 2021.
[48] NVIDIA, "NVIDIA," https://www.nvidia.com/zh-tw/ (accessed August 2023).
[49] Hugging Face, "Hugging Face," https://huggingface.co (accessed August 2023).
[50] M. Li et al., "Scaling distributed machine learning with the parameter server," in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), 2014, pp. 583-598.
[51] S. Zhang, A. E. Choromanska, and Y. LeCun, "Deep learning with elastic averaging SGD," Advances in Neural Information Processing Systems, vol. 28, 2015.
[52] W. Zhang, S. Gupta, X. Lian, and J. Liu, "Staleness-aware async-SGD for distributed deep learning," arXiv preprint arXiv:1511.05950, 2015.
[53] P. Patarasuk and X. Yuan, "Bandwidth optimal all-reduce algorithms for clusters of workstations," Journal of Parallel and Distributed Computing, vol. 69, no. 2, pp. 117-124, 2009.
[54] MPI Forum, "Message Passing Interface (MPI) Forum Home Page," https://www.mpi-forum.org (accessed August 2023).
[55] NVIDIA, "NVIDIA Collective Communications Library (NCCL)," https://developer.nvidia.com/nccl (accessed August 2023).
[56] G. Wang, S. Venkataraman, A. Phanishayee, N. Devanur, J. Thelin, and I. Stoica, "Blink: Fast and generic collectives for distributed ML," Proceedings of Machine Learning and Systems, vol. 2, pp. 172-186, 2020.
[57] R. Alvarez, R. Prabhavalkar, and A. Bakhtin, "On the efficient representation and execution of deep acoustic models," arXiv preprint arXiv:1607.04683, 2016.
[58] N. Ström, "Scalable distributed DNN training using commodity GPU cloud computing," 2015.
[59] N. Dryden, T. Moon, S. A. Jacobs, and B. Van Essen, "Communication quantization for data-parallel training of deep neural networks," in 2nd Workshop on Machine Learning in HPC Environments (MLHPC), IEEE, 2016, pp. 1-8.
[60] A. Krizhevsky, "CIFAR-10 and CIFAR-100 datasets," https://www.cs.toronto.edu/~kriz/cifar.html (accessed August 2023).
[61] P. Goyal et al., "Accurate, large minibatch SGD: Training ImageNet in 1 hour," arXiv preprint arXiv:1706.02677, 2017.
[62] "train_dreambooth_lora.py failed on two machines," https://github.com/huggingface/diffusers/issues/3363#issuecomment-1537907210 (accessed August 2023).
[63] S. Li et al., "PyTorch Data Parallel Best Practices on Google Cloud," https://medium.com/pytorch/pytorch-data-parallel-best-practices-on-google-cloud-6c8da2be180d (accessed 2023).