[1] Robot Operating System (ROS), available from: https://www.ros.org.
[2] Base local planner, available from: http://wiki.ros.org/base_local_planner.
[3] Dieter Fox, Wolfram Burgard, Sebastian Thrun, 1997, "The Dynamic Window Approach to Collision Avoidance," IEEE Robotics & Automation Magazine, vol. 4, issue 1, pp. 23-33, March.
[4] DWA local planner, available from: http://wiki.ros.org/dwa_local_planner.
[5] Scott Fujimoto, Herke van Hoof, David Meger, 2018, "Addressing Function Approximation Error in Actor-Critic Methods", International Conference on Machine Learning, pp. 1582-1591, February.
[6] Zeynab Talebpour, Alcherio Martinoli, 2018, "Risk-Based Human-Aware Multi-Robot Coordination in Dynamic Environments Shared with Humans", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3365-3372.
[7] Qingyang Tan, Tingxiang Fan, Jia Pan, Dinesh Manocha, 2020, "DeepMNavigate: Deep Reinforced Multi-Robot Navigation Unifying Local & Global Collision Avoidance", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA (Virtual), October 25-29.
[8] Xiaoyun Lei, Zhian Zhang, Peifang Dong, 2018, "Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning", Journal of Robotics, vol. 2018, September.
[9] Matej Dobrevski, Danijel Skočaj, 2020, "Adaptive Dynamic Window Approach for Local Navigation", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6930-6936.
[10] Daniel Zhang, Colleen P. Bailey, 2020, "Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping", arXiv: 2003.12863, March.
[11] Anton Maximilian Schäfer, 2008, "Reinforcement Learning with Recurrent Neural Networks", doctoral dissertation, University of Osnabrück, Institute for Computer Science, Neuroinformatics Group.
[12] Lu Wang, Wei Zhang, Xiaofeng He, Hongyuan Zha, 2018, "Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation", Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2447-2456, July.
[13] Steven Kapturowski, Georg Ostrovski, John Quan, Rémi Munos, Will Dabney, 2019, "Recurrent Experience Replay in Distributed Reinforcement Learning", International Conference on Learning Representations, January.
[14] Dynamic Programming, available from: https://en.wikipedia.org/wiki/Dynamic_programming.
[15] Supervised Learning, available from: https://en.wikipedia.org/wiki/Supervised_learning.
[16] DeepMind AlphaGo, available from: https://deepmind.com/research/case-studies/alphago-the-story-so-far.
[17] Microsoft Bonsai, available from: https://docs.microsoft.com/en-us/bonsai/product.
[18] What is the difference between model-based and model-free reinforcement learning?, available from: https://www.quora.com/What-is-the-difference-between-model-based-and-model-free-reinforcement-learning.
[19] RL survey, available from: https://github.com/AI4Finance-LLC/ElegantRL/blob/master/figs/RL_survey_2020.pdf.
[20] Mean Squared Error, available from: https://en.wikipedia.org/wiki/Mean_squared_error.
[21] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra, 2016, "Continuous Control with Deep Reinforcement Learning", arXiv: 1509.02971, September.
[22] Hado van Hasselt, 2010, "Double Q-learning", Advances in Neural Information Processing Systems 23, pp. 2613-2621.
[23] Markov Decision Process, available from: https://en.wikipedia.org/wiki/Markov_decision_process.
[24] Matthew F. Dixon, Igor Halperin, Paul Bilokon, 2020, "Inverse Reinforcement Learning and Imitation Learning", Machine Learning in Finance, Springer, Cham.
[25] Felipe Codevilla, Eder Santana, Antonio Lopez, Adrien Gaidon, 2019, "Exploring the Limitations of Behavior Cloning for Autonomous Driving", IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9328-9337.
[26] ROBOTIS TurtleBot3, available from: https://emanual.robotis.com/docs/en/platform/turtlebot3/overview.
[27] turtlebot3_simulations open-source code, available from: https://github.com/ROBOTIS-GIT/turtlebot3_simulations.
[28] Shreyansh Daftry, J. Andrew Bagnell, Martial Hebert, 2016, "Learning Transferable Policies for Monocular Reactive MAV Control", arXiv: 1608.00627, August.
[29] Internet of Vehicles (車聯網), available from: https://zh.wikipedia.org/wiki/%E8%BB%8A%E8%81%AF%E7%B6%B2.