Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning
Ahmed Maged, Shady; Mikhail, Bishoy H.
Abstract
The use of trust region policy optimisation (TRPO) and proximal policy optimisation (PPO), both descendants of the policy gradient optimisation method, and of the deep Q-learning network (DQN) in Lidar-based differential robots is proposed, using Turtlebot and OpenAI's Baselines optimisation methods. The simulation results showed that all three algorithms are well suited to obstacle avoidance and robot navigation, with a clear advantage for TRPO and PPO in complex environments. The learned policies can be used in a fully decentralised manner, as they are not constrained by any robot parameters or communication protocols.
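For readers unfamiliar with the algorithms named in the abstract, the following is a minimal sketch of the two update quantities that distinguish them: PPO's clipped surrogate objective and the one-step Q-learning target used by DQN. It is not taken from the paper, which relies on OpenAI's Baselines implementations. The function names, the clipping range of 0.2, and the discount factor of 0.99 are illustrative assumptions.

```python
def clip(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def ppo_clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Average PPO clipped surrogate over a batch.

    ratios: new-policy / old-policy action probabilities, one per sample.
    advantages: advantage estimates, one per sample.
    clip_eps: clipping range (0.2 is an assumed, commonly used value).
    """
    terms = [
        min(r * a, clip(r, 1.0 - clip_eps, 1.0 + clip_eps) * a)
        for r, a in zip(ratios, advantages)
    ]
    return sum(terms) / len(terms)


def dqn_td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a Q(s', a).

    The bootstrap term is dropped when the episode has ended
    (for example, after a collision).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

For example, `ppo_clipped_surrogate([1.5, 0.5], [1.0, -1.0])` limits both probability ratios to the range 0.8 to 1.2 before averaging, which is how PPO keeps each policy update close to the previous policy. TRPO pursues the same goal with an explicit constraint on the KL divergence between the old and new policies instead of clipping.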
Other data
Field | Value |
---|---|
Title | Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning |
Authors | Ahmed Maged, Shady; Mikhail, Bishoy H. |
Keywords | Autonomous; Deep learning; Deep Q-learning; Deep Q-learning network; Differential robot; DQN; Navigation; Obstacle avoidance; PPO; Proximal policy optimisation; Q-learning; Reinforcement learning; Robot operating system; Robotics; ROS; Tensorflow; TRPO; Trust region optimisation; Trust region policy optimisation |
Issue Date | 1-Jan-2020 |
Journal | International Journal of Computational Vision and Robotics |
ISSN | 17529131 |
DOI | 10.1504/IJCVR.2020.107253 |
Scopus ID | 2-s2.0-85085089466 |
Attached Files
File | Description | Size | Format |
---|---|---|---|
IJCVR100306 MAGED_249170.pdf | | 1.85 MB | Adobe PDF |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.