Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning

Ahmed MAged, Shady; Mikhail, Bishoy H.

Abstract


This paper proposes applying trust region policy optimisation (TRPO) and proximal policy optimisation (PPO), both descendants of the policy gradient optimisation method, together with the deep Q-learning network (DQN), to Lidar-based differential robots, using a Turtlebot and the optimisation methods in OpenAI's Baselines. The simulation results showed that all three algorithms are well suited to obstacle avoidance and robot navigation, with TRPO and PPO holding a clear advantage in complex environments. The learned policies can be used in a fully decentralised manner, as they are not constrained by any robot parameters or communication protocols.
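
For readers unfamiliar with PPO, the sketch below illustrates the clipped surrogate objective that distinguishes it from a plain policy gradient update. It is a generic, minimal illustration in pure Python and is not taken from the paper or from OpenAI's Baselines: the function name `ppo_clipped_objective`, its parameters, and the clip range of 0.2 are assumptions chosen for the example.

```python
import math

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximise).

    new_logp / old_logp: log-probabilities of the taken actions under the
    updated and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    clip_eps: clip range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps]
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic choice between the unclipped and clipped terms
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the updated policy makes a positively rewarded action much more likely, the clip caps the gain, which discourages overly large policy steps.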


Other data

Title: Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning
Authors: Ahmed MAged, Shady; Mikhail, Bishoy H.
Keywords: Autonomous | Deep learning | Deep Q-learning | Deep Q-learning network | Differential robot | DQN | Navigation | Obstacle avoidance | PPO | Proximal policy optimisation | Q-learning | Reinforcement learning | Robot operating system | Robotics | ROS | Tensorflow | TRPO | Trust region optimisation | Trust region policy optimisation
Issue Date: 1-Jan-2020
Journal: International Journal of Computational Vision and Robotics
ISSN: 1752-9131
DOI: 10.1504/IJCVR.2020.107253
Scopus ID: 2-s2.0-85085089466


Citations: 5 in Scopus


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.