Theodoros Papathanasiou, "Accelerating deep reinforcement learning via imitation", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2020
https://doi.org/10.26233/heallink.tuc.84657
Imitation has evolved in nature as an advanced behavioural tool for knowledge transfer between individuals. It can be observed in most higher-intelligence life forms, such as members of the simian (monkeys and apes), delphinid (dolphins), and corvid (crows, ravens, jackdaws) groups. Its advantages over instinctual acting and habituation can be seen in the vast success of animals capable of imitation learning throughout the world's ecosystems.

In machine learning, mimicry and imitation have been implemented in the form of supervised learning, and have been used in reinforcement learning through explicit imitation techniques. Implicit imitation has also been tested as an alternative to direct knowledge transfer in single- and multi-agent systems, accelerating individual agents' learning rates through experiences extracted from previous sessions or from other cooperative agents. Even though these techniques have achieved promising results, they have not to date taken advantage of the recent success of neural networks and deep learning.

In this thesis, we propose the application of implicit imitation to model-free, deep reinforcement learning techniques in order to speed up the learning stages of the respective agents. Briefly, by extracting experience from a mentor agent and augmenting the Bellman backups of another agent to benefit from this experience, we provide a form of guidance. The observer decides whether to trust or disregard that information based on a confidence-testing mechanism. We test our model on a DQN variant in classic control environments and demonstrate accelerated learning in our experiments. Though we limit our tests to one deep learning algorithm and simple settings, we comment on extensions of our model to other agents and more complex environments.
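To make the idea of an imitation-augmented Bellman backup concrete, the following is a minimal tabular sketch, not the thesis's exact rule: the observer bootstraps from whichever next-state value estimate, its own or the mentor's, passes a simple confidence test. The function name, the fixed `margin` threshold, and the use of a mentor Q-table are illustrative assumptions.

```python
import numpy as np

def augmented_bellman_backup(q_obs, q_mentor, s, a, r, s_next,
                             alpha=0.1, gamma=0.99, margin=0.05):
    """One tabular Q-update in which the observer may adopt the
    mentor's value estimate for the next state (hypothetical
    illustration of an augmented backup, not the thesis's method)."""
    own_value = np.max(q_obs[s_next])        # observer's own bootstrap value
    mentor_value = np.max(q_mentor[s_next])  # value implied by the mentor
    # Crude confidence test: trust the mentor only when its estimate
    # exceeds the observer's own by more than a fixed margin.
    bootstrap = mentor_value if mentor_value > own_value + margin else own_value
    target = r + gamma * bootstrap
    q_obs[s, a] += alpha * (target - q_obs[s, a])
    return q_obs[s, a]

# Toy usage: a 3-state, 2-action problem where the mentor already
# knows state 2 is valuable, pulling the observer's estimate upward.
q_obs = np.zeros((3, 2))
q_mentor = np.zeros((3, 2))
q_mentor[2] = [1.0, 0.5]
updated = augmented_bellman_backup(q_obs, q_mentor, s=0, a=1, r=0.0, s_next=2)
```

In a deep variant such as the DQN setting the abstract describes, the same choice would be made between the observer's and the mentor's network-predicted targets rather than table lookups.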