Rollout sampling approximate policy iteration

Dimitrakakis Christos, Lagoudakis Michael

URI	http://purl.tuc.gr/dl/dias/157117EC-5401-47A1-B453-9D39AAFFC2E2	-
Identifier	https://doi.org/10.1007/s10994-008-5069-3	-
Language	en	-
Size	14	en
Title	Rollout sampling approximate policy iteration	en
Creator	Dimitrakakis Christos	en
Creator	Lagoudakis Michael	en
Creator	Λαγουδακης Μιχαηλ	el
Publisher	Springer Verlag	en
Description	Publication in a scientific journal	el
Abstract	Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem. This paper proposes variants of an improved policy iteration scheme which addresses the core sampling problem in evaluating a policy through simulation as a multi-armed bandit machine. The resulting algorithm offers comparable performance to the previous algorithm achieved, however, with significantly less computational effort. An order of magnitude improvement is demonstrated experimentally in two standard reinforcement learning domains: inverted pendulum and mountain-car.	en
Type	Peer-Reviewed Journal Publication	en
Type	Peer-Reviewed Journal Publication	el
License	http://creativecommons.org/licenses/by/4.0/	en
Date	2015-10-27	-
Date of Publication	2008	-
Subject	Reinforcement learning	en
Subject	Approximate policy iteration	en
Subject	Rollouts	en
Subject	Bandit problems	en
Subject	Classification	en
Subject	Sample complexity	en
Bibliographic Citation	C. Dimitrakakis and M. G. Lagoudakis "Rollout sampling approximate policy iteration," Machine Learning, vol. 72, no. 3, pp. 157-171, Sept. 2008. doi: 10.1007/s10994-008-5069-3	en
