On the locality of action domination in sequential decision making

Rachelson Emmanuel, Lagoudakis Michael

Full Record

URI:

http://purl.tuc.gr/dl/dias/E0292307-A486-42F6-A1D4-8BF6498753E2

Year

2010

Type

Full Conference Paper

Bibliographic Citation

E. Rachelson and Michail G. Lagoudakis. (2010, Jan.). On the locality of action domination in sequential decision making. Presented at the 11th International Symposium on Artificial Intelligence and Mathematics (ISAIM). [Online]. Available: http://www.researchgate.net/profile/Emmanuel_Rachelson/publication/221186156_On_the_locality_of_action_domination_in_sequential_decision_making/links/0fcfd5051c4eaad94f000000.pdf

Appears in Collections

Conference Publications in the Community School of Electrical and Computer Engineering

Abstract

In the field of sequential decision making and reinforcement learning, it has been observed that good policies for most problems exhibit a significant amount of structure. In practice, this implies that when a learning agent discovers that an action is better than any other in a given state, this action also dominates in a certain neighbourhood around that state. This paper presents new results proving that this notion of locality in action domination can be linked to the smoothness of the environment's underlying stochastic model. Namely, we link the Lipschitz continuity of a Markov Decision Process to the Lipschitz continuity of its policies' value functions and introduce the key concept of influence radius to describe the neighbourhood of states where the dominating action is guaranteed to be constant. These ideas are directly exploited in the proposed Localized Policy Iteration (LPI) algorithm, which is an active learning version of Rollout-based Policy Iteration. Preliminary results on the Inverted Pendulum domain demonstrate the viability and the potential of the proposed approach.
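
The idea of an influence radius can be illustrated with a minimal sketch. It rests on an assumption that the abstract does not spell out: every action's value function s -> Q(s, a) is Lipschitz continuous in the state with a common constant `lipschitz_q`. Under that assumption, if the best action beats the runner-up by a gap Δ at state s, moving a distance d changes each value by at most `lipschitz_q * d`, so the gap stays positive while d < Δ / (2 * `lipschitz_q`). The function name, its parameters, and the exact form of the bound are illustrative and may differ from the paper's own definition.

```python
def influence_radius(q_values, lipschitz_q):
    """Radius around a state within which the greedy action stays dominant.

    q_values:    Q(s, a) for each action a at a single state s
                 (assumed exact here; hypothetical input).
    lipschitz_q: assumed common Lipschitz constant of s -> Q(s, a).
    """
    if len(q_values) < 2:
        raise ValueError("need at least two actions to compare")
    if lipschitz_q <= 0:
        raise ValueError("Lipschitz constant must be positive")

    # Gap between the best and the second-best action at this state.
    ranked = sorted(q_values, reverse=True)
    gap = ranked[0] - ranked[1]

    # Each Q-value can drift by at most lipschitz_q * d at distance d,
    # so the gap shrinks by at most 2 * lipschitz_q * d.
    return gap / (2.0 * lipschitz_q)


# Three actions with values 1.0, 0.5 and 0.0, and an assumed constant of 2.0.
radius = influence_radius([1.0, 0.5, 0.0], lipschitz_q=2.0)
print(radius)  # 0.125
```

A tie between the two best actions gives a radius of zero, which matches the intuition that no neighbourhood can be guaranteed when domination at the state itself is not strict.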
