Approximate policy iteration using large-margin classifiers

Lagoudakis Michael

Full Record

URI:

http://purl.tuc.gr/dl/dias/B95FD666-3683-44DB-8681-8CB3C2DFEC7B

Year

2003

Type

Full Conference Paper


Bibliographic Citation

M.G. Lagoudakis and R. Parr, “Approximate policy iteration using large-margin classifiers,” in Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), 2003, pp. 1432–1434.


Abstract

We present an approximate policy iteration algorithm that uses rollouts to estimate the value of each action under a given policy in a subset of states and a classifier to generalize and learn the improved policy over the entire state space. Using a multiclass support vector machine as the classifier, we obtained successful results on the inverted pendulum and the bicycle balancing and riding domains.
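
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the corridor domain, the function names (`step`, `rollout`, `make_classifier_policy`, `approximate_policy_iteration`), the `slip` and `margin` parameters, and the margin-based filtering of training examples are all assumptions made for the example. A one-nearest-neighbour rule stands in for the multiclass support vector machine so that the code needs only the Python standard library.

```python
import random

# Hypothetical toy domain (not from the paper): integer positions 0..GOAL on a
# corridor; the agent receives reward 1 when it reaches GOAL, 0 otherwise.
GOAL, GAMMA, ACTIONS = 10, 0.9, (-1, +1)


def step(state, action, rng, slip):
    """Apply one move; with probability `slip` the move is reversed."""
    if rng.random() < slip:
        action = -action
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def rollout(state, action, policy, rng, slip, horizon=30):
    """One sampled return: take `action` first, then follow `policy`."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        state, reward, done = step(state, action, rng, slip)
        total += discount * reward
        if done:
            break
        discount *= GAMMA
        action = policy(state)
    return total


def make_classifier_policy(examples, fallback):
    """1-nearest-neighbour classifier over (state, action) pairs.

    Stand-in for the SVM named in the abstract; assumed for illustration.
    """
    if not examples:
        return fallback

    def policy(state):
        return min(examples, key=lambda ex: abs(ex[0] - state))[1]

    return policy


def always_left(state):
    """Deliberately poor initial policy."""
    return -1


def approximate_policy_iteration(sample_states, iterations=5, n_rollouts=10,
                                 slip=0.1, margin=0.01, seed=0):
    rng = random.Random(seed)
    policy = always_left
    for _ in range(iterations):
        examples = []
        for s in sample_states:
            # Estimate the value of each action at s by averaging rollouts.
            q = {a: sum(rollout(s, a, policy, rng, slip)
                        for _ in range(n_rollouts)) / n_rollouts
                 for a in ACTIONS}
            best = max(q, key=q.get)
            # Assumed heuristic: keep s as a training example only when one
            # action is clearly better than the other.
            if q[best] - min(q.values()) > margin:
                examples.append((s, best))
        # Generalize the sampled improvements to the whole state space.
        policy = make_classifier_policy(examples, policy)
    return policy
```

With `slip=0.0` the dynamics are deterministic, and calling `approximate_policy_iteration((1, 3, 5, 7, 9), slip=0.0)` should yield a policy that moves right from every position. A real application would replace the nearest-neighbour rule with a large-margin classifier and the corridor with the domain's own simulator.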
