The work titled Approximate policy iteration using large-margin classifiers by the creator(s) Lagoudakis Michael is available under the Creative Commons Attribution 4.0 International license
Bibliographic Citation
M.G. Lagoudakis and R. Parr, “Approximate policy iteration using large-margin classifiers,” in Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), 2003, pp. 1432–1434.
We present an approximate policy iteration algorithm that uses rollouts to estimate the value of each action under a given policy in a subset of states and a classifier to generalize and learn the improved policy over the entire state space. Using a multiclass support vector machine as the classifier, we obtained successful results on the inverted pendulum and the bicycle balancing and riding domains.
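The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The 10-state chain domain, the function names, and the parameters are all hypothetical. A 1-nearest-neighbour rule stands in for the multiclass support vector machine so that the example needs only the standard library. Dropping states whose actions tie is a simplification made for this sketch.

```python
import random

N_STATES = 10        # hypothetical chain: states 0..9, goal at state 9
ACTIONS = (-1, +1)   # move left or move right
GAMMA = 0.9          # discount factor (assumed value)

def step(state, action):
    """Deterministic toy dynamics: reward 1 on reaching the goal state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def rollout(state, action, policy, horizon=30):
    """Estimate Q(state, action): take `action` once, then follow `policy`."""
    total, discount = 0.0, 1.0
    s, a = state, action
    for _ in range(horizon):
        s, reward, done = step(s, a)
        total += discount * reward
        discount *= GAMMA
        if done:
            break
        a = policy(s)
    return total

def fit_classifier(examples):
    """1-nearest-neighbour stand-in for the classifier (NOT an SVM).

    Ties in distance go to the example that appears first in `examples`.
    """
    def policy(state):
        nearest = min(examples, key=lambda ex: abs(ex[0] - state))
        return nearest[1]
    return policy

def approximate_policy_iteration(sample_states, iterations=10, seed=0):
    rng = random.Random(seed)
    policy = lambda s: rng.choice(ACTIONS)  # arbitrary initial policy
    for _ in range(iterations):
        examples = []
        for s in sample_states:
            # Rollouts estimate the value of each action under the current policy.
            q = {a: rollout(s, a, policy) for a in ACTIONS}
            best = max(q, key=q.get)
            # Sketch-level simplification: skip states where the actions tie.
            if q[best] > min(q.values()):
                examples.append((s, best))
        if not examples:
            break
        # The classifier generalizes the improved actions to all states.
        policy = fit_classifier(examples)
    return policy

# Rollouts are run only from a subset of the states (the even ones here).
learned = approximate_policy_iteration([8, 6, 4, 2, 0])
print([learned(s) for s in range(N_STATES - 1)])
```

In this toy run, rollouts start only from the even-numbered states, and the classifier supplies the action for the odd-numbered ones. On the domains reported in the abstract, the state space is continuous and a large-margin classifier takes the place of the nearest-neighbour rule.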