Michail G. Lagoudakis and Michael L. Littman. (2000, June). Algorithm selection using reinforcement learning. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.472.7494&rep=rep1&type=pdf
Many computational problems can be solved by multiple algorithms, with different algorithms fastest for different problem sizes, input distributions, and hardware characteristics. We consider the problem of algorithm selection: dynamically choosing an algorithm to attack an instance of a problem with the goal of minimizing the overall execution time. We formulate the problem as a kind of Markov decision process (MDP), and use ideas from reinforcement learning to solve it. This paper introduces a kind of MDP that models the algorithm selection problem by allowing multiple state transitions. The well-known Q-learning algorithm is adapted for this case in a way that combines both Monte Carlo and temporal-difference methods. This work also uses, and extends to control problems, the Least-Squares Temporal Difference algorithm (LSTD(0)) of Boyan. The experimental study focuses on the classic problems of order statistic selection and sorting. The encouraging results reveal the potential of applying learning methods to traditional computational problems.
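The core idea of the abstract — learning which algorithm is cheapest for which instance sizes — can be illustrated with a minimal sketch. This is not the paper's exact formulation: it uses one-step episodes with sample-average Q-value updates, states are problem-size buckets, actions are two candidate sorting algorithms, and a synthetic analytic cost model stands in for measured execution time so the example is reproducible. All names, constants, and the cost model are assumptions for illustration.

```python
import math
import random

# Sketch of algorithm selection as a learning problem (illustrative, not the
# paper's method). State = log-size bucket of the instance; action = which
# sorting algorithm to run; Q[(s, a)] estimates the expected cost of running
# algorithm a on instances in bucket s. The policy picks the min-cost action.

ACTIONS = ["insertion_sort", "quicksort"]

def cost(action, n):
    # Synthetic cost model (an assumption, not measured timings):
    # insertion sort ~ n^2 / 4 comparisons; quicksort ~ 2 n log2 n
    # plus a fixed per-call overhead that penalizes it on tiny inputs.
    if action == "insertion_sort":
        return n * n / 4.0
    return 2.0 * n * math.log2(max(n, 2)) + 30.0

def bucket(n):
    # Discretize instance size into a small number of states.
    return min(int(math.log2(max(n, 1))), 10)

def train(episodes=100_000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(11) for a in ACTIONS}
    count = {key: 0 for key in q}
    for _ in range(episodes):
        n = rng.randint(2, 2048)           # draw a random instance size
        s = bucket(n)
        if rng.random() < epsilon:         # epsilon-greedy exploration
            a = rng.choice(ACTIONS)
        else:
            a = min(ACTIONS, key=lambda x: q[(s, x)])
        # Sample-average update: Q converges to the mean cost per (state, action).
        count[(s, a)] += 1
        q[(s, a)] += (cost(a, n) - q[(s, a)]) / count[(s, a)]
    return q

def policy(q, n):
    # Select the algorithm with the lowest estimated cost for this size.
    s = bucket(n)
    return min(ACTIONS, key=lambda a: q[(s, a)])

q = train()
print(policy(q, 8))     # small instances favor insertion sort
print(policy(q, 2000))  # large instances favor quicksort
```

Under this cost model the learned policy recovers the familiar hybrid-sort heuristic: insertion sort below a size threshold, quicksort above it. The paper's setting is richer — recursive algorithms induce multiple state transitions per decision, which motivates its Monte Carlo / temporal-difference hybrid.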