Institutional Repository
Technical University of Crete


Monte Carlo tree search in the strategy game "Settlers of Catan"

Karamalegos Emmanouil

Full Record


URI: http://purl.tuc.gr/dl/dias/1CCC4417-B1AA-4C2E-B33B-0F6BEFDED36E
Year: 2016
Type: Diploma Thesis
Bibliographic Citation: Emmanouil Karamalegos, "Monte Carlo tree search in the strategy game "Settlers of Catan"", Diploma Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2016. https://doi.org/10.26233/heallink.tuc.66891

Abstract

Classic approaches to game AI require either high-quality domain knowledge or a long time to generate effective AI behavior. Monte Carlo Tree Search (MCTS) is a search method that combines the precision of tree search with the generality of random sampling. The family of MCTS algorithms has achieved promising results with perfect-information games such as Go. In our work, we apply Monte Carlo Tree Search to the non-deterministic game "Settlers of Catan", a multi-player board game (now also played on the web) that necessitates strategic planning and negotiation skills. We implemented an agent which, for the first time, takes into consideration all aspects of the game, using no domain knowledge.

In our work, we experiment with a reinforcement learning method, Value of Perfect Information (VPI), and two bandit methods, namely the Upper Confidence Bound (UCB) and Bayesian Upper Confidence Bound methods. Such methods attempt to strike a balance between exploitation and exploration when constructing the search tree.

For the first time, we implemented an agent that takes into consideration the complete rule set of the game and makes it possible to negotiate trades among all players. Furthermore, we equipped our agent with an alternative initial-placement strategy found in the literature, which is based on the analysis of human behavior in Settlers of Catan games.

In our experiments we compare the performance of our methods against each other and against appropriate benchmarks (e.g., JSettlers agents), and examine the effect that the number of simulations and the simulation depth have on the algorithms' performance. Our results suggest that VPI scores significantly better than bandit-based methods, even when these employ a much higher number of simulations. In addition, the simulation depth needs to be tuned so that the method neither gets lost in deep simulations of improbable scenarios nor terminates too early without providing a proper estimate of the upcoming moves. Finally, our results indicate that our agent's performance improves when the alternative, human-behavior-based initial placement method is used.
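The trade-offs the abstract describes, bandit-style balancing of exploitation against exploration and the choice of simulation depth, can be made concrete with a short sketch. The following is a minimal, generic MCTS loop with UCB1 selection and a depth-limited random rollout. It is not the thesis's implementation: the Game interface (legal_moves, apply, is_terminal, reward) is a hypothetical stand-in for a Settlers of Catan engine, and the constants are illustrative.

import math
import random

class Node:
    """One state in the search tree."""
    def __init__(self, state, parent=None, move=None):
        self.state = state            # hypothetical Game object
        self.parent = parent
        self.move = move              # move that led to this node
        self.children = []
        self.untried = state.legal_moves()
        self.visits = 0
        self.value = 0.0              # accumulated rollout reward

def ucb1(child, parent_visits, c=math.sqrt(2)):
    # Exploitation (mean reward) plus an exploration bonus:
    # UCB1 = mean + c * sqrt(ln(N_parent) / N_child)
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)

def select(node):
    # Descend through fully expanded nodes, taking the UCB1-max child.
    while not node.untried and node.children:
        node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
    return node

def expand(node):
    # Add one previously untried move as a new child, if any remain.
    if node.untried:
        move = node.untried.pop()
        child = Node(node.state.apply(move), parent=node, move=move)
        node.children.append(child)
        return child
    return node

def rollout(state, max_depth):
    # Depth-limited random simulation: too deep wastes effort on
    # improbable lines, too shallow yields a poor estimate -- the
    # trade-off the abstract's results discuss.
    depth = 0
    while not state.is_terminal() and depth < max_depth:
        state = state.apply(random.choice(state.legal_moves()))
        depth += 1
    return state.reward()   # value from the root player's perspective

def backpropagate(node, reward):
    # Propagate the rollout result up to the root.
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent

def mcts(root_state, n_simulations=1000, max_depth=40):
    root = Node(root_state)
    for _ in range(n_simulations):
        leaf = expand(select(root))
        reward = rollout(leaf.state, max_depth)
        backpropagate(leaf, reward)
    # Return the most-visited move, a common robust final choice.
    return max(root.children, key=lambda ch: ch.visits).move

In a bandit-based agent the ucb1 criterion drives selection; a VPI-based agent, which the abstract reports performing better, would replace that criterion with a value-of-perfect-information estimate, while n_simulations and max_depth correspond to the two parameters whose effect the experiments examine.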
