<efrbr:recordSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:efrbr="http://vfrbr.info/efrbr/1.1" xmlns:efrbr-work="http://vfrbr.info/efrbr/1.1/work" xmlns:efrbr-expression="http://vfrbr.info/efrbr/1.1/expression" xmlns:efrbr-manifestation="http://vfrbr.info/efrbr/1.1/manifestation" xmlns:efrbr-person="http://vfrbr.info/efrbr/1.1/person" xmlns:efrbr-corporateBody="http://vfrbr.info/efrbr/1.1/corporateBody" xmlns:efrbr-concept="http://vfrbr.info/efrbr/1.1/concept" xmlns:efrbr-structure="http://vfrbr.info/efrbr/1.1/structure" xmlns:efrbr-responsible="http://vfrbr.info/efrbr/1.1/responsible" xmlns:efrbr-subject="http://vfrbr.info/efrbr/1.1/subject" xmlns:efrbr-other="http://vfrbr.info/efrbr/1.1/other" xsi:schemaLocation="http://vfrbr.info/efrbr/1.1 http://vfrbr.info/schemas/1.1/efrbr.xsd"><efrbr:entities><efrbr-work:work identifier="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513"><efrbr-work:titleOfTheWork>Decentralized Bayesian reinforcement learning for online agent collaboration</efrbr-work:titleOfTheWork></efrbr-work:work><efrbr-expression:expression identifier="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513"><efrbr-expression:titleOfTheExpression>Decentralized Bayesian reinforcement learning for online agent collaboration</efrbr-expression:titleOfTheExpression><efrbr-expression:formOfExpression vocabulary="DIAS:TYPES">
            Πλήρης Δημοσίευση σε Συνέδριο
            Conference Full Paper
         </efrbr-expression:formOfExpression><efrbr-expression:dateOfExpression type="issued">2015-09-30</efrbr-expression:dateOfExpression><efrbr-expression:dateOfExpression type="published">2012</efrbr-expression:dateOfExpression><efrbr-expression:languageOfExpression vocabulary="iso639-1">en</efrbr-expression:languageOfExpression><efrbr-expression:summarizationOfContent>Solving complex but structured problems in a decentralized manner via multiagent collaboration has received much attention in recent years. This is natural, as on the one hand, multiagent systems usually possess a structure that determines the allowable interactions among the agents; and on the other hand, the single most pressing need in a cooperative multiagent system is to coordinate the local policies of autonomous agents with restricted capabilities to serve a system-wide goal. The presence of uncertainty makes this even more challenging, as the agents face the additional need to learn the unknown environment parameters while forming (and following) local policies in an online fashion. In this paper, we provide the first Bayesian reinforcement learning (BRL) approach for distributed coordination and learning in a cooperative multiagent system by devising two solutions to this type of problem. More specifically, we show how the Value of Perfect Information (VPI) can be used to perform efficient decentralised exploration in both model-based and model-free BRL, and in the latter case, provide a closed-form solution for VPI, correcting a decade-old result by Dearden, Friedman and Russell.
To evaluate these solutions, we present experimental results comparing their relative merits, and demonstrate empirically that both solutions outperform an existing multiagent learning method, representative of the state of the art.</efrbr-expression:summarizationOfContent><efrbr-expression:useRestrictionsOnTheExpression type="creative-commons">http://creativecommons.org/licenses/by/4.0/</efrbr-expression:useRestrictionsOnTheExpression><efrbr-expression:note type="page range">417-424</efrbr-expression:note><efrbr-expression:note type="conference name">11th International Conference on Autonomous Agents and Multiagent Systems</efrbr-expression:note><efrbr-expression:note type="proceedings title">Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems</efrbr-expression:note></efrbr-expression:expression><efrbr-person:person identifier="10C47CC1-E486-4925-BADB-3D547F30ACD6"><efrbr-person:nameOfPerson vocabulary="">
            Parr G.
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="EF4317A2-3A7E-45E2-9B90-D93DC7EF7F22"><efrbr-person:nameOfPerson vocabulary="">
            Farinelli A.
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="E6105C9A-5684-4C7C-9439-EAB88F5AC6CA"><efrbr-person:nameOfPerson vocabulary="">
            Rogers A. 
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="http://users.isc.tuc.gr/~gchalkiadakis"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Chalkiadakis Georgios
            Χαλκιαδακης Γεωργιος
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="BE202C78-742F-4805-9C38-A38FDFC06020"><efrbr-person:nameOfPerson vocabulary="">
            Jennings N. R.
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="28BFA296-44A3-447C-BB13-319C4716D540"><efrbr-person:nameOfPerson vocabulary="">
            McClean S.
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="E8F4EA1C-BB12-4E94-AAAB-F258473DA8DE"><efrbr-person:nameOfPerson vocabulary="">
            Teacy W. T. L. 
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-corporateBody:corporateBody identifier="EFFEA7CE-BF83-40B3-8B43-C0F7E54FB859"><efrbr-corporateBody:nameOfTheCorporateBody vocabulary="">
            International Foundation for Autonomous Agents and Multiagent Systems 
            IFAAMAS
         </efrbr-corporateBody:nameOfTheCorporateBody></efrbr-corporateBody:corporateBody><efrbr-concept:concept identifier="D96393AC-9A5A-4B24-8D1D-A7869CF7798F"><efrbr-concept:termForTheConcept>
            Multiagent learning
         </efrbr-concept:termForTheConcept></efrbr-concept:concept><efrbr-concept:concept identifier="31AA53D1-BD44-4C1D-B9D0-50F1A73A3344"><efrbr-concept:termForTheConcept>
            Bayesian techniques
         </efrbr-concept:termForTheConcept></efrbr-concept:concept><efrbr-concept:concept identifier="8F17148D-5A3A-4C0D-8DDC-52FB8E2F07FD"><efrbr-concept:termForTheConcept>
            Uncertainty
         </efrbr-concept:termForTheConcept></efrbr-concept:concept></efrbr:entities><efrbr:relationships><efrbr-structure:structureRelations><efrbr-structure:realizedThrough sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="expression" targetURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513"/></efrbr-structure:structureRelations><efrbr-responsible:responsibleRelations><efrbr-responsible:createdBy sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="10C47CC1-E486-4925-BADB-3D547F30ACD6"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="10C47CC1-E486-4925-BADB-3D547F30ACD6" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="EF4317A2-3A7E-45E2-9B90-D93DC7EF7F22" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="E6105C9A-5684-4C7C-9439-EAB88F5AC6CA" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="http://users.isc.tuc.gr/~gchalkiadakis" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="BE202C78-742F-4805-9C38-A38FDFC06020" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="28BFA296-44A3-447C-BB13-319C4716D540" role="author"/><efrbr-responsible:realizedBy 
sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="E8F4EA1C-BB12-4E94-AAAB-F258473DA8DE" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="person" targetURI="EFFEA7CE-BF83-40B3-8B43-C0F7E54FB859" role="publisher"/></efrbr-responsible:responsibleRelations><efrbr-subject:subjectRelations><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="concept" targetURI="D96393AC-9A5A-4B24-8D1D-A7869CF7798F"/><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="concept" targetURI="31AA53D1-BD44-4C1D-B9D0-50F1A73A3344"/><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/E504E59F-683F-49F8-9E3F-41D26D6FC513" targetEntity="concept" targetURI="8F17148D-5A3A-4C0D-8DDC-52FB8E2F07FD"/></efrbr-subject:subjectRelations><efrbr-other:otherRelations/></efrbr:relationships></efrbr:recordSet>