Learning to Win by Reading Manuals in a Monte-Carlo Framework

S.R.K. Branavan, Regina Barzilay
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
{branavan, regina}@csail.mit.edu

David Silver
Department of Computer Science, University College London
d.silver@cs.ucl.ac.uk

Abstract

This paper presents a novel approach for leveraging automatically extracted textual knowledge to improve the performance of control applications such as games. Our ultimate goal is to enrich a stochastic player with high-level guidance expressed in text. Our model jointly learns to identify text that is relevant to a given game state in addition to learning game strategies guided by the selected text. Our method operates in the Monte-Carlo search framework and learns both text analysis and game strategies based only on environment feedback. We apply our approach to the complex strategy game Civilization II, using the official game manual as the text guide. Our results show that a linguistically-informed game-playing agent significantly outperforms its language-unaware counterpart, yielding a 27% absolute improvement and winning over 78% of games when playing against the built-in AI of Civilization II.¹

1 Introduction

In this paper, we study the task of grounding linguistic analysis in control applications such as computer games. In these applications, an agent attempts to optimize a utility function (e.g., game score) by learning to select situation-appropriate actions. In complex domains, finding a winning strategy is challenging even for humans.
Therefore, human players typically rely on manuals and guides that describe promising tactics and provide general advice about the underlying task. Surprisingly, such textual information has never been utilized in control algorithms despite its potential to greatly improve performance. The natural .

¹ The code, data, and complete experimental setup for this work are available at http://groups.csail.mit.edu/rbg/code/civ.