Concept-to-text Generation via Discriminative Reranking

Ioannis Konstas and Mirella Lapata
Institute for Language, Cognition and Computation
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB
i.konstas@sms.ed.ac.uk, mlap@inf.ed.ac.uk

Abstract

This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem. We define a probabilistic context-free grammar that describes the structure of the input (a corpus of database records and text describing some of them) and represent it compactly as a weighted hypergraph. The hypergraph structure encodes exponentially many derivations, which we rerank discriminatively using local and global features. We propose a novel decoding algorithm for finding the best scoring derivation and generating in this setting. Experimental evaluation on the ATIS domain shows that our model outperforms a competitive discriminative system both using BLEU and in a judgment elicitation study.

1 Introduction

Concept-to-text generation broadly refers to the task of automatically producing textual output from non-linguistic input such as databases of records, logical form, and expert system knowledge bases (Reiter and Dale, 2000). A variety of concept-to-text generation systems have been engineered over the years, with considerable success (e.g., Dale et al., 2003; Reiter et al., 2005; Green, 2006; Turner et al., 2009).
Unfortunately, it is often difficult to adapt them across different domains, as they rely mostly on handcrafted components.

In this paper we present a data-driven approach to concept-to-text generation that is domain-independent, conceptually simple, and flexible. Our generator learns from a set of database records and textual descriptions (for some of them). An example from the air travel .