Towards Developing Generation Algorithms for Text-to-Text Applications

Radu Soricut and Daniel Marcu
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
{radu,marcu}@isi.edu

Abstract

We describe a new sentence realization framework for text-to-text applications. This framework uses IDL-expressions as a representation formalism, and a generation mechanism based on algorithms for intersecting IDL-expressions with probabilistic language models. We present both theoretical and empirical results concerning the correctness and efficiency of these algorithms.

1 Introduction

Many of today's most popular natural language applications - Machine Translation, Summarization, Question Answering - are text-to-text applications. That is, they produce textual outputs from inputs that are also textual. Because these applications need to produce well-formed text, it would appear natural that they are the favorite testbed for generic generation components developed within the Natural Language Generation (NLG) community.

Over the years, several proposals of generic NLG systems have been made: Penman (Matthiessen and Bateman, 1991), FUF (Elhadad, 1991), Nitrogen (Knight and Hatzivassiloglou, 1995), Fergus (Bangalore and Rambow, 2000), HALogen (Langkilde-Geary, 2002), Amalgam (Corston-Oliver et al., 2002), etc. Instead of relying on such generic NLG systems, however, most of the current text-to-text applications use other means to address the generation need.
In Machine Translation, for example, sentences are produced using application-specific decoders inspired by work on speech recognition (Brown et al., 1993), whereas in Summarization, summaries are produced either as extracts or using task-specific strategies (Barzilay, 2003). The main reason text-to-text applications do not usually involve generic NLG systems is that such applications do not have access to the kind of information that the input representation formalisms of current NLG systems require.