Methods and Practical Issues in Evaluating Alignment Techniques

Philippe Langlais
CTT, KTH, SE-10044 Stockholm
CERI-LIA, AGROPARC, BP 1228, F-84911 Avignon Cedex 9
Philippe.Langlais@speech.kth.se

Michel Simard
RALI-DIRO, Univ. de Montréal, Quebec, Canada H3C 3J7
simardm@IRO.UMontreal.CA

Jean Veronis
LPL, Univ. de Provence, 29 Av. R. Schuman, F-13621 Aix-en-Provence Cedex 1
veronis@univ-aix.fr

Abstract

This paper describes the work achieved in the first half of a 4-year cooperative research project (ARCADE), financed by AUPELF-UREF. The project is devoted to the evaluation of parallel text alignment techniques. In its first period, ARCADE ran a competition between six systems on a sentence-to-sentence alignment task, which yielded two main types of results. First, a large reference bilingual corpus comprising texts of different genres was created, each presenting various degrees of difficulty with respect to the alignment task. Second, significant methodological progress was made both on the evaluation protocols and metrics and on the algorithms used by the different systems. For the second phase, which is now underway, ARCADE has been opened to a larger number of teams, who will tackle the problem of word-level alignment.

1 Introduction

In the last few years, there has been a growing interest in parallel text alignment techniques.
These techniques attempt to map various textual units to their translation and have proven useful for a wide range of applications and tools. A simple example of such a tool is probably the TransSearch bilingual concordancing system (Isabelle et al., 1993), which allows a user to query a large archive of existing translations in order to find ready-made solutions to specific translation problems. Such a tool has proved extremely useful not only for translators, but also for bilingual lexicographers (Langlois, 1996) and terminologists (Dagan and Church, 1994). More sophisticated applications based on alignment technology have also been the object of recent work, such as the