TAILIEUCHUNG - Scientific report: "A Statistical Model for Lost Language Decipherment"

A Statistical Model for Lost Language Decipherment

Benjamin Snyder and Regina Barzilay
CSAIL, Massachusetts Institute of Technology

Kevin Knight
ISI, University of Southern California

bsnyder regina @ knight@

Abstract

In this paper we propose a method for the automatic decipherment of lost languages. Given a non-parallel corpus in a known related language, our model produces both alphabetic mappings and translations of words into their corresponding cognates. We employ a non-parametric Bayesian framework to simultaneously capture both low-level character mappings and high-level morphemic correspondences. This formulation enables us to encode some of the linguistic intuitions that have guided human decipherers. When applied to the ancient Semitic language Ugaritic, the model correctly maps 29 of 30 letters to their Hebrew counterparts, and deduces the correct Hebrew cognate for 60% of the Ugaritic words which have cognates in Hebrew.

1 Introduction

Dozens of lost languages have been deciphered by humans in the last two centuries. In each case, the decipherment has been considered a major intellectual breakthrough, often the culmination of decades of scholarly efforts. Computers have played no role in the decipherment of any of these languages. In fact, skeptics argue that computers do not possess the logic and intuition required to unravel the mysteries of ancient scripts. In this paper, we demonstrate that at least some of this logic and intuition can be successfully modeled, allowing computational tools to be used in the decipherment process.
1 "Successful archaeological decipherment has turned out to require a synthesis of logic and intuition ... that computers do not and presumably cannot possess." A. Robinson, Lost Languages: The Enigma of the World's Undeciphered Scripts (2002).

Our definition of the computational decipherment task closely follows the setup typically faced by human decipherers (Robinson, 2002). Our input consists of texts in a lost language and a ...
