STOCHASTIC MODELING OF LANGUAGE VIA SENTENCE SPACE PARTITIONING

Alex Martelli
IBM Rome Scientific Center, via Giorgione 159, Rome, Italy

ABSTRACT

In some computer applications of linguistics (such as maximum-likelihood decoding of speech or handwriting), the purpose of the language-handling component (the Language Model) is to estimate the linguistic (a priori) probability of arbitrary natural-language sentences. This paper discusses theoretical and practical issues regarding an approach to building such a language model based on any equivalence criterion defined on incomplete sentences. It also reports experimental results and measurements performed on such a model of the Italian language, which is part of the prototype for the recognition of spoken Italian built at the IBM Rome Scientific Center.

STOCHASTIC MODELS OF LANGUAGE

In some computer applications it is necessary to have a way to estimate the probability of any arbitrary natural-language sentence. A prominent example is maximum-likelihood speech recognition, as discussed in [1, 4, 7], whose underlying mathematical approach can be generalized to the recognition of natural language encoded in any medium (e.g. handwriting). The subsystem which estimates this probability can be called a stochastic model of the target language.
If the sentence is to be recognized while it is being produced, as is necessary for a real-time application, the computation of its probability should proceed left-to-right (i.e. word by word, from the beginning towards the end of the sentence), allowing the application of fast tree-search algorithms such as stack decoding [5]. Left-to-right computation of the probability of any word string is made possible by a formal manipulation based on the definition of conditional probability: if $w_i$ is the i-th word in the sequence $W$ of length $N$, then

$$P(W) = \prod_{i=1}^{N} P(w_i \mid w_1, \ldots, w_{i-1}) \tag{1}$$

In other terms, the probability of a sequence of words is the product of the conditional probability of each word given all of the previous ones. As a formal step this holds for ...
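The left-to-right product of conditional probabilities described above can be illustrated with a minimal Python sketch. It is not the model built at the IBM Rome Scientific Center: the equivalence criterion used here (two incomplete sentences are equivalent when they end in the same last `order` words) is only an assumed example of a criterion defined on incomplete sentences, and all function names are hypothetical. Probabilities are plain relative frequencies with no smoothing, so any unseen event gets probability zero.

```python
import math
from collections import defaultdict


def equivalence_class(history, order=2):
    """Assumed criterion: histories sharing their last `order` words are equivalent."""
    return tuple(history[-order:]) if order > 0 else ()


def train(sentences, order=2):
    """Count, for each equivalence class of histories, the words that follow it."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        for i, word in enumerate(sentence):
            counts[equivalence_class(sentence[:i], order)][word] += 1
    return counts


def conditional_probability(counts, history, word, order=2):
    """Relative-frequency estimate of P(word | class of history); 0.0 if unseen."""
    bucket = counts.get(equivalence_class(history, order))
    if not bucket:
        return 0.0
    return bucket.get(word, 0) / sum(bucket.values())


def sentence_log_probability(counts, sentence, order=2):
    """Accumulate log P(w_i | w_1 .. w_{i-1}) word by word, left to right."""
    log_p = 0.0
    for i, word in enumerate(sentence):
        p = conditional_probability(counts, sentence[:i], word, order)
        if p == 0.0:
            return float("-inf")  # unseen event: no smoothing in this sketch
        log_p += math.log(p)
    return log_p


# Toy corpus, invented purely for illustration.
corpus = [["il", "cane", "dorme"], ["il", "gatto", "dorme"]]
model = train(corpus, order=1)
print(math.exp(sentence_log_probability(model, ["il", "cane", "dorme"], order=1)))
```

Because each step only needs the words seen so far, the running sum can be updated as each new word arrives, which is what makes this form of the computation usable inside a left-to-right tree search. Working in log space avoids numerical underflow on long sentences.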
