A Structured Language Model

Ciprian Chelba
The Johns Hopkins University, CLSP
Barton Hall 320, 3400 N. Charles Street, Baltimore, MD 21218

Abstract

The paper presents a language model that develops syntactic structure and uses it to extract meaningful information from the word history, thus enabling the use of long-distance dependencies. The model assigns a probability to every joint sequence of words-binary-parse-structure with headword annotation. The model, its probabilistic parametrization, and a set of experiments meant to evaluate its predictive power are presented.

[Figure 1: Partial parse of "the dog I heard yesterday barked"]
[Figure 2: A word-parse k-prefix]

1 Introduction

The main goal of the proposed project is to develop a language model (LM) that uses syntactic structure. The principles that guided this proposal were: the model will develop syntactic knowledge as a built-in feature; it will assign a probability to every joint sequence of words-binary-parse-structure; and the model should operate in a left-to-right manner so that it would be possible to decode word lattices provided by an automatic speech recognizer. The model consists of two modules: a next-word predictor, which makes use of syntactic structure as developed by a parser. The operations of these two modules are intertwined.

2 The Basic Idea and Terminology

Consider predicting the word "barked" in the sentence "the dog I heard yesterday barked again". A 3-gram approach would predict "barked" from ("heard", "yesterday"), whereas it is clear that the predictor should use the word "dog", which is outside the reach of even 4-grams. Our assumption is that what enables us to make a good prediction of "barked" is the syntactic structure in the past. The correct partial parse of the word history when predicting "barked" is shown in Figure 1. The word "dog" is called the headword of the constituent "(the dog ...)", and "dog" is an exposed headword when predicting "barked": the topmost headword in the largest constituent that contains it.
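
To make the contrast concrete, the short Python sketch below compares the context a 3-gram predictor sees when predicting "barked" with an exposed-headword context suggested by Figure 1. This is not code from the paper: the function names, the hand-built headword list, and the choice of how many context items to keep are illustrative assumptions.

# A minimal sketch, assuming a predictor that conditions on recently exposed
# headwords rather than on the immediately preceding words. Everything below
# (names, data) is illustrative, not the paper's implementation.

def trigram_context(words, k):
    # A 3-gram predictor conditions on the two words immediately before position k.
    return tuple(words[max(0, k - 2):k])

def exposed_headword_context(exposed_heads, n=2):
    # Assumed structured-LM context: the last n exposed headwords, i.e. the
    # topmost headwords of the largest constituents built over the word prefix.
    return tuple(exposed_heads[-n:])

words = ["the", "dog", "I", "heard", "yesterday", "barked", "again"]
k = words.index("barked")

# Hand-built for the partial parse in Figure 1, where "dog" heads the
# constituent "(the dog ...)" covering the prefix (an assumption here).
exposed_heads = ["dog"]

print(trigram_context(words, k))                # ('heard', 'yesterday')
print(exposed_headword_context(exposed_heads))  # ('dog',)

Under this view the relevant conditioning word "dog" is recovered from the partial parse even though it lies four positions back, beyond the reach of the 3-gram context.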
