Scientific report: "A Stochastic Finite-State Morphological Parser for Turkish"

Haşim Sak, Tunga Güngör
Dept. of Computer Engineering, Boğaziçi University, TR-34342 Bebek, Istanbul, Turkey
gungort@

Murat Saraçlar
Dept. of Electrical and Electronics Engineering, Boğaziçi University, TR-34342 Bebek, Istanbul, Turkey

Abstract

This paper presents the first stochastic finite-state morphological parser for Turkish. The non-probabilistic parser is a standard finite-state transducer implementation of the two-level morphology formalism. A disambiguated text corpus of 200 million words is used to stochastize the morphotactics transducer, which is then composed with the morphophonemics transducer to obtain a stochastic morphological parser. We present two applications to evaluate the effectiveness of the stochastic parser: spelling correction and morphology-based language modeling for speech recognition.

1 Introduction

Turkish is an agglutinative language with a highly productive inflectional and derivational morphology. The computational aspects of Turkish morphology have been well studied, and several morphological parsers have been built (Oflazer, 1994; Güngör, 1995). In language processing applications, we may need to estimate a probability distribution over all word forms. For example, we need probability estimates for unigrams to rank misspelling suggestions for spelling correction. None of the previous studies for Turkish have addressed this problem.

For morphologically complex languages, estimating a probability distribution over a static vocabulary is not very desirable because of high out-of-vocabulary rates. It would be very convenient for a morphological parser, as a word generator/analyzer, to also output a probability estimate for each word it generates or analyzes. In this work, we build such a stochastic morphological parser for Turkish¹ and give two example applications for evaluation.

¹ The stochastic morphological parser is available for research purposes at http .
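To make the idea of "stochastizing" a morphotactics automaton concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical toy automaton (`ARCS`), hypothetical morpheme tags (`+Pl`, `+Loc`), and a four-analysis toy corpus in place of the disambiguated text. Each arc weight is set to the negative log of the arc's relative frequency among the arcs leaving the same state, and the probability of an analysis is the product of the arc probabilities along its path.

```python
import math
from collections import Counter

# Hypothetical toy morphotactics: state -> {morpheme label: next state}.
ARCS = {
    "root": {"ev": "noun", "göz": "noun"},
    "noun": {"+Pl": "plural", "+Loc": "case", "<end>": "final"},
    "plural": {"+Loc": "case", "<end>": "final"},
    "case": {"<end>": "final"},
}

# Hypothetical disambiguated analyses (one morpheme sequence per word token).
CORPUS = [
    ["ev"],
    ["ev", "+Pl"],
    ["ev", "+Loc"],
    ["göz", "+Pl", "+Loc"],
]


def path(analysis):
    """Yield the (state, label) arcs traversed by one analysis."""
    state = "root"
    for label in analysis + ["<end>"]:
        yield state, label
        state = ARCS[state][label]


def estimate_weights(corpus):
    """Return arc weights as -log(relative frequency per source state)."""
    arc_counts = Counter()
    state_counts = Counter()
    for analysis in corpus:
        for state, label in path(analysis):
            arc_counts[(state, label)] += 1
            state_counts[state] += 1
    return {
        arc: -math.log(count / state_counts[arc[0]])
        for arc, count in arc_counts.items()
    }


def analysis_probability(analysis, weights):
    """Sum arc costs along the path and convert back to a probability.

    Arcs never seen in the corpus get probability 0 (no smoothing here).
    """
    cost = 0.0
    for arc in path(analysis):
        if arc not in weights:
            return 0.0
        cost += weights[arc]
    return math.exp(-cost)


weights = estimate_weights(CORPUS)
seen = analysis_probability(["ev", "+Pl"], weights)     # occurs in CORPUS
unseen = analysis_probability(["göz", "+Loc"], weights)  # absent from CORPUS
print(f"{seen:.4f} {unseen:.4f}")
```

The second analysis never occurs in the toy corpus, yet it still receives a nonzero probability because every arc on its path was observed. That is the property the introduction asks for: probability estimates that are not tied to a static vocabulary. Such estimates could then be used, for instance, to sort candidate corrections for a misspelled word.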
