Scientific report: "INTEGRATING WORD BOUNDARY IDENTIFICATION WITH SENTENCE UNDERSTANDING"

INTEGRATING WORD BOUNDARY IDENTIFICATION WITH SENTENCE UNDERSTANDING

Kok Wee Gan
Department of Information Systems and Computer Science
National University of Singapore
Kent Ridge Crescent, Singapore 0511
Internet: gankw@

Abstract

Chinese sentences are written with no special delimiters such as space to indicate word boundaries. Existing Chinese NLP systems therefore employ preprocessors to segment sentences into words. Contrary to the conventional wisdom of separating this issue from the task of sentence understanding, we propose an integrated model that performs word boundary identification in lockstep with sentence understanding. In this approach, there is no distinction between rules for word boundary identification and rules for sentence understanding. These two functions are combined. Word boundary ambiguities are detected, especially the fallacious ones, when they block the primary task of discovering the inter-relationships among the various constituents of a sentence, which essentially is the essence of the understanding process. In this approach, statistical information is also incorporated, providing the system a quick and fairly reliable starting ground to carry out the primary task of relationship-building.

1 THE PROBLEM

Chinese sentences are written with no special delimiters such as space to indicate word boundaries.
Existing Chinese NLP systems therefore employ preprocessors to segment sentences into words. Many techniques have been developed for this task, from simple pattern matching methods (e.g., maximum matching, reverse maximum matching) (Wang et al., 1990; Kang and Zheng, 1991), to statistical methods (e.g., word association, relaxation) (Sproat and Shih, 1990; Fan and Tsai, 1988), to rule-based approaches (Huang, 1989; Yeh and Lee, 1991; He et al., 1991). However, it is observed that simple pattern matching methods and stochastic methods perform poorly in sentences such as (1), (2) and (3), where word boundary ambiguities exist.

(1) ta benren sheng le san ge haizi .
    She alone give birth to ASP
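
To make the weakness of pure pattern matching concrete, the following is a minimal sketch of maximum matching and reverse maximum matching applied to the romanized syllables of sentence (1). The lexicon, the function names, and the `max_len` parameter are assumptions made for illustration and do not come from the paper. The sketch further assumes that the ambiguity lies in the sequence "ben ren sheng", which can be read either as "benren" + "sheng" or as "ben" + "rensheng".

```python
def forward_max_match(syllables, lexicon, max_len=4):
    """Scan left to right, taking the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(syllables):
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            # A single syllable is always accepted as a fallback.
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def reverse_max_match(syllables, lexicon, max_len=4):
    """Scan right to left, taking the longest lexicon word ending at each position."""
    words, j = [], len(syllables)
    while j > 0:
        for n in range(min(max_len, j), 0, -1):
            candidate = "".join(syllables[j - n:j])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                j -= n
                break
    return words[::-1]


# Hypothetical toy lexicon; a real system would use a full dictionary.
LEXICON = {"ta", "ben", "benren", "ren", "rensheng", "sheng",
           "le", "san", "ge", "haizi"}
SENTENCE = ["ta", "ben", "ren", "sheng", "le", "san", "ge", "hai", "zi"]

forward = forward_max_match(SENTENCE, LEXICON)
backward = reverse_max_match(SENTENCE, LEXICON)
print(forward)   # ['ta', 'benren', 'sheng', 'le', 'san', 'ge', 'haizi']
print(backward)  # ['ta', 'ben', 'rensheng', 'le', 'san', 'ge', 'haizi']
```

Under these assumptions the two scans disagree on where the boundary falls, and neither has any information beyond the lexicon with which to decide which segmentation is the intended one.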
