Scientific report: "A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction"

Phil Blunsom, Department of Computer Science, University of Oxford
Trevor Cohn, Department of Computer Science, University of Sheffield

Abstract

In this work we address the problem of unsupervised part-of-speech induction by bringing together several strands of research into a single model. We develop a novel hidden Markov model incorporating sophisticated smoothing using a hierarchical Pitman-Yor process prior, providing an elegant and principled means of incorporating lexical characteristics. Central to our approach is a new type-based sampling algorithm for hierarchical Pitman-Yor models in which we track fractional table counts. In an empirical evaluation we show that our model consistently outperforms the current state-of-the-art across 10 languages.

1 Introduction

Unsupervised part-of-speech (PoS) induction has long been a central challenge in computational linguistics, with applications in human language learning and for developing portable language processing systems. Despite considerable research effort, progress in fully unsupervised PoS induction has been slow, and modern systems barely improve over the early Brown et al. (1992) approach (Christodoulopoulos et al., 2010). One popular means of improving tagging performance is to include supervision in the form of a tag dictionary or similar; however, this limits portability and also compromises any cognitive conclusions.
In this paper we present a novel approach to fully unsupervised PoS induction which uniformly outperforms the existing state-of-the-art across all our corpora in 10 different languages. Moreover, the performance of our unsupervised model approaches that of many existing semi-supervised systems, despite our method not receiving any human input. In this paper we present a Bayesian hidden Markov model (HMM) which uses a non-parametric prior to infer a latent tagging for a .
