Scientific report: "Approximation Lasso Methods for Language Modeling"

Jianfeng Gao, Microsoft Research, One Microsoft Way, Redmond WA 98052 USA, jfgao@
Hisami Suzuki, Microsoft Research, One Microsoft Way, Redmond WA 98052 USA, hisamis@

Abstract

Lasso is a regularization method for parameter estimation in linear models. It optimizes the model parameters with respect to a loss function subject to a constraint on model complexity. This paper explores the use of lasso for statistical language modeling for text input. Owing to the very large number of parameters, directly optimizing the penalized lasso loss function is impossible. Therefore, we investigate two approximation methods: the boosted lasso (BLasso) and the forward stagewise linear regression (FSLR). Both methods, when used with the exponential loss function, bear strong resemblance to the boosting algorithm, which has been used as a discriminative training method for language modeling. Evaluations on the task of Japanese text input show that BLasso is able to produce the best approximation to the lasso solution, and leads to a significant improvement in terms of character error rate over boosting and the traditional maximum likelihood estimation.

1 Introduction

Language modeling (LM) is fundamental to a wide range of applications. Recently, it has been shown that a linear model estimated using discriminative training methods, such as the boosting and perceptron algorithms, significantly outperforms a traditional word trigram model trained using maximum likelihood estimation (MLE) on several tasks such as speech recognition and Asian language text input (Bacchiani et al. 2004; Roark et al. 2004; Gao et al. 2005; Suzuki and Gao 2005). The success of discriminative training methods is largely due to the fact that, unlike the traditional approach (e.g., MLE) that maximizes a function (e.g., likelihood of training data) that is only loosely associated with error rate, discriminative training methods aim to directly minimize the error rate on training data even if
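For reference, the penalized lasso loss mentioned in the abstract is commonly written in the following general form. This is a sketch of the standard formulation, not necessarily the exact notation of the paper: the symbols \lambda (parameter vector), L (loss function, e.g. the exponential loss), \alpha (regularization weight), and D (number of features) are assumptions introduced here for illustration.

```latex
% Standard penalized (Lagrangian) form of the lasso; all symbol names are assumed.
\hat{\lambda}
  = \arg\min_{\lambda} \left[ L(\lambda) + \alpha \sum_{d=1}^{D} \lvert \lambda_d \rvert \right],
  \qquad \alpha \ge 0 .
```

Larger values of \alpha push more of the weights \lambda_d to exactly zero, which is how the L1 penalty controls model complexity. When D is very large, minimizing this objective directly becomes impractical, which is the motivation the abstract gives for approximating the solution with BLasso and FSLR.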
