Scientific report: "Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty"

Yoshimasa Tsuruoka, Jun'ichi Tsujii, Sophia Ananiadou
School of Computer Science, University of Manchester, UK
National Centre for Text Mining (NaCTeM), UK
Department of Computer Science, University of Tokyo, Japan

Abstract

Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework is attractive because it often requires much less training time in practice than batch training algorithms. However, L1-regularization, which is becoming popular in natural language processing because of its ability to produce compact models, cannot be efficiently applied in SGD training, due to the large dimensions of feature vectors and the fluctuations of approximate gradients. We present a simple method to solve these problems by penalizing the weights according to cumulative values for L1 penalty. We evaluate the effectiveness of our method in three applications: text chunking, named entity recognition, and part-of-speech tagging. Experimental results demonstrate that our method can produce compact and accurate models much more quickly than a state-of-the-art quasi-Newton method for L1-regularized log-linear models.
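The cumulative-penalty idea summarized in the abstract can be illustrated with a minimal Python sketch for a binary log-linear (logistic) model with sparse features. This excerpt does not give the update rule, so everything here is an assumption made for illustration: the function name `train_sgd_l1_cumulative`, the variables `u` (total L1 penalty each weight could have received so far) and `q` (penalty actually applied to each weight so far), the clipping rule, and the hyperparameters `c`, `eta`, and `epochs`. It should not be read as the authors' reference implementation.

```python
import math
import random


def train_sgd_l1_cumulative(data, n_features, c=0.01, eta=0.1, epochs=50, seed=0):
    """Sketch of SGD with a cumulative L1 penalty (illustrative assumptions).

    data: list of (features, label); features is a dict {index: value},
          label is 0 or 1.
    """
    rng = random.Random(seed)
    w = [0.0] * n_features
    q = [0.0] * n_features  # penalty actually applied to each weight so far
    u = 0.0                 # penalty each weight could have received so far
    n = len(data)
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            feats, y = data[k]
            score = sum(w[i] * v for i, v in feats.items())
            p = 1.0 / (1.0 + math.exp(-score))
            u += eta * c / n
            # Only the features active in this sample are touched, so the
            # cost per update does not depend on the full dimensionality.
            for i, v in feats.items():
                w[i] += eta * (y - p) * v  # ascent step on the log-likelihood
                z = w[i]
                # Catch up on the penalty this weight has not yet received,
                # clipping at zero so the penalty never flips the sign.
                if w[i] > 0:
                    w[i] = max(0.0, w[i] - (u + q[i]))
                elif w[i] < 0:
                    w[i] = min(0.0, w[i] + (u - q[i]))
                q[i] += w[i] - z
    return w


# Toy data (hypothetical): feature 0 co-occurs with label 1, feature 1 with label 0.
toy = [
    ({0: 1.0}, 1),
    ({1: 1.0}, 0),
    ({0: 1.0, 2: 1.0}, 1),
    ({1: 1.0, 2: 1.0}, 0),
]
weights = train_sgd_l1_cumulative(toy, n_features=3)
```

Tracking the cumulative penalty per weight, rather than applying the raw L1 subgradient at every step, is what lets a weight be driven to exactly zero despite noisy per-sample gradients, and it keeps each update limited to the active features.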
1 Introduction

Log-linear models (maximum entropy models) are among the most widely used probabilistic models in the field of natural language processing (NLP). The applications range from simple classification tasks such as text classification and history-based tagging (Ratnaparkhi, 1996) to more complex structured prediction tasks such as part-of-speech (POS) tagging (Lafferty et al., 2001), syntactic parsing (Clark and Curran, 2004), and semantic role labeling (Toutanova et al., 2005). Log-linear models have a major advantage over other discriminative machine learning models such as support vector machines: their ...
