Scientific report: "Grounded Language Modeling for Automatic Speech Recognition of Sports Video"

Michael Fleischman
Massachusetts Institute of Technology, Media Laboratory
mbf@

Deb Roy
Massachusetts Institute of Technology, Media Laboratory
dkroy@

Abstract

Grounded language models represent the relationship between words and the non-linguistic context in which they are said. This paper describes how they are learned from large corpora of unlabeled video and applied to the task of automatic speech recognition of sports video. Results show that grounded language models improve perplexity and word error rate over text-based language models and, further, support video information retrieval better than human-generated speech transcriptions.

1 Introduction

Recognizing speech in broadcast video is a necessary precursor to many multimodal applications such as video search and summarization (Snoek and Worring, 2005). Although performance is often reasonable in controlled environments (such as studio newsrooms), automatic speech recognition (ASR) systems have significant difficulty in noisier settings, such as those found in live sports broadcasts (Wactlar et al., 1996). While many researchers have examined how to compensate for such noise using acoustic techniques, few have attempted to leverage information in the visual stream to improve speech recognition performance (for an exception, see Murkherjee and Roy, 2003). In many types of video, however, visual context can provide valuable clues as to what has been said. For example, in video of Major League Baseball games, the likelihood of the phrase "home run" increases dramatically when a home run has actually been hit.

This paper describes a method for incorporating such visual information in an ASR system for sports video. The method is based on the use of grounded language models to represent the relationship between words and the non-linguistic context to which they refer (Fleischman and Roy, 2007). Grounded language models are based on ...

