Improving MapReduce Performance in Heterogeneous Environments

We extracted Mel-Frequency Cepstral Coefficient (MFCC) features for this task. MFCCs are short-term spectral features that have been widely used in speech recognition [13] and audio event classification. We extracted 12 MFCC coefficients from the original audio signal using a 40 ms sliding window at fixed intervals of 20 ms. The number of training and testing frames for the different methods is shown in Table 1. Note that our approach needs no unusual-event training data, and the unsupervised HMM needs no training data at all. The percentage of frames for unusual events in the test sequence is around ...

Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz, Ion Stoica
University of California, Berkeley
matei, andyk, adj, randy, stoica @

Abstract

MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce that enjoys wide adoption and is often used for short jobs where low response time is critical. Hadoop's performance is closely tied to its task scheduler, which implicitly assumes that cluster nodes are homogeneous and that tasks make progress linearly, and uses these assumptions to decide when to speculatively re-execute tasks that appear to be stragglers. In practice, the homogeneity assumptions do not always hold. An especially compelling setting where this occurs is a virtualized data center, such as Amazon's Elastic Compute Cloud (EC2). We show that Hadoop's scheduler can cause severe performance degradation in heterogeneous environments. We design a new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity. LATE can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.
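The idea behind the name "Longest Approximate Time to End" can be illustrated with a small Python sketch: among the running tasks, speculate on the one whose estimated remaining time is largest. The abstract does not give the estimation formula, so the one used here (remaining work divided by the average progress rate observed so far) is an assumption made for illustration, and the names `RunningTask`, `estimated_time_left`, and `pick_speculation_candidate` are hypothetical rather than part of Hadoop.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RunningTask:
    """Hypothetical record of a running task (not a Hadoop type)."""
    task_id: str
    progress: float   # fraction of work completed, in [0, 1]
    elapsed_s: float  # seconds since the task started


def estimated_time_left(task: RunningTask) -> float:
    """Estimate seconds remaining, assuming the task keeps its average rate.

    Assumption for illustration: rate = progress / elapsed time, and
    time left = (1 - progress) / rate.
    """
    rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(tasks: Iterable[RunningTask]) -> Optional[RunningTask]:
    """Return the unfinished task with the longest estimated time to end."""
    # Skip tasks with no measurable progress yet: their rate is undefined.
    candidates = [
        t for t in tasks
        if 0.0 < t.progress < 1.0 and t.elapsed_s > 0.0
    ]
    if not candidates:
        return None
    return max(candidates, key=estimated_time_left)


tasks = [
    RunningTask("A", progress=0.9, elapsed_s=90.0),   # about 10 s left
    RunningTask("B", progress=0.2, elapsed_s=100.0),  # about 400 s left
    RunningTask("C", progress=0.5, elapsed_s=20.0),   # about 20 s left
]
chosen = pick_speculation_candidate(tasks)
```

In this example task B is chosen, even though task A has been running almost as long, because B's low progress rate gives it the longest estimated time to finish. Ranking by estimated time left rather than by raw progress is what lets a heuristic of this kind tolerate nodes that run at different speeds.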
1 Introduction

Today's most popular computer applications are Internet services with millions of users. The sheer volume of data that these services work with has led to interest in parallel processing on commodity clusters. The leading example is Google, which uses its MapReduce framework to process 20 petabytes of data per day [1]. Other Internet services, such as e-commerce websites and social networks, also cope with enormous volumes of data. These services generate clickstream data from millions of users every day, which is a potential gold mine for understanding access patterns and increasing ad revenue. Furthermore, for each user action, a web application generates one or two orders of magnitude more ...
