Lecture Notes in Computer Science - P16

Lecture Notes in Computer Science - P16: This year, we received about 170 submissions to ICWL 2008. A total of 52 full papers were accepted, representing an acceptance rate of about 30%, plus one invited paper, for inclusion in these LNCS proceedings. The authors of these accepted papers …

Web Contents Extracting for Web-Based Learning

… Training Set CCT2006 in CWIRF¹. It consists of 1200 content pages. Dataset3 consists of 184 content pages collected from SOHU.

Experiment 1

Experiment 1 compares the time needed to build the Block-List with the time needed to build the DOM-Tree. Fig. 4 shows the accumulated time spent building the Block-List and the DOM-Tree for all web pages in Dataset2. The gap between the two accumulated times grows as more pages are processed. By the end of the experiment, building the Block-List has taken about 30 seconds less than building the DOM-Tree. We conclude that building the Block-List takes less time than building the DOM-Tree.

Fig. 4. Comparison of time performance

Experiment 2

Experiment 2 evaluates whether the variance and bending of the block distribution can be used to distinguish content pages from non-content pages. First, we obtain the block distribution of each web page in Dataset1 and Dataset2 and compute its variance and bending. The experiment then uses the Naïve Bayes (NB), KNN and ADTree classifiers provided by Weka² to classify Dataset1. Table 1 shows the classification results, using accuracy as the criterion:

$$\text{Accuracy} = \frac{\text{number of correctly labeled documents}}{\text{number of all documents}}$$

The data in Table 1 show that the best classification is obtained with ADTree, whose accuracy is … .

A further experiment uses Dataset1 as the training set to build classifiers based on NB, ADTree and KNN respectively, and then uses these classifiers to classify Dataset2. Table 1 shows the result of this experiment, in which ADTree misclassifies 81 web pages whose main content is too short.

Experiments … and … also use Dataset1 as the training set to build a classifier. They then classify Dataset2 using variance and bending separately. Table 1 shows that using variance and bending together gives better accuracy than using either feature alone.

¹ http .
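The accuracy criterion used in Experiment 2, and the variance feature it is applied to, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' implementation: the function names `accuracy` and `block_variance`, the label strings, and the idea of representing a page's block distribution as a list of per-block text lengths are all assumptions made for the example. The bending feature is not defined in this excerpt, so it is left out.

```python
from statistics import pvariance


def accuracy(predicted, actual):
    """Return correctly labeled documents divided by all documents."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def block_variance(block_lengths):
    """Population variance of a page's block lengths.

    Assumption: the block distribution is a list of per-block text lengths.
    """
    return pvariance(block_lengths)


# Hypothetical labels for four pages; one page is misclassified.
actual = ["content", "content", "non-content", "content"]
predicted = ["content", "non-content", "non-content", "content"]
print(accuracy(predicted, actual))  # 0.75
```

Under this assumption, a page whose blocks all have similar length yields a variance near zero, while a page with one long block among many short ones yields a large variance.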
