Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 96

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. A great deal of hidden knowledge is waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

48 A Review of Web Document Clustering Approaches

Nora Oikonomakou (1) and Michalis Vazirgiannis (2)

(1) Department of Informatics, Athens University of Economics and Business (AUEB), Patision 76, 10434, Greece, oikonomn@
(2) Department of Informatics, Athens University of Economics and Business (AUEB), Patision 76, 10434, Greece, mvazirg@

Summary. Nowadays, the Internet has become the largest data repository, and it faces the problem of information overload. The web search environment, however, is not ideal. The abundance of information, in combination with the dynamic and heterogeneous nature of the Web, makes information retrieval a difficult process for the average user. The development of techniques that can help users effectively organize and browse the available information, with the ultimate goal of satisfying their information need, is therefore a valid requirement. Cluster analysis, which deals with the organization of a collection of objects into cohesive groups, can play a very important role in achieving this objective. In this chapter we present an exhaustive survey of the web document clustering approaches available in the literature, classified into three main categories: text-based, link-based and hybrid. Furthermore, we present a thorough comparison of the algorithms based on the various facets of their features and functionality. Finally, based on the review of the different approaches, we conclude that although clustering has been a topic for the scientific community for three decades, there are still many open issues that call for more research.

Key words: Clustering, World Wide Web, Web-Mining, Text-Mining

Introduction

Nowadays, the Internet has become the largest data repository, and it faces the problem of information overload. At the same time, more and more people use the World Wide Web as their main source of information. The abundance of information, in combination with the dynamic and heterogeneous nature of the Web, makes information ...
