Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 39

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

360 Steve Donoho

... is similar to paper citations in academia. A paper that is cited often is considered to contain important ideas. A paper that is seldom or never cited is considered to be less important. The following paragraphs present two algorithms for incorporating link information into search engines: PageRank (Page et al., 1998) and Kleinberg's Hubs and Authorities (Kleinberg, 1999).

The PageRank algorithm takes a set of interconnected pages and calculates a score for each. Intuitively, the score for a page is based on how many other pages point to that page and what their scores are. A page that is pointed to by a few other important pages is probably itself important. Similarly, a page that is pointed to by numerous other marginally important pages is probably itself important. But a page that is not pointed to by anything probably isn't important.

A more formal definition, taken from (Page et al., 1998), is: Let $u$ be a web page. Let $F_u$ be the set of pages $u$ points to and $B_u$ be the set of pages that point to $u$. Let $N_u = |F_u|$ be the number of links from $u$, and let $E(u)$ be an a priori score assigned to $u$. Then $R(u)$, the score for $u$, is calculated as

$$R(u) = \sum_{v \in B_u} \frac{R(v)}{N_v} + E(u)$$

So the score for a page is some constant plus the sum of the scores of its incoming links.
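The score definition above can be tried out with a short Python sketch. This is only an illustration under stated assumptions, not the implementation from Page et al. or from any search engine: the function name `pagerank`, the four-page graph, and the uniform a priori score of 0.05 are all hypothetical. The sketch also rescales the scores to sum to 1 after every round so that they stay bounded, a step that the formula as quoted here does not spell out.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             prior: Dict[str, float],
             iterations: int = 100) -> Dict[str, float]:
    """Repeatedly apply R(u) = sum over v in B_u of R(v)/N_v, plus E(u).

    `links` maps each page to the pages it points to; every page,
    including every link target, must appear as a key.
    `prior` holds the a priori score E(u) for each page.
    """
    pages = list(links)
    # Start from a uniform score for every page.
    scores = {u: 1.0 / len(pages) for u in pages}
    for _ in range(iterations):
        # Each page begins the round with its a priori score E(u).
        new_scores = {u: prior[u] for u in pages}
        for v, targets in links.items():
            if not targets:
                continue  # a page with no outgoing links passes nothing on
            share = scores[v] / len(targets)  # R(v) / N_v
            for u in targets:
                new_scores[u] += share
        # Assumption of this sketch: rescale so the scores sum to 1.
        total = sum(new_scores.values())
        scores = {u: s / total for u, s in new_scores.items()}
    return scores


# Hypothetical four-page web: "c" has three incoming links, "d" has none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
prior = {page: 0.05 for page in links}
scores = pagerank(links, prior)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy graph the page with the most incoming links ranks first, and the page that nothing points to ranks last, keeping only what its a priori score provides.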
Each incoming link carries the score of the page it comes from divided by the number of outgoing links from that page (so a page's score is divided evenly among its outgoing links). The constant $E(u)$ serves a couple of functions. First, it counterbalances the effect of sinks in the network. These are pages or groups of pages that are dead ends: they are pointed to, but they don't point out to any other pages. $E(u)$ provides a source of score that counterbalances the sinks in the network. Second, it provides a method of introducing a priori scores if certain pages are known to be authoritative.

The PageRank algorithm can be combined with other techniques to create a search engine. For example, PageRank is first used to assign a score to
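The role of E(u) in offsetting sinks, described above, can be illustrated with a small Python sketch. The three-page graph, the helper names `propagate` and `total_after`, and the a priori value 0.05 are hypothetical choices made for illustration. The scores are deliberately not rescaled here, so that the score absorbed by the sink stays visible in the totals.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]
Scores = Dict[str, float]


def propagate(links: Graph, scores: Scores, prior: Scores) -> Scores:
    """Apply R(u) = sum over v in B_u of R(v)/N_v, plus E(u), once."""
    new_scores = dict(prior)  # every page starts from E(u)
    for v, targets in links.items():
        for u in targets:
            new_scores[u] += scores[v] / len(targets)
    return new_scores


def total_after(links: Graph, start: Scores, prior: Scores,
                steps: int = 20) -> float:
    """Total score left in the network after `steps` updates."""
    scores = dict(start)
    for _ in range(steps):
        scores = propagate(links, scores, prior)
    return sum(scores.values())


# Hypothetical graph: "sink" is pointed to but links to nothing.
links = {"a": ["b", "sink"], "b": ["a"], "sink": []}
start = {page: 1.0 / 3.0 for page in links}

# Without a priori scores, whatever reaches the sink is lost each round.
drained = total_after(links, start, {page: 0.0 for page in links})
# With E(u) = 0.05 per page, the lost score is replenished each round.
replenished = total_after(links, start, {page: 0.05 for page in links})

print("total score without E(u):", drained)
print("total score with E(u):   ", replenished)
```

With all a priori scores set to zero the total shrinks toward zero, because score that flows into the sink is never passed on. With a positive E(u) the total settles at a steady positive value instead.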
