Survey on Structure and Structure-content Classification of XML Document

ISSN: 2249-5789. Thasleena N T, International Journal of Computer Science & Communication Networks, Vol 4(1), 22-26

Thasleena N T, Department of Computer Science, Rajagiri School of Engineering & Technology, Kochi, thalu555@
Varghese S C, Department of Computer Science, Rajagiri School of Engineering & Technology, Kochi, varghesesc@

Abstract: In recent years, XML has become a popular way of storing many data sets because of its semi-structured nature. It allows a wide variety of databases to be modeled as XML documents. XML data thus form a significant part of the data mining domain, and it is valuable to develop classification methods for such data. Due to the increase in XML documents, researchers are now focusing on applying typical text mining tasks such as text classification, text clustering and other related tasks to XML corpora. In this work, our objective is to give a survey on the classification of XML documents. As XML documents are basically text documents containing both content and structure information, they can be classified based on i) structure only and ii) a combination of both structure and content. This paper gives a brief survey based on this classification.

Keywords: XML, classification, clustering, ontology, feature extraction, data mining, frequent pattern, XSLT, WordNet.

I. INTRODUCTION

XML is a popular format for data representation that allows textual content to be organized into logical structures. Traditional information retrieval systems deal only with flat documents, whereas XML retrieval systems must also take into account the structure of documents along with their textual content. Every XML document includes both logical and physical structures. Based on these two kinds of information, XML documents can be classified using two approaches. One approach uses only the structural information of XML data in classification. The other performs the classification by .
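To make the distinction between structure-only and structure-plus-content features concrete, the following is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is not a method taken from the surveyed papers: the function names (structure_features, content_features, combined_features), the use of root-to-node tag paths as structural features, and the bag-of-words content features are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def structure_features(xml_text):
    """Structure only: count each root-to-node tag path (illustrative choice)."""
    root = ET.fromstring(xml_text)
    counts = Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(root, "")
    return counts


def content_features(xml_text):
    """Content only: lowercase bag of words over all text nodes."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for text in root.itertext():
        counts.update(text.lower().split())
    return counts


def combined_features(xml_text):
    """Structure and content: merge both feature sets under distinct prefixes."""
    feats = Counter()
    for path, n in structure_features(xml_text).items():
        feats["struct:" + path] = n
    for word, n in content_features(xml_text).items():
        feats["word:" + word] = n
    return feats


# Hypothetical sample document, used only for illustration.
doc = (
    "<article><title>XML mining</title>"
    "<body><p>XML data</p><p>mining data</p></body></article>"
)
structure = structure_features(doc)
content = content_features(doc)
combined = combined_features(doc)
```

Either feature dictionary could then be passed to any standard classifier. A structure-only classifier would see just the tag-path counts, while a structure-content classifier would see the merged vector.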
