Scientific report: "Harnessing NLP Techniques in the Processes of Multilingual Content Management"

Anelia Belogay, Tetracom IS Ltd., anelia@
Svetla Koeva, Institute for Bulgarian Language, svetla@
Adam Przepiórkowski, Instytut Podstaw Informatyki Polskiej Akademii Nauk, adamp@
Dan Cristea, Universitatea Alexandru Ioan Cuza, dcristea@
Diman Karagyozov, Tetracom IS Ltd., diman@
Cristina Vertan, Universitaet Hamburg
Polivios Raxis, Atlantis Consulting SA, raxis@

Abstract

The emergence of the WWW as the main source of distributing content opened the floodgates of information. The sheer volume and diversity of this content necessitate an approach that will reinvent the way it is analysed. The quantitative route to processing information, which relies on content management tools, provides structural analysis. The challenge we address is to evolve from the process of streamlining data to a level of understanding that assigns value to content. We present ATLAS, an open-source multilingual platform that incorporates human language technologies in the process of multilingual web content management. It complements i-Publisher, a content management software-as-a-service component used for creating, running and managing dynamic content-driven websites, with a linguistic platform. The platform enriches the content of these websites with revealing details and reduces the manual work of classification editors by automatically categorising content. The ATLAS platform supports six European languages. We expect ATLAS to serve as a basis for future development of deep analysis tools capable of generating abstractive summaries and training models for decision-making systems.

Introduction

The advent of the Web revolutionized the way in which content is manipulated and delivered. As a result, digital content in various languages has become widely available on the Internet and its sheer volume and language diversity have
