An Exact A* Method for Deciphering Letter-Substitution Ciphers

Eric Corlett and Gerald Penn
Department of Computer Science
University of Toronto
ecorlett gpenn @

Abstract

Letter-substitution ciphers encode a document from a known or hypothesized language into an unknown writing system or an unknown encoding of a known writing system. Deciphering them is a problem that arises in a number of practical applications, such as determining the encoding of electronic documents in which the language is known but the encoding standard is not. Decipherment has also been used in relation to OCR applications. In this paper, we introduce an exact method for deciphering messages using a generalization of the Viterbi algorithm. We test this model on a set of ciphers developed from various web sites, and find that our algorithm has the potential to be a viable, practical method for efficiently solving decipherment problems.

1 Introduction

Letter-substitution ciphers encode a document from a known language into an unknown writing system or an unknown encoding of a known writing system. This problem has practical significance in a number of areas, such as in reading electronic documents that may use one of many different standards to encode text.
While this is not a problem in languages like English and Chinese, which have a small set of well-known standard encodings such as ASCII, Big5, and Unicode, there are other languages, such as Hindi, in which there is no dominant encoding standard for the writing system. In these languages, we would like to be able to automatically retrieve and display the information in electronic documents that use unknown encodings when we find them. We also want to use these documents for information retrieval and data mining, in which case it is important to be able to read through them automatically, without resorting to a human annotator. The holy grail in this area would be an application to archaeological decipherment, in which the underlying language s
