The MPEG-7 SpokenContent description tools provide a standardized representation of ASR output, i.e. of the semantic information (the spoken content) extracted by an ASR system from a spoken signal.

4.4 APPLICATION: SPOKEN DOCUMENT RETRIEVAL

Compared with a classical IR approach, such as the binary approach of Equation 4.12, non-matching terms are taken into account.

In a symmetrical way, the D → Q model considers the IR problem from the point of view of the document. If a matching query term cannot be found for a given document term $t_j$, we look for similar query terms $t_i$, based on the term similarity function $s(t_i, t_j)$. The general formula of the RSV is then:

$$\mathrm{RSV}_{D \to Q}(Q, D) = \sum_{t_j \in D} d(t_j) \cdot \varphi\big(s(t_i, t_j),\, q(t_i)\big)_{t_i \in Q} \tag{4.22}$$

where $\varphi$ is a function which determines the use that is made of the similarities between a given document term $t_j$ and the query terms $t_i$. It is straightforward to apply to the D → Q case the RSV expressions given in Equation 4.19:

$$\mathrm{RSV}^{\mathrm{tot}}_{D \to Q}(Q, D) = \sum_{t_j \in D} \Big[ \sum_{t_i \in Q} s(t_i, t_j)\, q(t_i) \Big]\, d(t_j) \tag{4.23}$$

and Equation 4.21:

$$\mathrm{RSV}^{\max}_{D \to Q}(Q, D) = \sum_{t_j \in D} s(t^*, t_j)\, d(t_j)\, q(t^*) \quad \text{with} \quad t^* = \arg\max_{t_i \in Q} s(t_i, t_j) \tag{4.24}$$

According to the nature of the SDR indexing terms, different forms of term similarity functions can be defined. In the same way that we have made a distinction in Section 4.4.1.3 between word-based and sub-word-based SDR approaches, we will distinguish two forms of term similarities:

Semantic term similarity, when indexing terms are words. In this case, each individual indexing term carries some semantic information.

Acoustic similarity, when indexing terms are sub-word units. In the case of phonetic indexing units, we will talk about phonetic similarity. The indexing terms have no semantic meaning in themselves and essentially carry some acoustic information.

The corresponding similarity functions and the way they can be used for computing retrieval scores will be presented in the next sections.

4.4.3 Word-Based SDR

Word-based SDR is quite similar to text-based IR.
Most word-based SDR systems simply process text transcriptions delivered by an ASR system with text retrieval methods. Thus, we will mainly review approaches initially developed in the framework of text retrieval.

4.4.3.1 LVCSR and .
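
As a supplement to the D → Q retrieval scores of Equations 4.23 and 4.24 above, the following is a minimal Python sketch of how the two variants could be computed. It assumes that a query and a document are each represented as a dictionary mapping terms to weights, and that the term similarity is supplied as a function. The names `rsv_tot_d_to_q`, `rsv_max_d_to_q`, `toy_similarity` and `TOY_TABLE`, as well as all weights and similarity values, are hypothetical and chosen only for illustration; they do not come from the text.

```python
from typing import Callable, Dict

Weights = Dict[str, float]                 # term -> weight, i.e. q(t_i) or d(t_j)
Similarity = Callable[[str, str], float]   # s(t_i, t_j)


def rsv_tot_d_to_q(query: Weights, doc: Weights, sim: Similarity) -> float:
    """Every query term contributes to every document term (cf. Eq. 4.23)."""
    return sum(
        sim(ti, tj) * q_w * d_w
        for tj, d_w in doc.items()
        for ti, q_w in query.items()
    )


def rsv_max_d_to_q(query: Weights, doc: Weights, sim: Similarity) -> float:
    """Only the most similar query term counts per document term (cf. Eq. 4.24)."""
    if not query:
        return 0.0
    score = 0.0
    for tj, d_w in doc.items():
        # t* = argmax over query terms t_i of s(t_i, t_j)
        t_star = max(query, key=lambda ti: sim(ti, tj))
        score += sim(t_star, tj) * d_w * query[t_star]
    return score


# Hypothetical similarity values, invented for this example only.
TOY_TABLE = {("retrieval", "search"): 0.5, ("speech", "search"): 0.2}


def toy_similarity(ti: str, tj: str) -> float:
    """Symmetric toy similarity: 1.0 for identical terms, table lookup otherwise."""
    if ti == tj:
        return 1.0
    return TOY_TABLE.get((ti, tj), TOY_TABLE.get((tj, ti), 0.0))


query = {"speech": 1.0, "retrieval": 1.0}   # q(t_i), illustrative weights
doc = {"speech": 2.0, "search": 1.0}        # d(t_j), illustrative weights

print(rsv_tot_d_to_q(query, doc, toy_similarity))  # about 2.7
print(rsv_max_d_to_q(query, doc, toy_similarity))  # about 2.5
```

In this toy case the document term "search" has no exact match in the query, yet it still contributes to the score through its similarity to "retrieval" (and, in the total variant, also to "speech"). If the similarity function returns 1 for identical terms and 0 otherwise, both functions reduce to a sum of weight products over the terms shared by query and document.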