Journal of NeuroEngineering and Rehabilitation (BioMed Central), Open Access
Methodology
Hypothesis testing for evaluating a multimodal pattern recognition framework applied to speaker detection
Patricia Besson and Murat Kunt
Address: Signal Processing Institute (ITS), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
Email: Patricia Besson - patricia.besson@univmed.fr; Murat Kunt - murat.kunt@epfl.ch
Corresponding author
Published: 27 March 2008
Journal of NeuroEngineering and Rehabilitation 2008, 5:11 doi:10.1186/1743-0003-5-11
Received: 7 February 2007
Accepted: 27 March 2008
This article is available from: http://www.jneuroengrehab.com/content/5/1/11
© 2008 Besson and Kunt; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Speaker detection is an important component of many human-computer interaction applications, such as multimedia indexing or ambient intelligent systems. This work addresses the problem of detecting the current speaker in audio-visual sequences. The detector needs only simple equipment: a single camera and a single microphone suffice.
Method: A multimodal pattern recognition framework is proposed, with solutions provided for each step of the process, namely the feature generation and extraction steps, the classification, and the evaluation of the system performance. The decision is based on an estimate of the synchrony between the audio and the video signals.
Prior to classification, an information-theoretic framework is applied to extract optimized audio features using video information. The classification step is then defined through a hypothesis testing framework in order to obtain confidence levels associated with the classifier outputs, allowing thereby an .
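To illustrate the general idea of attaching a confidence level to a synchrony-based decision, the following is a minimal sketch and not the authors' method. It assumes two hypothetical per-frame features, `audio_energy` and `mouth_motion`, and it swaps in a plain Pearson correlation with a circular-shift resampling test in place of the information-theoretic synchrony estimate described in the abstract. The function names and parameters are illustrative only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no synchrony information
    return cov / math.sqrt(var_x * var_y)


def synchrony_p_value(audio_energy, mouth_motion, n_shifts=200, seed=0):
    """One-sided test of H0: the audio and video features are not synchronous.

    The null distribution is built by circularly shifting the video feature,
    which breaks the temporal alignment between the two signals while
    keeping each signal's own values unchanged.
    """
    rng = random.Random(seed)
    audio = list(audio_energy)
    video = list(mouth_motion)
    n = len(audio)
    observed = pearson(audio, video)
    exceed = 0
    for _ in range(n_shifts):
        shift = rng.randrange(1, n)  # 1..n-1, so never the identity shift
        shifted = video[shift:] + video[:shift]
        if pearson(audio, shifted) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_shifts + 1)
```

Under these assumptions, a candidate whose mouth motion yields a small p-value would be labelled the current speaker at the chosen significance level, while a large p-value means the test gives no grounds to reject the no-synchrony hypothesis.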