A collection of international scientific research reports in chemistry, provided as reference material for chemistry enthusiasts. Topic: Impact of Missing Value Imputation on Classification for DNA Microarray Gene Expression Data: A Model-Based Study

Hindawi Publishing Corporation, EURASIP Journal on Bioinformatics and Systems Biology, Volume 2009, Article ID 504069, 17 pages, doi:10.1155/2009/504069

Research Article

Youting Sun (1), Ulisses Braga-Neto (1), and Edward R. Dougherty (1, 2, 3)

(1) Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
(2) Computational Biology Division, Translational Genomics Research Institution, Phoenix, AZ 85004, USA
(3) Department of Bioinformatics and Computational Biology, University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA

Correspondence should be addressed to Edward R. Dougherty, edward@ece.tamu.edu

Received 18 September 2009; Revised 30 October 2009; Accepted 25 November 2009

Recommended by Yue Wang

Many missing-value (MV) imputation methods have been developed for microarray data, but only a few studies have investigated the relationship between MV imputation and classification accuracy. Furthermore, these studies are problematic in fundamental steps such as MV generation and classifier error estimation. In this work, we carry out a model-based study that addresses some of the issues in previous studies. Six popular imputation algorithms, two feature selection methods, and three classification rules are considered. The results suggest that it is beneficial to apply MV imputation when the noise level is high, variance is small, or gene-cluster correlation is strong, under small to moderate MV rates.
In these cases, if data quality metrics are available, then it may be helpful to consider the data point with poor quality as missing and apply one of the most robust imputation algorithms to estimate the true signal based on the available high-quality data points. However, at large MV rates, we conclude that imputation methods are not recommended. Regarding the MV rate, our results indicate the presence of a peaking phenomenon: performance of .
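
The idea of treating poor-quality data points as missing and then imputing them from the remaining high-quality values can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the six imputation algorithms, so a plain k-nearest-neighbor average is used here as a stand-in, and the function names (`mask_low_quality`, `knn_impute`), the quality threshold, and the toy data are all hypothetical.

```python
import math

def mask_low_quality(expr, quality, threshold):
    """Return a copy of expr where entries with quality below threshold become None."""
    return [
        [v if q >= threshold else None for v, q in zip(row_v, row_q)]
        for row_v, row_q in zip(expr, quality)
    ]

def _distance(a, b):
    """RMS difference over columns observed in both rows; None if no column is shared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(data, k=2):
    """Fill each None with the mean of that column over the k nearest rows (genes).

    Distances use only the originally observed values, so imputed entries
    never influence later imputations. Entries with no usable neighbor stay None.
    """
    imputed = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(data):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Toy example (hypothetical values): rows are genes, columns are samples.
expr = [[1.0, 2.0, 3.0], [1.1, 2.1, 9.9], [5.0, 6.0, 7.0], [0.9, 1.9, 2.9]]
quality = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.1], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
masked = mask_low_quality(expr, quality, threshold=0.5)
filled = knn_impute(masked, k=2)
print(filled[1])
```

In the toy data, the suspicious reading in the second gene is flagged by its low quality score, masked, and then replaced by the average of the two most similar genes. Whether this helps in practice depends on the conditions the abstract lists (noise level, variance, gene-cluster correlation, and MV rate).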