A collection of international scientific research reports in chemistry, offered as reference material for chemistry enthusiasts. Topic: A Digital Signal Processing Method for Gene Prediction with Improved Noise Suppression

EURASIP Journal on Applied Signal Processing 2004:1, 108-114
2004 Hindawi Publishing Corporation

Trevor W. Fox
Research and Development Department, Intelligent Engines Corporation, 903 42 St. SW, Calgary, Alberta, Canada T3C-1Y9
Email: tfox@bm.net

Alex Carreira
Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive N.W., Calgary, Alberta, Canada T2N1N4
Email: aycarrei@shaw.ca

Received 1 March 2003; Revised 15 September 2003

It has been observed that the protein-coding regions of DNA sequences exhibit period-three behaviour, which can be exploited to predict the location of coding regions within genes. Previously, discrete Fourier transform (DFT) and digital filter-based methods have been used for the identification of coding regions. However, these methods do not significantly suppress the noncoding regions in the DNA spectrum at 2π/3. Consequently, a noncoding region may inadvertently be identified as a coding region. This paper introduces a new technique (a single digital filter operation followed by a quadratic window operation) that suppresses nearly all of the noncoding regions. The proposed method therefore improves the likelihood of correctly identifying coding regions in such genes.

Keywords and phrases: gene prediction, digital filter, DNA.

1. INTRODUCTION

Finding coding regions (exons) in a DNA strand involves searching amongst the many nucleotides that comprise the strand. Typically, a DNA molecule contains millions to hundreds of millions of elements [1]. The problem of finding exons in a DNA sequence is well suited to computers because DNA sequences can be represented by data that is easily processed by a computer.
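To make the period-three idea concrete, the following is a minimal Python sketch of the earlier DFT-style baseline mentioned above, not the filter-plus-quadratic-window method this paper proposes. It assumes the common binary indicator-sequence representation (one 0/1 sequence per nucleotide) and measures spectral power at the frequency 2π/3 over a sliding window. The function names, the window length, and the step size are illustrative assumptions and do not come from the paper.

```python
import cmath


def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences (assumed representation)."""
    dna = dna.upper()
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ATCG"}


def period3_power(dna):
    """Sum over A, T, C, G of the squared DFT magnitude at frequency 2*pi/3."""
    total = 0.0
    for seq in indicator_sequences(dna).values():
        # Single DFT coefficient evaluated at angular frequency 2*pi/3.
        coeff = sum(x * cmath.exp(-2j * cmath.pi * n / 3) for n, x in enumerate(seq))
        total += abs(coeff) ** 2
    return total


def sliding_period3_power(dna, window=351, step=3):
    """Period-three power for each window position (window and step are hypothetical defaults)."""
    return [
        period3_power(dna[start:start + window])
        for start in range(0, len(dna) - window + 1, step)
    ]
```

In a baseline of this kind, window positions with high power are candidate coding regions. The abstract's point is that noncoding regions can still leave noticeable power at 2π/3 with such methods, which is the weakness the proposed technique aims to address.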
DNA strands can be represented by sequences of letters from a four-character alphabet. Convention dictates the use of the letters A, T, C, and G in each element to represent each of the four distinct nucleotides [1]. A nucleotide has two distinct