Turkish Journal of Biology
http://journals.tubitak.gov.tr/biology/
Research Article
Turk J Biol (2018) 42: 471-476
© TÜBİTAK
doi:10.3906/biy-1805-42

Evaluation of genome scaffolding tools using pooled clone sequencing

Elif DAL, Can ALKAN*
Department of Computer Engineering, Faculty of Engineering, Bilkent University, Ankara, Turkey

Received: 11.05.2018
Accepted/Published Online: 02.11.2018
Final Version: 10.12.2018

Abstract: DNA sequencing technologies hold great promise in generating information that will guide scientists to understand how the genome affects human health and organismal evolution. The process of generating raw genome sequence data becomes cheaper and faster, but more error-prone. Assembly of such data into high-quality finished genome sequences remains challenging. Many genome assembly tools are available, but they differ in terms of their performance and their final output. More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. Here we evaluate the accuracies of several genome scaffolding algorithms using two different types of data generated from the genome of the same human individual: whole genome shotgun sequencing (WGS) and pooled clone sequencing (PCS). We observe that it is possible to obtain better assemblies if PCS data are used, compared to using only WGS data. However, the current scaffolding algorithms are developed only for WGS, and PCS-aware scaffolding algorithms remain an open problem.

Key words: Genome assembly and scaffolding, high-throughput sequencing, pooled clone sequencing, systems biology

1. Introduction

Completion of the Human Genome Project (HGP) was one of the greatest achievements in all life sciences research (International Human Genome Sequencing Consortium, 2004). The HGP was started in 1990, and thanks to the innovations in automated genome sequencing technologies, the human genome was completed in 2004. Today, >97% of the human genome is finished and .