Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion.

TitleCopy number variation detection in whole-genome sequencing data using the Bayesian information criterion.
Publication TypeJournal Article
Year of Publication2011
AuthorsXi, R, Hadjipanayis, AG, Luquette, LJ, Kim, T-M, Lee, E, Zhang, J, Johnson, MD, Muzny, DM, Wheeler, DA, Gibbs, RA, Kucherlapati, R, Park, PJ
JournalProc Natl Acad Sci U S A
Volume108
Issue46
PaginationE1128-36
Date Published2011 Nov 15
ISSN1091-6490
KeywordsAlgorithms, Bayes Theorem, Brain Neoplasms, Comparative Genomic Hybridization, Computer Simulation, DNA Copy Number Variations, Female, Gene Dosage, Genome, Genome, Human, Glioblastoma, Humans, Models, Genetic, Models, Statistical, Sequence Analysis, DNA
Abstract

DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but the next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs via minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10× coverage, whereas we could only detect large CNVs (> 15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples as well as deriving error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.

DOI10.1073/pnas.1110574108
Alternate JournalProc Natl Acad Sci U S A
PubMed ID22065754
PubMed Central IDPMC3219132
Grant ListR01 GM082798 / GM / NIGMS NIH HHS / United States
RC1 HG005482 / HG / NHGRI NIH HHS / United States
U24 CA144025 / CA / NCI NIH HHS / United States

Similar Publications

Chen F, Zhang Y, Chandrashekar DS, Varambally S, Creighton CJ. Global impact of somatic structural variation on the cancer proteome. Nat Commun. 2023;14(1):5637.
Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, et al.. The complete sequence of a human Y chromosome. Nature. 2023;621(7978):344-354.
Saengboonmee C, Sorin S, Sangkhamanon S, Chomphoo S, Indramanee S, Seubwai W, et al.. γ-aminobutyric acid B2 receptor: A potential therapeutic target for cholangiocarcinoma in patients with diabetes mellitus. World J Gastroenterol. 2023;29(28):4416-4432.
Wojcik MH, Reuter CM, Marwaha S, Mahmoud M, Duyzend MH, Barseghyan H, et al.. Beyond the exome: What's next in diagnostic testing for Mendelian conditions. Am J Hum Genet. 2023;110(8):1229-1248.
Chin C-S, Behera S, Khalak A, Sedlazeck FJ, Sudmant PH, Wagner J, et al.. Multiscale analysis of pangenomes enables improved representation of genomic diversity for repetitive and clinically relevant genes. Nat Methods. 2023;20(8):1213-1221.
Calame DG, Guo T, Wang C, Garrett L, Jolly A, Dawood M, et al.. Monoallelic variation in DHX9, the gene encoding the DExH-box helicase DHX9, underlies neurodevelopment disorders and Charcot-Marie-Tooth disease. Am J Hum Genet. 2023;110(8):1394-1413.
Walker KA, Chen J, Shi L, Yang Y, Fornage M, Zhou L, et al.. Proteomics analysis of plasma from middle-aged adults identifies protein markers of dementia risk in later life. Sci Transl Med. 2023;15(705):eadf5681.
Zhao N, Teles F, Lu J, Koestler DC, Beck J, Boerwinkle E, et al.. Epigenome-wide association study using peripheral blood leukocytes identifies genomic regions associated with periodontal disease and edentulism in the Atherosclerosis Risk in Communities study. J Clin Periodontol. 2023;50(9):1140-1153.
Harris RA, McAllister JM, Strauss JF. Single-Cell RNA-Seq Identifies Pathways and Genes Contributing to the Hyperandrogenemia Associated with Polycystic Ovary Syndrome. Int J Mol Sci. 2023;24(13).
Qian X, Srinivasan T, He J, Chen R. The Role of Ceramide in Inherited Retinal Disease Pathology. Adv Exp Med Biol. 2023;1415:303-307.