
SNP Discovery

BCM-HGSC Mutation Discovery Pipeline

Single Nucleotide Mutations

Sequencing reads are compared with their respective amplicon reference sequences using a modified version of SNPdetector, which employs a relaxed heterozygote (Het) peak ratio threshold to compensate for possible heterogeneity of the tumor tissue sample. We use PolyPhred 6.0b as a backup discovery method to capture any high-scoring variation that SNPdetector misses, which happens very rarely. Special mention should be made of our collaboration with Dr. Jinghui Zhang in the laboratory of Dr. Ken Buetow on the development and calibration of SNPdetector. This software has consistently outperformed other routines for the direct discovery of heterozygotes.
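The effect of relaxing a peak ratio threshold can be illustrated with a minimal sketch. This is not SNPdetector's algorithm; the function name, the peak-height inputs, and both threshold values are hypothetical and chosen only to show why a lower ratio helps when mutant cells are a minority of the sample.

```python
# Illustrative only: not the SNPdetector implementation.
# Both threshold values below are hypothetical.
STANDARD_RATIO = 0.30  # assumed threshold for a germline heterozygote
RELAXED_RATIO = 0.15   # assumed relaxed threshold for heterogeneous tumor tissue


def is_heterozygote(primary_peak, secondary_peak, min_ratio):
    """Call a heterozygote when the secondary trace peak is at least
    min_ratio times the height of the primary peak."""
    if primary_peak <= 0:
        raise ValueError("primary_peak must be positive")
    return secondary_peak / primary_peak >= min_ratio


# A variant allele diluted by normal tissue gives a weak secondary peak.
primary, secondary = 1000, 200
print(is_heterozygote(primary, secondary, STANDARD_RATIO))  # missed
print(is_heterozygote(primary, secondary, RELAXED_RATIO))   # detected
```

A lower threshold admits more noise as well as more true variants, which is consistent with the trace-level inspection and experimental validation described later on this page.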

Putative polymorphisms accumulated from this analysis are annotated with the following information:

  1. Chromosome and global coordinates

  2. Coincidence with known variation (dbSNP current build, as well as local databases of newly identified variants)

  3. Functional information:

    1. gene compartment (intron, exon, splice junction)

    2. non-synonymous amino acid change if any

    3. position of non-synonymous amino acid in protein

    4. BLOSUM62 score of variant amino acid compared to reference
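The functional annotation items above can be sketched for a single coding substitution. This is a minimal illustration, not the pipeline's code: the function name, the 0-based position convention, and the returned field names are assumptions, the input is assumed to be an uppercase in-frame coding sequence, and the BLOSUM62 lookup is left out because it needs the full substitution matrix.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def annotate_coding_change(cds, pos, alt):
    """Annotate a single-base substitution in a coding sequence.

    cds: uppercase coding sequence starting at the first codon (assumed)
    pos: 0-based position of the substituted base within cds (assumed)
    alt: the variant base
    """
    codon_index = pos // 3
    start = codon_index * 3
    offset = pos - start
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    return {
        "protein_position": codon_index + 1,  # 1-based residue number
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
        "non_synonymous": ref_aa != alt_aa,
    }


# Codon 2 of ATG GCC GAA is GCC (Ala); changing its middle base to T gives GTC (Val).
print(annotate_coding_change("ATGGCCGAA", 4, "T"))
```

Keeping the amino acid position and the reference/variant residues in one record makes it straightforward to attach a substitution score in a later step.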

Novel SNPs with recognizable functional potential (e.g., non-synonymous SNPs or splice-junction variants) are further evaluated. First, they are visually inspected at the trace level, and those that are not clearly noise are passed on to experimental validation, currently by pyrosequencing. For patients whose mutations pass pyrosequencing validation, we plan to resequence the matched normal tissue with Sanger reads.

All putative genotypes of each individual at each mutation position, along with their annotation and validation status, will be stored in local databases. Reports are formatted for submission to common data repositories according to jointly established protocols.

Structural Variation Discovery

We are in the process of evaluating PolyPhred 6.0b and a new module for SNPdetector designed for detecting intra-exonic indels in biallelic resequencing traces. One or both of these will be used for indel discovery and characterization. Genotype frequencies of constitutional variants (i.e., known SNPs) will be tracked, since departures from Hardy-Weinberg equilibrium might reveal commonly deleted genes or gene segments (LOH).
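One common way to quantify a departure from Hardy-Weinberg equilibrium is a chi-square goodness-of-fit test on genotype counts. The sketch below assumes that approach; the page does not say which test the pipeline uses, and the function name and return values are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes at a biallelic site and
    returns (chi_square, heterozygote_deficit), where the deficit is the
    expected minus the observed number of heterozygotes.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi_square = sum(
        (o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0
    )
    return chi_square, expected[1] - n_ab


# A known SNP with no heterozygotes among 100 tumor samples.
chi_square, deficit = hwe_chi_square(50, 0, 50)
print(chi_square, deficit)
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05. A large statistic combined with a positive heterozygote deficit is the pattern that loss of heterozygosity would be expected to produce, although genotyping error and population structure can cause it too.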

Quality Control

Sequencing coverage is a critical factor in variation discovery. We track coverage using the SNPdetector program rather than a single base quality measure. A base is judged to be covered in a given read if SNPdetector is able to make a call at that position, regardless of its Phred quality score (although Phred quality score and SNPdetector coverage are highly correlated).
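The coverage rule can be sketched as follows. The data layout is an assumption made for illustration: each read is a dictionary mapping an amplicon position to a (call, Phred score) pair, with the call set to None where no call could be made. This does not reflect SNPdetector's actual output format.

```python
def count_coverage(reads, amplicon_length):
    """Count, for each amplicon position, the reads in which a call was made.

    The Phred score is carried along but deliberately ignored: a position
    counts as covered whenever a call exists.
    """
    coverage = [0] * amplicon_length
    for read in reads:
        for position, (call, _phred) in read.items():
            if call is not None:
                coverage[position] += 1
    return coverage


reads = [
    {0: ("A", 40), 1: ("C", 8), 2: (None, 5)},  # low-quality call at 1 still counts
    {1: ("C", 35), 2: ("G", 12)},
]
print(count_coverage(reads, 3))
```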


