SNP Discovery

BCM-HGSC Mutation Discovery Pipeline

Single Nucleotide Mutations

Sequencing reads are compared with their respective amplicon reference sequences using a modified version of SNPdetector, which employs a relaxed heterozygote (Het) peak ratio threshold to compensate for possible heterogeneity of the tumor tissue sample. We use PolyPhred 6.0b as a backup discovery method to capture any high-scoring variation that SNPdetector misses, which happens very rarely. Special mention should be made of our collaboration with Dr. Jinghui Zhang, in the laboratory of Dr. Ken Buetow, on the development and calibration of SNPdetector. This software has consistently outperformed other routines for the direct discovery of heterozygotes.
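To illustrate why a relaxed peak ratio threshold matters for tumor samples, the following is a minimal sketch and not SNPdetector's actual algorithm. It assumes an idealized trace in which peak height is proportional to allele fraction; the function names and both threshold values are hypothetical, chosen only for illustration.

```python
# Hypothetical thresholds for illustration only; not taken from SNPdetector.
STRICT_MIN_RATIO = 0.5   # assumed cutoff suited to germline heterozygotes
RELAXED_MIN_RATIO = 0.2  # assumed relaxed cutoff for heterogeneous tumor tissue


def expected_peak_ratio(tumor_fraction: float) -> float:
    """Idealized secondary/primary peak ratio for a heterozygous somatic
    mutation when only `tumor_fraction` of the sampled cells are tumor."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    mutant = tumor_fraction / 2.0   # one mutant allele per tumor cell
    reference = 1.0 - mutant        # all remaining alleles are reference
    return min(mutant, reference) / max(mutant, reference)


def is_het_candidate(primary_peak: float, secondary_peak: float,
                     min_ratio: float) -> bool:
    """Flag a trace position as a heterozygote candidate when the secondary
    peak reaches at least `min_ratio` of the primary peak."""
    if primary_peak <= 0:
        return False
    return secondary_peak / primary_peak >= min_ratio


if __name__ == "__main__":
    ratio = expected_peak_ratio(0.4)  # sample with 40% tumor cells
    print(f"expected peak ratio: {ratio:.2f}")
    print("strict call: ", is_het_candidate(800, 200, STRICT_MIN_RATIO))
    print("relaxed call:", is_het_candidate(800, 200, RELAXED_MIN_RATIO))
```

Under these assumptions, a sample with 40% tumor cells yields a peak ratio of 0.25, which the stricter cutoff would reject and the relaxed cutoff would keep for downstream review.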

Putative polymorphisms accumulated from this analysis are annotated with the following information:

  1. Chromosome and global coordinates

  2. Coincidence with known variation (current dbSNP build, as well as local databases of newly identified variants)

  3. Functional information:

    1. gene compartment (intron, exon, splice junction)

    2. non-synonymous amino acid change if any

    3. position of non-synonymous amino acid in protein

    4. BLOSUM62 score of variant amino acid compared to reference
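The functional annotations listed above can be sketched as follows. This is an illustrative sketch rather than the pipeline's own code: the function names, the inclusive exon coordinate convention, and the two-base splice window are all assumptions. The BLOSUM62 lookup is omitted because it requires the full substitution matrix.

```python
from typing import List, Optional, Tuple

# Standard genetic code, built in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def classify_compartment(pos: int, exons: List[Tuple[int, int]],
                         splice_window: int = 2) -> str:
    """Return 'exon', 'splice junction', or 'intron' for a position.

    `exons` holds inclusive (start, end) coordinates. `splice_window` is an
    assumed number of intronic bases next to an exon that count as junction.
    """
    exons = sorted(exons)
    for start, end in exons:
        if start <= pos <= end:
            return "exon"
    # Only gaps between consecutive exons are introns.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        if left_end < pos < right_start:
            near_donor = pos - left_end <= splice_window
            near_acceptor = right_start - pos <= splice_window
            return "splice junction" if near_donor or near_acceptor else "intron"
    raise ValueError("position lies outside the exon span")


def amino_acid_change(cds: str, cds_pos: int,
                      alt_base: str) -> Optional[Tuple[str, int, str]]:
    """Return (reference amino acid, 1-based protein position, variant amino
    acid) for a non-synonymous substitution, or None if it is synonymous.

    `cds_pos` is a 0-based offset into the coding sequence `cds`.
    """
    codon_start = (cds_pos // 3) * 3
    ref_codon = cds[codon_start:codon_start + 3].upper()
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete codon")
    offset = cds_pos - codon_start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return None
    return ref_aa, codon_start // 3 + 1, alt_aa


if __name__ == "__main__":
    exons = [(100, 200), (300, 400)]
    print(classify_compartment(201, exons))     # first intronic base
    print(amino_acid_change("ATGGCCAAA", 4, "T"))  # GCC -> GTC
```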

Novel SNPs with recognizable functional potential (e.g., non-synonymous SNPs or splice-junction variants) are further evaluated. First, they are visually inspected at the trace level, and those that are not clearly noise are passed on to experimental validation, currently by pyrosequencing. For patients with mutations that pass pyrosequencing validation, we plan to resequence the matched normal tissue with Sanger reads.

All putative genotypes of each individual at each mutation position, along with annotation and validation status, will be stored in local databases. Reports are formatted for submission to common data repositories according to jointly established protocols.

Structural Variation Discovery

We are in the process of evaluating PolyPhred 6.0b and a new module for SNPdetector designed for detecting intra-exonic indels in biallelic resequencing traces. One or both of these will be used for indel discovery and characterization. Genotype frequencies of constitutional variants (i.e., known SNPs) will be tracked, since departures from Hardy-Weinberg equilibrium might reveal commonly deleted genes or gene segments (loss of heterozygosity, LOH).
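One common way to quantify a departure from Hardy-Weinberg equilibrium at a known SNP is a one-degree-of-freedom chi-square test on the genotype counts. The sketch below shows that test as an illustration; the pipeline's actual statistic is not specified here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 degree of freedom) for departure from
    Hardy-Weinberg equilibrium, given counts of the three genotypes.

    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # A complete absence of heterozygotes, as a deleted segment could
    # produce in tumor tissue, gives a large statistic.
    print(hwe_chi_square(50, 0, 50))
    # Counts at the expected 1:2:1 proportions give a statistic of zero.
    print(hwe_chi_square(25, 50, 25))
```

A deficit of heterozygotes relative to expectation is the pattern of interest for LOH, so in practice the direction of the departure would be checked alongside the p-value.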

Quality Control

Sequencing coverage is a critical factor in variation discovery. We track coverage using the SNPdetector program rather than a single base quality measure. A base is judged to be covered in a given read if SNPdetector is able to make a call at that position, regardless of its Phred quality score (although Phred quality score and SNPdetector coverage are highly correlated).
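Call-based coverage tracking of this kind can be sketched as below. The data layout is an assumption made for illustration: each read is represented as a mapping from amplicon position to a call, with None where no call could be made. The function names are hypothetical and do not reflect SNPdetector's output format.

```python
from typing import Dict, List, Optional


def coverage_by_position(reads: List[Dict[int, Optional[str]]],
                         amplicon_length: int) -> List[int]:
    """Count, for each amplicon position, the reads in which a call was
    made. Quality scores are deliberately not consulted."""
    coverage = [0] * amplicon_length
    for read in reads:
        for pos, call in read.items():
            if call is not None and 0 <= pos < amplicon_length:
                coverage[pos] += 1
    return coverage


def fraction_covered(coverage: List[int], min_reads: int = 1) -> float:
    """Fraction of positions covered by at least `min_reads` reads."""
    if not coverage:
        return 0.0
    return sum(1 for depth in coverage if depth >= min_reads) / len(coverage)


if __name__ == "__main__":
    reads = [
        {0: "A", 1: "C", 2: None},
        {1: "C", 2: "G", 3: None},
    ]
    depth = coverage_by_position(reads, amplicon_length=4)
    print(depth)
    print(fraction_covered(depth))
    print(fraction_covered(depth, min_reads=2))
```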


