|Title||Investigating single nucleotide polymorphism (SNP) density in the human genome and its implications for molecular evolution.|
|Publication Type||Journal Article|
|Year of Publication||2003|
|Authors||Zhao, Z, Fu, Y-X, Hewett-Emmett, D, Boerwinkle, E|
|Date Published||2003 Jul 17|
|Keywords||Databases, Nucleic Acid; DNA, Intergenic; Evolution, Molecular; Exons; Gene Frequency; Genome, Human; Humans; Introns; Mutation; Polymorphism, Single Nucleotide|
We investigated the single nucleotide polymorphism (SNP) density across the human genome and in different genic categories using two SNP databases: Celera's CgsSNP, which includes SNPs identified by comparing genomic sequences, and Celera's RefSNP, which includes SNPs from a variety of sources and is biased toward disease-associated genes. Based on CgsSNP, the average number of SNPs per 10 kb was 8.33, 8.44, and 8.09 in the human genome, in intergenic regions, and in genic regions, respectively. In genic regions, the SNP density in intronic, exonic, and adjoining untranslated regions was 8.21, 5.28, and 7.51 SNPs per 10 kb, respectively. The pattern of SNP density based on RefSNP differed from that based on CgsSNP, emphasizing that RefSNP is useful for genotype-phenotype association studies but not for most population genetic studies. The number of SNPs per chromosome was correlated with chromosome length, but the density of SNPs estimated by CgsSNP was not significantly correlated with the GC content of the chromosome. Based on CgsSNP, the ratio of nonsense to missense mutations (0.027), the ratio of missense to silent mutations (1.15), and the ratio of non-synonymous to synonymous mutations (1.18) were each less than half of the value expected in a human protein coding sequence under the neutral mutation theory, reflecting a role for natural selection, especially purifying selection.
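The density figures in the abstract are SNP counts normalized to a 10 kb window. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names and the example SNP count and region length are hypothetical, and only the values in `CGSSNP_DENSITY` are taken from the abstract.

```python
# Densities (SNPs per 10 kb) as reported in the abstract for CgsSNP.
CGSSNP_DENSITY = {
    "genome": 8.33,
    "intergenic": 8.44,
    "genic": 8.09,
    "intronic": 8.21,
    "exonic": 5.28,
    "untranslated": 7.51,
}


def snp_density_per_10kb(snp_count: int, region_length_bp: int) -> float:
    """Return the number of SNPs per 10 kb for a region of the given length."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return snp_count / region_length_bp * 10_000


def relative_density(category: str, baseline: str = "genome") -> float:
    """Return the reported density of one category divided by a baseline's."""
    return CGSSNP_DENSITY[category] / CGSSNP_DENSITY[baseline]


if __name__ == "__main__":
    # Hypothetical count and length, for illustration only (not from the paper).
    print(snp_density_per_10kb(snp_count=833, region_length_bp=1_000_000))

    # Compare each reported category against the genome-wide density.
    for category in ("intergenic", "genic", "intronic", "exonic", "untranslated"):
        print(f"{category}: {relative_density(category):.2f} x genome-wide")
```

Calling `relative_density("exonic", "intronic")` gives the ratio of exonic to intronic density. With the reported values this ratio is below 1, consistent with the lower SNP density the abstract reports for exons.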