The 1000 Genomes Project sequenced 1,092 individuals from 14 populations across Europe, East Asia, sub-Saharan Africa, and the Americas. It discovered 38 million SNPs, 1.4 million indels, and 14,000 large deletions.

Low-frequency variants show substantial geographic differentiation, which purifying selection increases further. Rare variants (below 0.5% frequency) are observed in only a single population 53% of the time. Individuals with substantial African ancestry (YRI, LWK, ASW) carry up to three times as many variants in the 0.5-5% frequency range as Europeans or East Asians. All populations show an excess of rare variants, consistent with recent population growth. Among Europeans, the IBS and FIN samples carry excesses of rare variants that can arise from bottlenecks or admixture. In admixed American populations, variation depends on local ancestry: regions of African ancestry show the highest rate of novel variants (6.2%), while regions of Native American ancestry show the lowest levels of variation but a higher rate of novelty than regions of European ancestry.

Purifying selection is strongest at conserved coding sites, where 85% of nonsynonymous variants are rare. A ratio of 1-2 for rare nonsynonymous relative to synonymous variants suggests that 25-50% of rare nonsynonymous variants are deleterious. At frequencies below 10%, nonsynonymous variants show greater differentiation between populations than synonymous variants. The load of rare variants also varies by pathway: ECM-receptor interaction and DNA replication show an excess of rare variants, and the excess is strongest in Europeans. More generally, Europeans and East Asians show a higher excess of rare nonsynonymous variants in certain pathways than Africans do.

Each individual carries more than 2,500 nonsynonymous variants at conserved sites and about 150 loss-of-function variants. Rare candidates for pathological effects number 130-400 nonsynonymous variants and 10-20 loss-of-function variants per individual. In addition, hundreds of rare non-coding variants disrupt conserved transcription factor motifs.
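To make the "observed in a single population" statistic concrete, the following Python sketch shows one way such a share could be computed from per-population allele counts. It is an illustration under stated assumptions, not the project's actual pipeline: the function names, the input layout, the toy counts, and the choice of 5% as the upper edge of the low-frequency class are all hypothetical. Only the 0.5% rare threshold and the 0.5-5% range come from the summary above.

```python
from collections import Counter


def frequency_class(freq):
    """Bin a pooled allele frequency (bin edges are assumptions for this sketch)."""
    if freq < 0.005:
        return "rare"      # below 0.5%
    if freq <= 0.05:
        return "low"       # 0.5% to 5%
    return "common"        # above 5%


def single_population_share(variants):
    """Return, per frequency class, the share of variants seen in one population only.

    `variants` is a list of dicts mapping a population label to a tuple
    (alternate_allele_count, chromosomes_sampled). This layout is hypothetical.
    """
    seen = Counter()
    private = Counter()
    for counts in variants:
        alt = sum(a for a, _ in counts.values())
        total = sum(n for _, n in counts.values())
        if alt == 0:
            continue  # not variable in this sample
        cls = frequency_class(alt / total)
        seen[cls] += 1
        # A variant is "private" if exactly one population carries it.
        if sum(1 for a, _ in counts.values() if a > 0) == 1:
            private[cls] += 1
    return {cls: private[cls] / seen[cls] for cls in seen}


# Toy data: three populations with 200 sampled chromosomes each (invented counts).
toy = [
    {"YRI": (2, 200), "CEU": (0, 200), "CHB": (0, 200)},        # rare, private
    {"YRI": (0, 200), "CEU": (1, 200), "CHB": (1, 200)},        # rare, shared
    {"YRI": (20, 200), "CEU": (0, 200), "CHB": (0, 200)},       # low, private
    {"YRI": (10, 200), "CEU": (8, 200), "CHB": (0, 200)},       # low, shared
    {"YRI": (100, 200), "CEU": (100, 200), "CHB": (100, 200)},  # common, shared
]
shares = single_population_share(toy)
```

On real data the input would come from per-population allele counts in the call set, and a figure such as the 53% quoted above would correspond to the "rare" entry of the returned dictionary.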