The study provides clues to medical conditions in people of sub-Saharan African ancestry, and indicates that the migration from Africa in the early days of the human race was followed by a migration back into the continent.
The study, published in the Dec. 3, 2014, online issue of Nature, is the first to use both dense genotyping and whole-genome sequence data to explore the genomic-variation landscape of several African ethno-linguistic groups.
A genotype is a subset of a whole-genome sequence, which is the complete read-out of an individual's DNA. Dense genotyping packs millions of test points onto a microarray, a laboratory test chip that can detect genomic signatures in an individual's DNA. Researchers assemble data from genotyping or from whole-genome sequencing to identify the particular pattern of DNA differences that makes each individual unique.
"The rich genomic diversity in sub-Saharan African populations can offer new insights about disease susceptibility that could easily be overlooked using less-tailored analyses," said Dan Kastner, M.D., Ph.D., scientific director of the NIH's National Human Genome Research Institute. "This is an important study that demonstrates how a one-size-fits-all approach is not always best when it comes to population genomics."
The research team identified novel evidence of how diverse local environmental forces, such as climate and exposure to infectious agents, have shaped the genomes of Africans and their susceptibility to many conditions, including malaria, Lassa fever and trypanosomiasis. The genetic variant frequency in populations from endemic and non-endemic regions suggests that this effect may be in response to the different environments these populations have been exposed to over time.
The researchers have begun contributing genomic data from African populations to international databases, and have proposed African-specific population studies that will take advantage of the rich genomic diversity among sub-Saharan African populations. Despite being among the most genomically diverse people in the world, Africans are underrepresented in genetics and genomics research.
"To date, African populations and their scientists have not participated fully in the ongoing global effort to use genomics to understand human history and health," said senior co-author Charles Rotimi, Ph.D., director of the NIH Center for Research on Genomics and Global Health (CRGGH). "Researchers have carried out only a modest number of genome-wide association studies, or GWAS, using continental African populations, while thousands have been conducted that include populations with mainly European ancestry."
The GWAS approach involves rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. Once new genetic associations are identified, researchers can use the information to develop better strategies to detect, treat and prevent the disease. Such studies are particularly useful in finding genetic variations that contribute to common, complex diseases, such as asthma, cancer, diabetes, heart disease and mental illnesses.
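The marker-by-marker scan described above can be illustrated with a minimal sketch. It assumes a simple allelic chi-square test on a 2x2 table of allele counts (cases versus controls) and the commonly used genome-wide significance threshold of 5e-8. The function names, marker names and counts are hypothetical and are not taken from the study, and real GWAS pipelines also correct for factors such as population structure.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (chi2, p) for a 2x2 allele-count table (1 degree of freedom)."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    denom = row_case * row_ctrl * col_alt * col_ref
    if denom == 0:
        # An empty row or column carries no information about association.
        return 0.0, 1.0
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def scan_markers(counts, alpha=5e-8):
    """Test every marker and return those with p < alpha, smallest p first."""
    hits = []
    for marker, table in counts.items():
        chi2, p = allelic_chi_square(*table)
        if p < alpha:
            hits.append((marker, chi2, p))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical counts: (case_alt, case_ref, ctrl_alt, ctrl_ref) per marker.
toy_counts = {
    "marker_A": (300, 700, 150, 850),  # variant much more common in cases
    "marker_B": (200, 800, 205, 795),  # nearly identical in both groups
}

for marker, chi2, p in scan_markers(toy_counts):
    print(f"{marker}: chi2={chi2:.2f}, p={p:.2e}")
```

In this toy data only marker_A passes the threshold. A real scan repeats the same kind of test across millions of markers, which is why such a strict significance cutoff is used.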
Along with clues about disease susceptibility among sub-Saharan Africans, the researchers also documented patterns of Eurasian-African population mixture and differentiation, echoing thousands of years of human population history. As humans migrated out of Africa, they carried in their genomes a subset of African ancestral genomic variation. The structure of subsequent non-African populations reflected less genomic variation, so that today, populations outside of Africa tend to be less genomically diverse. The presence of Eurasian genomic mixture in present-day African populations provides evidence of reverse migration back to Africa from Europe and other parts of the world, the researchers say.
Dr. Rotimi explained that the few studies that have been conducted among African-ancestry populations have used less efficient genotyping platforms that were designed, for the most part, with European-ancestry populations in mind.
In their study, the researchers generated data from 1,481 individuals from 18 ethno-linguistic groups across seven African countries using a genotyping array that tested 2.5 million sites in the genome. In addition, they included whole-genome sequence data from 320 individuals representing seven ethno-linguistic groups from South Africa, Uganda and Ethiopia. From the whole-genome sequence data, the researchers detected 30 million genomic variants, 24 percent of which were not present in other populations. The detection of previously undetected African genomic variants underscores the importance of sequencing more genomes from Africans.
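To put the percentage above in absolute terms, the short calculation below converts it to a variant count. It assumes the rounded figures quoted in this release (30 million variants, 24 percent), so the result is approximate. The variable names are illustrative only.

```python
# Rounded figures as quoted in the release; the exact study counts may differ.
total_variants = 30_000_000
percent_not_in_other_populations = 24

# Integer arithmetic avoids floating-point rounding noise.
novel_variants = total_variants * percent_not_in_other_populations // 100

print(f"Approximately {novel_variants:,} variants not present in other populations")
```

By this estimate, roughly 7.2 million of the detected variants were not present in other populations.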
The team then added their research information to the available data from the 1000 Genomes Project for subsequent analyses. The 1000 Genomes Project is an international public-private consortium producing an extensive catalog of human genomic variation as a resource for medical research.
"By sequencing the genomes from additional individuals from Africa who were not originally participants in the 1000 Genomes Project, we were able to improve the accuracy of estimating genomic variants missing in the genotyping array," said Fasil Tekola-Ayele, Ph.D., co-lead author and a CRGGH research fellow. "Just two African populations had been represented in the first phase of the 1000 Genomes Project, but we have added genotype information from 18 African population groups and whole-genome sequence from seven groups, each from a specific ethno-linguistic group."
The authors have proposed the development of an African-specific microarray of just 1 million genomic variants, compared to twice that number built into the array platform used in the current study. The array would include novel genomic variants from their study and also reflect the wide genomic diversity in the continent that is not found in Africans alone.
"We can have a very good, inexpensive array that can be used for future genome-wide association studies," Dr. Tekola-Ayele said. "Having this array would aid African researchers with ongoing genomic studies."
The current study is part of the researchers' work on the African Genome Variation Project (AGVP), a collaboration begun in 2011 to map the genomic variation landscape of African populations and to enable design of large-scale GWAS in the region. "The African Genome Variation Project is facilitating the development of local resources for epidemiology and genomic research and is strengthening research capacity, training, and collaboration across the African continent," Dr. Rotimi said.
According to Dr. Sandhu, co-senior author from the Wellcome Trust Sanger Institute in England and the Department of Medicine at the University of Cambridge, "To better understand the genetic landscape of Africa we need to study modern genetic sequences from previously under-studied African populations, along with ancient DNA from archaeological sources."
All data assembled by the African Genome Variation Project will be available to investigators around the world and will be deposited in the European Genome-phenome Archive and the H3Africa Bionet.
NHGRI is one of the 27 institutes and centers at the NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Intramural Research develops and implements technology to understand, diagnose and treat genomic and genetic diseases. Additional information about NHGRI can be found at its website, www.genome.gov.
About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 institutes and centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit www.nih.gov.
Contact
Raymond MacDougall
National Human Genome Research Institute
301-443-3523
macdougallr@mail.nih.gov