The diversity panel is a large and widely used collection of DNA samples
from people distributed around the world. Several of our papers have
utilized genotypes from the diversity panel. Here we provide HGDP-CEPH
data exactly as used in these papers.
Note that slightly different versions of our microsatellite and
indel data sets are available on the website of the Marshfield Clinic
Research Foundation. To compare new results on the diversity panel with
those of our previous work, we recommend using the files
downloadable from this site rather than those available in Microsoft
Excel format from Marshfield.
Further information about the microsatellite markers, such as PCR
primers and map positions, is available from
Marshfield.
HGDP+other 2013 microsatellites (645 autosomal microsatellite
loci in 5795 individuals from 267 populations)
(Posted July 17, 2013) HGDP microsatellite data plus data from other major
microsatellite datasets (human and chimp) are now available online for
TJ Pemberton, M DeGiorgio, NA Rosenberg (2013)
Population structure in a comprehensive data set on human microsatellite
variation. G3: Genes, Genomes, Genetics 3: 891-907.
[Abstract]
[Full-text at journal website]
[PDF]
[Supplement]
Download SNP data (you
will be directed first to a registration page, and we would very much
appreciate it if you register)
HGDP 2008 high-resolution genome-wide SNP data (525,910
single-nucleotide polymorphisms and 1428 copy-number variable loci in
485 individuals from 29 populations)
(Posted Feb 26, 2008) HGDP SNP data are now available online for
M Jakobsson*, SW Scholz*, P Scheet*, JR Gibbs, JM VanLiere, H-C Fung,
ZA Szpiech, JH Degnan, K Wang, R Guerreiro, JM Bras, JC
Schymick, DG Hernandez, BJ Traynor, J Simon-Sanchez, M Matarin, A
Britton, J van de Leemput, I Rafferty, M Bucan, HM Cann, JA Hardy,
NA Rosenberg, AB Singleton (2008) Genotype, haplotype and
copy-number variation in worldwide human populations. Nature
451: 998-1003. [Abstract]
[PDF]
Download SNP data (you
will be directed first to a registration page, and we would very much
appreciate it if you register)
HGDP 2006 relatives
(Posted October 17, 2006) We advise anyone working with the diversity
panel to read the following paper, which reports a variety of anomalies
among the diversity panel individuals and recommends standard subsets
for future use.
(Posted November 1, 2005) The following data files, all in plain text
format, were reported in two papers that appeared nearly simultaneously. The
microsatellite markers are drawn from Marshfield screening sets 10, 13,
and 52, and the indels are drawn from Marshfield screening set 100. A
description of how these data files differ from those on the Marshfield
site is in the Ramachandran et al. (2005) and Rosenberg et
al. (2005) papers.
In choosing data files for analysis, note that there are slight
differences between the data used by Ramachandran et al. (2005)
and those used by Rosenberg et al. (2005). In our lab, we use
the Rosenberg et al. (2005) version.
S Ramachandran, O Deshpande, CC Roseman, NA Rosenberg, MW
Feldman, LL Cavalli-Sforza (2005) Support from the relationship of genetic
and geographic distance in human populations for a serial founder effect
originating in Africa. Proceedings of the National Academy of Sciences
USA 102: 15942-15947. [Abstract]
[PDF]
[Supplementary Figure 6]
[Supplementary Table 2]
[Supplementary text]
NA Rosenberg, S Mahajan, S Ramachandran, C Zhao, JK
Pritchard, MW Feldman (2005) Clines, clusters, and the effect of study
design on the inference of human population structure. PLoS
Genetics 1: 660-671.
[Abstract]
[Full-text at journal website]
[PDF]
Population
codes — list of codes used in the structure- and
NEXUS-formatted data files.
Loci
— conversion between "locus names" in the data files and "marker
names" in the files provided by Marshfield. This file also includes
the length of the repeated unit of DNA (2, 3, or 4 base pairs).
Readme
— further description of the five previous files.
History
Created with 377 microsatellites, 22 November 2002
Addition of NEXUS file for 377 microsatellites, 28 December 2002
Minor modifications to site, 30 April 2004
Addition of data on 783 microsatellites and 210 indels, 1 November 2005
Addition of standardized subsets of individuals, 17 November 2006
Addition of SNP data from Conrad et al. (2006), 23 May 2007
Addition of genome-wide SNP and copy-number data, 26 February 2008
Addition of SNP data from Pemberton et al. (2008), 27 June 2008
Addition of sequence properties of microsatellites, 21 January 2010
Addition of data from Huang et al. (2011), Pemberton et al. (2013), and
Szpiech et al. (2013), 17 July 2013
Site substantially modified to improve readability, 17 July 2013