Recently, I asked my Facebook followers to send me their questions about DNA testing. One person wanted to know how AncestryDNA determines ethnicity percentages. In particular, she was interested in what regions of the genome Ancestry uses to draw these conclusions.
First, it is important to understand that Ancestry is not actually sequencing a client’s entire genome. The vast majority of DNA would not be informative, because it is the same in all people. Instead, Ancestry determines the client’s DNA sequence only at specific positions that are known to vary among different ethnic groups. These differences, which are scattered all over the genome, are called single nucleotide polymorphisms (SNPs – pronounced “snips”). Determining an individual’s sequence at a variety of SNPs is called “genotyping”.
Ancestry uses SNPs that were originally identified by comparing genomes from individuals of European, East Asian (Han Chinese and Japanese), and West African (Yoruba) ancestry. Since Ancestry wants to be able to recognize other ethnicities too, they had to develop a reference panel of people from a variety of known ethnic backgrounds. To do this, they genotyped people whose ancestors all came from the same geographic region and thus were likely to descend from a single ethnic group. They also incorporated data from the public Human Genome Diversity Project (HGDP), which genotyped individuals from about 50 different populations around the world. When the SNP data from this reference panel was plotted on a graph, it formed clusters corresponding to 26 distinct geographic regions. Ancestry uses these 26 regions to define a client’s ethnicity.
When a client’s DNA is genotyped, the data is compared to the reference panel at 300,000 SNPs (the sites for which the HGDP and Ancestry’s technique both provide information). The most informative SNPs are then subjected to some high-powered statistical analysis. Basically, they calculate the predicted SNP results for all possible proportions of ethnicity and compare those predictions to the client’s actual SNP results to determine which ethnicity combination has the highest probability of producing the client’s results. The “winning” combination is reported to the client as their Ethnicity Estimate.
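For readers who like to see an idea in code, here is a minimal Python sketch of the "try every combination and keep the most probable one" approach described above. It is a toy illustration, not Ancestry's actual algorithm: the two regions, the five allele frequencies and the function names are all made up for the example, and each SNP is treated as a simple count (0, 1 or 2) of copies of one allele.

```python
import itertools
import math

# Hypothetical allele frequencies at five SNPs for two made-up reference regions.
# Real reference panels cover far more SNPs and regions.
REFERENCE_FREQS = {
    "Region A": [0.9, 0.1, 0.8, 0.2, 0.7],
    "Region B": [0.2, 0.8, 0.3, 0.9, 0.1],
}


def proportion_grid(n_regions, steps):
    """Yield every way to split 100% among the regions in increments of 1/steps."""
    for combo in itertools.product(range(steps + 1), repeat=n_regions):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def log_likelihood(genotype, proportions, freqs):
    """Log-probability of the observed allele counts given a mix of regions."""
    total = 0.0
    for i, count in enumerate(genotype):
        # Expected allele frequency at this SNP for the proposed mixture.
        p = sum(weight * freqs[region][i] for region, weight in proportions.items())
        # Each person carries two copies, so the count follows a binomial with n = 2.
        total += math.log(math.comb(2, count) * p**count * (1 - p) ** (2 - count))
    return total


def estimate_ethnicity(genotype, freqs=REFERENCE_FREQS, steps=20):
    """Return the candidate mixture with the highest likelihood."""
    regions = list(freqs)
    candidates = (
        dict(zip(regions, combo)) for combo in proportion_grid(len(regions), steps)
    )
    return max(candidates, key=lambda props: log_likelihood(genotype, props, freqs))


if __name__ == "__main__":
    # A made-up client genotype: allele counts at the five SNPs.
    print(estimate_ethnicity([2, 1, 1, 0, 2]))
```

Checking every combination like this only works for a handful of regions. With 26 regions the number of combinations becomes enormous, so a real analysis would need a smarter search than this brute-force grid.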
Obviously the results of this type of analysis are only as good as the reference panel. Ancestry has already upgraded the reference panel once (they are currently using the V2 panel) and additional improvements are in the works.
The quality of results may also vary depending on the ethnicity of the subject. Because of the SNPs that were chosen, the Ancestry ethnicity test works best for people of European ancestry. However, even some regions within Europe are difficult to distinguish due to migration and population mixing. For example, the regions defined as Great Britain and Europe West show a lot of overlap. Ancestry provides a brief history of each of the geographic regions, highlighting population movements that are likely to have affected the genetic makeup of its inhabitants.
In my personal experience, the geographical regions identified by the Ancestry Ethnicity Estimate match up fairly well with what would have been predicted based on standard genealogy. It is important to check the error bars on each region, since they are often quite large. For example, on one test that showed 6% Great Britain, the actual range is 0-21%. As Ancestry upgrades their reference panel and algorithms, these results are likely to improve.
If you are interested in reading about Ancestry’s Ethnicity Estimate in even more detail, check out the white paper describing their method.
If you have other questions about DNA testing for genealogy purposes, comment below or submit questions through the Contact form on this site. You can also send me a message through the KinSeeker Genealogy Services Facebook page. I will try to answer any questions in a future post.
I just completed a series of blogs about “The Stolen Boy”, an autobiographical story written by my great-great-great grandfather Ambrose Bowen Epperson. In the first post, I discussed a tract of land in Jackson County, Indiana that was patented by Ambrose’s father, John Epperson. The date on the patent (shown below) was 17 Dec 1821 – a year later than Ambrose claimed that his family moved to Indiana – and I mentioned that I would have to check the tract books to learn the actual date of purchase. These were books that each land office used to record transactions involving government land. After a purchase or claim was made, paperwork was sent to the General Land Office in Washington, D. C., where a patent was issued if all requirements were met.
According to the patent, John Epperson purchased his land from the land office in Jeffersonville, Indiana. At the time I wrote my post, I didn’t think that records from this office were available online. However, as I was browsing through the online databases at FamilySearch recently, I discovered that the tract books for many land offices, including Jeffersonville, are available in the collection “United States Bureau of Land Management Tract Books, 1800-c. 1955”. The contents have not been indexed, but the books can be browsed. A Wiki page provides helpful tips for using the collection.
To locate a tract book entry, you need to know the state, the land office and the legal description of the property. The entries are usually grouped by Range, then Township and then Section, but even one range may be scattered across multiple, non-consecutive volumes. Thankfully, FamilySearch provides a Coverage Table that lists the contents of each volume, making it easier to browse for the desired entry.
To find the tract book entry for John Epperson’s land, I began by browsing the images in the collection. This brought up a list of 27 states with digitized tract books, from which I chose Indiana. The available volumes were listed in numerical order, with the name of the land office in parentheses. There was a volume promisingly labeled Index A-Z (Jeffersonville), but it did not seem to include all entries and I decided to move on to the actual tract books. Most of the Indiana land offices had more than 20 volumes, so the search would have been very frustrating without the Coverage Table.
John Epperson’s patent (from the Bureau of Land Management General Land Office Records site) gave me the legal description of his property – the west half of the northwest quarter of section 27 in Township 7N, Range 6E. I used the Coverage Table to narrow down which volumes to check. This table is arranged by state, then by land office and then volume, with a description of the contents for each volume. Most Jeffersonville tract books contained several townships within a single range, so the easiest way to identify my volumes of interest was to scroll down the list looking for range 6E. I then checked the townships in that volume to see if 7N was included. Using this method, I found three volumes that might contain John Epperson’s entry.
Browsing through the potential volumes was still a little confusing. For example, partway through the images for the first book (volume 6), there was a second volume 6 book cover and the entries suddenly changed to range 5E. Volume 7 seemed to pick up where the first volume 6 left off, but when I finally got to township 7N, only one section was listed before the entries moved on to range 7E! Luckily, at the top of that page, in tiny script, was a note saying that the rest of the sections from township 7N, range 6E were in Volume CC, beginning at folio 707 (volume CC was on my list, but I had not checked it yet). The note was very helpful in finding the correct township within volume CC, since the folio number was essentially the page number, printed in the upper right corner of each two-page spread.
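If you keep your own notes from the Coverage Table, the "look for the range, then check the townships" step can be sketched in a few lines of Python. The volume labels and township lists below are placeholders I invented for illustration, not the real contents of the Jeffersonville volumes, and the names `VolumeCoverage` and `candidate_volumes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeCoverage:
    """One Coverage Table row: a volume, a range, and the townships it covers."""
    volume: str
    range_: str
    townships: tuple


# Placeholder rows for illustration only; a volume that covers more than one
# range simply gets more than one row.
COVERAGE = [
    VolumeCoverage("Vol X", "5E", ("5N", "6N", "7N")),
    VolumeCoverage("Vol Y", "6E", ("4N", "5N", "6N")),
    VolumeCoverage("Vol Z", "6E", ("7N", "8N")),
]


def candidate_volumes(coverage, range_, township):
    """Return the volumes whose listed contents include the given range and township."""
    return [
        row.volume
        for row in coverage
        if row.range_ == range_ and township in row.townships
    ]


if __name__ == "__main__":
    print(candidate_volumes(COVERAGE, "6E", "7N"))
```

As my own search showed, the table only narrows the list. Entries can still spill over into another volume, so notes written in the books themselves are worth watching for.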
After locating the correct township, I paged through the sections (which were in numerical order) until I found section 27. There is often more than one entry per page, sometimes for different sections, so be sure to check all entries when you get close to the section you are looking for. I located John Epperson’s entry (see below) at the top of image 185 in Book CC (Jeffersonville). The entry confirmed that John purchased 80 acres at the minimum price of $1.25 per acre, making the total cost of the land $100. Most importantly, it gave an entry date of 12 Oct 1820. This substantiates Ambrose’s claim that the Epperson family moved to Jackson County, Indiana in 1820. From browsing through the entries, it seems that a delay of a year or more between the entry date and the patent date was common at the time.
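The acreage and cost in the entry can be checked against the legal description with a little arithmetic. The sketch below assumes a standard section of 640 acres (one square mile). The function name `aliquot_acres` is just something I made up for the example.

```python
from fractions import Fraction

SECTION_ACRES = 640  # assumption: a standard full section is one square mile


def aliquot_acres(*parts):
    """Multiply a full section by each fractional part in the legal description."""
    acres = Fraction(SECTION_ACRES)
    for part in parts:
        acres *= part
    return acres


# West half (1/2) of the northwest quarter (1/4) of section 27.
acres = aliquot_acres(Fraction(1, 4), Fraction(1, 2))

# Minimum price of $1.25 per acre, written as an exact fraction.
price_per_acre = Fraction(5, 4)
total_cost = acres * price_per_acre

if __name__ == "__main__":
    print(f"{acres} acres at ${float(price_per_acre):.2f} per acre = ${float(total_cost):.2f}")
```

Using fractions keeps the numbers exact, which is handy when a description stacks several halves and quarters.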
"United States Bureau of Land Management Tract Books, 1800-c. 1955." Jeffersonville, Indiana Land Office, Volume CC, image 185 FamilySearch. http://FamilySearch.org : 14 June 2016. Bureau of Land Improvement. Records Improvement, Bureau of Land Management, Washington D.C.
I had previously found another patent issued to John Epperson for land in section 21 of township 7N, range 6E. The date on this patent was 4 Oct 1824, more than a year after John died. Since the township and range were the same as John's first land entry, this entry (shown below) was located only a few pages away. I learned that the land was actually purchased 9 Apr 1822, about a year BEFORE John’s death. This brings up an important point – land patent dates should not be used as evidence that the patentee was alive at the time!
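To see how long the gap was between entry and patent for each of John's tracts, the dates can be subtracted with Python's standard `datetime` module. This is only a small sketch using the four dates mentioned in this post. The helper name `delay_days` is my own.

```python
from datetime import date


def delay_days(entry_date, patent_date):
    """Number of days between the land office entry and the patent."""
    return (patent_date - entry_date).days


# Section 27: entered 12 Oct 1820, patented 17 Dec 1821.
first_delay = delay_days(date(1820, 10, 12), date(1821, 12, 17))

# Section 21: entered 9 Apr 1822, patented 4 Oct 1824.
second_delay = delay_days(date(1822, 4, 9), date(1824, 10, 4))

if __name__ == "__main__":
    for label, days in (("Section 27", first_delay), ("Section 21", second_delay)):
        print(f"{label}: {days} days ({days / 365.25:.1f} years)")
```

Both gaps come out to more than a year, which fits what I saw in the other entries.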
"United States Bureau of Land Management Tract Books, 1800-c. 1955." Jeffersonville, Indiana Land Office, Volume CC, image 174 FamilySearch. http://FamilySearch.org : 14 June 2016. Bureau of Land Improvement. Records Improvement, Bureau of Land Management, Washington D.C.
If I wanted to learn even more about these land purchases, I could request the case files from the National Archives. The price is a little steep at $50 per case file, but some files reportedly contain genealogical information. I would be particularly interested to find out if the case file for John’s 1822 land entry contains any information about his death.
If you have ancestors who purchased or claimed government land, you might want to check out this collection. You could learn the actual dates of their land transactions and how much they paid for the property. Having the actual entry date will also be helpful if you ever want to request the case file for that land entry. If you make any interesting discoveries in this collection, let me know by commenting below.
Teresa is the owner of KinSeeker Genealogy Services. She has a Ph.D. in Biology and a lifelong fascination with genealogy. She has been researching her own family history for over 20 years and loves helping others "find their stories."
Please visit the KinSeeker Genealogy Services Facebook page
This blog is owned by Teresa Shippy. Content may not be copied without permission.
©2016, copyright Teresa Shippy