Discussion of Newly-Created Matrix of Genetic Distance
By US 312 Steve Mullinax, August 2010
In this article, I introduce a tool for highlighting significant patterns of relationships
among participants in the Molyneux Surname DNA Project, which you can see on the
new IMFA web site. The figure at right is a matrix of genetic distance between pairs
of project participants. The purpose of the matrix is to make available to the whole
community of Mx family researchers an at-a-glance summary of the DNA test results.
It highlights relationships where traditional genealogical research for a common
ancestor might pay off. These will typically be pairs of participants with a close
genetic distance. On the other hand, pairs of participants with large genetic distance
between them are unlikely to share a common ancestor. See the Family Tree DNA web
page “Understanding Genetic Distance” for an explanation of how they
do distance computations. For now, let it suffice that small values of genetic distance,
zero through five, indicate a “close” genetic relationship, and the
smaller the distance, the closer the relationship. A close relationship means that
there is a high probability that a common ancestor exists within a certain number
of generations. Whether you actually find and document a common ancestor depends
on the diligence and quality of your conventional genealogical research.
The matrix is an experiment on my part. I need feedback on whether the idea is useful
to you, and how it could be made more useful. I would prefer to someday create a
“cladogram” from our results, but have so far not been successful. (The
Fitzpatrick DNA Study has used this genetic mapping technique to create a very informative
ancestry diagram. See Example.)
Anyone with insight on how we could create a cladogram using our DNA results, please
contact me, steve.mullinax@comcast.net.
To create the genetic distance matrix, I copied DNA test results from our project
site’s results matrix dated April 15, 2010, the latest compilation available at
this writing. I removed all participants with fewer than 37 markers tested. This
enables an apples-to-apples comparison of high-resolution results. (It should also
be possible to create a matrix of 25-marker-plus results.) I created a macro in
Excel which compares each possible pair of participants and determines the genetic
distance between them. Genetic distance is defined as the sum of the absolute differences
between the pair of participants’ STR values at each of the 37 markers.
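The pairwise comparison described above can be sketched outside Excel as well. The following Python is only an illustration, not the macro used for this article: the kit labels and marker values are made up, only five markers are shown instead of 37, and the distance is taken as a plain sum of absolute differences. Family Tree DNA's own method may treat some markers differently, so results from this sketch should not be assumed to match theirs in every case.

```python
from itertools import combinations

MIN_MARKERS = 37  # the article keeps only participants with 37-marker results


def genetic_distance(a, b):
    """Sum of absolute differences between two equal-length lists of STR values."""
    if len(a) != len(b):
        raise ValueError("marker lists must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(results, min_markers=MIN_MARKERS):
    """Return {(kit_a, kit_b): distance} for every pair with enough markers."""
    # Drop participants with too few markers, then compare like with like.
    kept = {kit: m[:min_markers] for kit, m in results.items() if len(m) >= min_markers}
    return {
        (a, b): genetic_distance(kept[a], kept[b])
        for a, b in combinations(kept, 2)
    }


# Hypothetical data: four invented kits, five markers each (KIT_D has too few).
sample = {
    "KIT_A": [13, 24, 14, 11, 12],
    "KIT_B": [13, 24, 14, 11, 12],
    "KIT_C": [13, 25, 14, 10, 12],
    "KIT_D": [13, 24, 14],
}
pairs = distance_matrix(sample, min_markers=5)
for pair, d in pairs.items():
    print(pair, d)
```

Each pair is computed once, because the distance from A to B is the same as from B to A.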
Description of the genetic distance matrix: First, be aware that the participants
are in the same order as on the web site’s results matrix. They are grouped by haplogroup.
(Y-chromosome haplogroups are the major divisions of male lines of descent, with
ties to human migration and settlement patterns originating several thousand years
ago. The haplogroups identified among our project participants so far include E,
I and R1b. For more information, see the article
“Human Y-chromosome DNA haplogroup.”) The kit numbers
for participants are in the left-most column. They are repeated, in the same order,
across the top row. The second column gives the name and residence of the participant’s
earliest documented ancestor. The names are repeated in the second row. Note that
the line for the first participant (Kit 152327) is intentionally dropped, as is
the column for the last participant (Kit 102563). This removes a row and a column
which would show nothing other than each of these participants’ relationship
with himself. All relationships for 102563 are shown in the last row of the matrix,
and all those for 152327 are shown in the first column.
There is a stair-step pattern descending from the upper left to the lower right
corner of the matrix. Cells on the lower left of that stair-step indicate a close
genetic distance if they contain a number zero through five. If they contain no
number, they indicate a more distant genetic relationship, and thus are not genealogically
significant. Cells to the upper right of the stair-step are not significant.
The cell at the intersection of one participant (kit #) in a row and another
participant in an intersecting column is the genetic distance between that pair
of participants. (E.g., the intersection of the row for kit #63130 and the column
for kit #152327 shows the number 2, indicating a distance of 2 between these
two participants.)
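For readers who want to reproduce the layout, here is a minimal Python sketch of the stair-step arrangement described above. It assumes the pairwise distances are already known. The kit labels and distances are invented, and the cutoff of five follows the “zero through five” rule stated earlier. The first kit's row and the last kit's column are dropped, and cells holding a distance above the cutoff, or lying to the upper right of the stair-step, are left blank.

```python
CLOSE_MAX = 5  # distances of zero through five count as "close"


def stair_step_rows(kits, dist, close_max=CLOSE_MAX):
    """Build the column headers and rows of the lower-left triangular matrix.

    kits: kit labels in display order.
    dist: {frozenset({kit_a, kit_b}): distance} for every pair.
    """
    columns = kits[:-1]  # the last kit's column is dropped
    rows = []
    for i, row_kit in enumerate(kits[1:], start=1):  # the first kit's row is dropped
        cells = []
        for j, col_kit in enumerate(columns):
            if j < i:  # lower left of the stair-step
                d = dist[frozenset((row_kit, col_kit))]
                cells.append(str(d) if d <= close_max else "")
            else:  # upper right of the stair-step: left blank
                cells.append("")
        rows.append((row_kit, cells))
    return columns, rows


# Hypothetical kits and distances, for illustration only.
kits = ["K1", "K2", "K3", "K4"]
dist = {
    frozenset(("K1", "K2")): 2,
    frozenset(("K1", "K3")): 9,
    frozenset(("K2", "K3")): 0,
    frozenset(("K1", "K4")): 12,
    frozenset(("K2", "K4")): 7,
    frozenset(("K3", "K4")): 5,
}
columns, rows = stair_step_rows(kits, dist)
print("     " + " ".join(f"{c:>3}" for c in columns))
for row_kit, cells in rows:
    print(f"{row_kit:>4} " + " ".join(f"{c:>3}" for c in cells))
```

Every pair still appears exactly once: all relationships for the last kit fall in the last row, and all those for the first kit fall in the first column.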
This matrix pinpoints the close relationships among participants for the surname
project as a whole. It also allows us to focus on close genetic relationships to
see if they are supported by traditional documentation, i.e., the paper trail shared
with IMFA.
Let’s look at a few of the zero-distance pairs in the table below.
| Kit# | Pedigree | Kit# | Pedigree | Common Ancestor identified |
|---|---|---|---|---|
| 46203 | Adam Molyneux (UK014) | 45732 | Greenbury W. Mullinix | None |
| 58548 | Jonothan Mullinix Sr (US285s) | 45732 | Greenbury W. Mullinix | Wayne Straight (US blah) has preliminary research strongly suggesting a connection. (See below.) |
| 58548 | Jonothan Mullinix Sr (US285s) | 46203 | Adam Molyneux (UK014) | None |
After seeing a draft of this article, Wayne Straight researched whether kits 58548
and 45732 might be linked. Kit 58548’s pedigree traces back through Greenberry Mullinix
b. 1771, to Jonothan Mullinix Sr, b. 1705 (England). Wayne wrote, “It appears that
Jim Mx (US301) is one of what Marilyn Blanck calls the ‘Greenberry Mx's’, about
whom there's a lot of documentation. So it shouldn't be an insurmountable task to
do a genealogical comparison. This family is descended from Jonathan Mx1 of Elkridge,
MD, the same family as Don Mulinix (US236).” Wayne and Don collaborated on an article
documenting this family’s history and genealogy, which is posted on the
IMFA Wiki web site. Wayne
also found that “according to several family trees on Ancestry … this Greenberry
[kit 45732’s ancestor] is the grandson of the first Greenberry Mx, via Greenberry
Mx1's son Elisha Mx.” Thus, from Wayne’s research, there is strong preliminary
evidence that kits 58548 and 45732 are linked. Additional research might solidify
this link.
It would also be interesting to research whether these two families might be connected
to the descendants of Adam Molyneux, b ??; d 1726, St. Ebbs, Oxford (kit 46203),
with whom they share a zero genetic distance relationship.
These observations suggest possible actions for research:
- Research to extend the pedigrees.
- Addition of descendant trees to pedigrees.
You can also see that the matrix contains sixteen pairs with a genetic distance
of 1. I could create a table for these pairs, similar to the one above, which would
suggest more avenues for research.
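Lists of pairs at each distance could be pulled out automatically to build such tables. The Python below is a rough sketch under stated assumptions: the distances are already computed and stored per pair, and the kit labels and values are invented. It groups the close pairs by distance so that the zero-distance pairs, the distance-1 pairs and so on can each be reviewed in turn.

```python
from collections import defaultdict


def pairs_by_distance(dist, max_distance=5):
    """Group kit pairs by genetic distance, keeping only the close pairs."""
    groups = defaultdict(list)
    for (kit_a, kit_b), d in dist.items():
        if d <= max_distance:
            groups[d].append(tuple(sorted((kit_a, kit_b))))
    # Sort by distance, then by kit label, for a stable listing.
    return {d: sorted(pairs) for d, pairs in sorted(groups.items())}


# Hypothetical distances between four invented kits.
sample_dist = {
    ("K1", "K2"): 0,
    ("K1", "K3"): 1,
    ("K2", "K3"): 1,
    ("K1", "K4"): 14,
    ("K2", "K4"): 14,
    ("K3", "K4"): 13,
}
close_pairs = pairs_by_distance(sample_dist)
for d, pairs in close_pairs.items():
    print(f"distance {d}: {len(pairs)} pair(s) -> {pairs}")
```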
When we identify pairs which have close genetic distances and for which documentation
already exists, the genetic results add weight to the documentary proof. For example,
kits #48424 and N33141 have a common documented ancestor in “Levi Mullinax, b c
1775, TN; d c Sep 1818, Wilson Co., TN”. The participants’ genetic distance of 3
confirms a close connection, buttressing the documentation. Note that these close
genetic distances do not prove that the documented common ancestor is correct. Only
rigorous analysis of the whole of the genealogical evidence can do that.
Three triangular groupings are apparent in the matrix, consistent with our ordering
by haplogroup: E, I and R1b. Note that there are no close relationships crossing between
haplogroups, which is as we would expect, and gives some validation to the DNA results
and to the matrix.
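This consistency check can also be automated. The following Python is a minimal sketch, assuming each kit has one haplogroup label and that the pairwise distances are already available. The kit labels, haplogroup assignments and distances are invented for illustration. It lists any close pair whose two kits sit in different haplogroups, and an empty list is the expected outcome.

```python
def cross_haplogroup_close_pairs(dist, haplogroup, close_max=5):
    """Return close pairs whose two kits are assigned to different haplogroups."""
    return [
        (a, b, d)
        for (a, b), d in dist.items()
        if d <= close_max and haplogroup[a] != haplogroup[b]
    ]


# Hypothetical haplogroup assignments and distances.
sample_haplogroup = {"K1": "E", "K2": "I", "K3": "I", "K4": "R1b"}
sample_dist = {
    ("K1", "K2"): 21,
    ("K1", "K3"): 22,
    ("K1", "K4"): 30,
    ("K2", "K3"): 1,
    ("K2", "K4"): 25,
    ("K3", "K4"): 24,
}
unexpected = cross_haplogroup_close_pairs(sample_dist, sample_haplogroup)
print("Close pairs crossing haplogroups:", unexpected)
```

Any pair this check did report would be worth a second look, either at the transcribed marker values or at the haplogroup assignment.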
Again, I would appreciate any comments on the usefulness of this approach and how
it could be improved. Steve Mullinax, 503.768.9065.
Thanks to Wayne Straight and Marie Spearman for their comments and contributions
to this article.