Catching Up With Dr. Ketchum’s Bigfoot Research

Chastising the scientific community for not peer-reviewing one’s paper on Bigfoot? Be careful what you wish for, cryptozoologists, this can be a BAD IDEA. – J. Vaeni


Review of “Novel North American Hominins, Next Generation Sequencing of Three Whole Genomes and Associated Studies,” Ketchum et al., 2012

by Guest Blogger,
D. Ellen K. Tarr, Ph.D.
Associate Professor
Midwestern University
Glendale, AZ

When this paper first came out, I was too busy to review it and thought that by the time I was able to write a review, there wouldn’t be anything left to say. However, defense of the article continued longer than I thought it would, and some criticized the scientific community for their unwillingness to peer-review the paper. This is a valid criticism, and my colleague, Dr. Tyler Kokjohn, has pointed out that there are members of the scientific community that would have been willing to review the paper. For non-scientist defenders of Dr. Ketchum’s work, I would like to point out that this is not a trivial undertaking. The specifics of the review format vary by journal, but in general, a review includes a list of major changes that need to be made before the manuscript can be accepted, minor changes that should be made, and discretionary changes that are basically suggestions to the authors. Since I’m not officially reviewing this for a particular journal, what follows is a more informal presentation of my assessment of the manuscript than I would usually give. My goal is to present my criticisms as clearly as possible so that even non-biologists will be convinced that this manuscript is simply not suitable for publication in its present form. I am not an expert in many of the techniques discussed, and I clearly point out the limitations of my ability to comment on various aspects of the study. For convenience, the organization of the review parallels the organization of the manuscript.


The paper begins with a brief history of Sasquatch/Bigfoot reports and a consensus description. I’m not going to dispute the assertion that large, hirsute, human-like creatures have been described throughout human history. I’m also not going to attempt to determine whether these descriptions all refer to the same species. If there had not been a large number of reports with significant similarities, there wouldn’t be a community of individuals devoted to this topic and I wouldn’t be reviewing this paper. While the existence of Sasquatch/Bigfoot reports is undeniable, the authors state, “The failure to present a deceased individual or skeletal remains has exacerbated scientific skepticism towards the circumstantial evidence.” Yes, it certainly has, but this manuscript isn’t aimed at filling this gap in the evidence. The introduction would benefit from a clear explanation of what gaps in the evidence the authors intend to address and what contribution they intend to make to the field.

The authors go on to say, “There are reports from witnesses who allegedly live in close proximity to, and interact with, Sasquatch family units,” and this certainly sounds promising. Naturally, I checked the reference (number 14), which turns out to be a website for “Native American Sasquatch Names” (

Uh, what?

You tell me there are people actually having contact with Sasquatch families and then send me to a list of names? I’m sure I’m not alone in thinking that “live in close proximity to, and interact with, Sasquatch family units” implies that there is an ongoing, friendly (or at least civil) relationship between the humans and Sasquatch. Maybe they get together for games of Charades, or Twister, or Cards Against Humanity. Perhaps Mrs. Sasquatch occasionally asks to borrow a cup of sugar. I don’t know, but a list of Native American names for something that could be interpreted as Sasquatch certainly does not provide evidence for the kind of interaction implied by the statement. I am not an expert in Native American traditions, and I accept that many tribes have stories about, and possibly interact with, beings that could be described as Sasquatch, but a list of names is not sufficient to establish that. The authors simply fail to adequately support their statement. It might be nice to see some additional information from the individuals who submitted samples 130-136, which are listed as coming from Sasquatch named Fox, Squeaky, Nicki, Toby, and Wenebojo (Table 1). The pdf also shows an image of a supposed female Sasquatch asleep in the woods, but all you can see is a furry figure lying in a heap. The supplemental video seems to show the Sasquatch breathing, but there is no way to rule out that this is simply a human in a costume.

Next, we have the statement that “approximately one hundred and thirteen samples… were submitted by dozens of individuals and groups from thirty four separate hominin collection sites around North America,” and we are directed to Table 1. The first problem I have with Table 1 is that there are two sets of numbers. The left-hand column is numbered 1-111, and while 111 is “approximately 113,” it would have been better to say that “one hundred and eleven samples were submitted.” To complicate things, the fifth column is labeled “Sample #, Type.” These numbers begin at 1 and end at 168, but they are not continuous, and some numbers cover multiple samples that are differentiated by adding letters (e.g., 119a-d). The two sets of numbers can be confusing when attempting to identify a sample that is mentioned in the text. The gaps in sample numbering imply that some samples were excluded from the table. It isn’t necessarily a problem that they were excluded, but the reason for their exclusion is not clear. Table 1 needs to be revised to include all samples that were submitted, reasons for exclusion (for samples that were excluded), and the data available for each sample that is included. For example, I currently have no way to easily determine if DNA sequence was obtained for Sample 8. The tests performed and the available data for each sample should be clearly presented.

Materials and Methods (provided as a supplementary file)

Hair Analysis

I don’t know anything about hair analysis, but I was interested in the statement, “Only hairs that were not human in appearance and could not be identified as any other species were utilized in this study.” How many of the original samples that were submitted were excluded because they could be identified? Although I realize further characterization of these isn’t warranted, including the information would be useful for several reasons. First, it would add credibility by showing that the study methodology is successful in identifying hair from known sources. Given the large number of samples submitted, some cases of mistaken identity are to be expected, and I for one would like to see some evidence that these researchers are able to identify things like dog hair. Second, it would provide information about which types of hair are most likely to be misidentified as Sasquatch hair. Third, while many individuals participating in the study have honest intentions, consistent submission of easily identifiable hair by the same individual or group may be a red flag.

The rest of the “Hair Analysis” section initially seems fairly detailed, but as far as I can tell, the data are not presented. “Hairs were initially examined with an illuminated magnifying lens to observe characteristics such as coarseness, hair shaft profile and color.” By my count, 98 of the samples listed included hair, which implies that “coarseness, hair shaft profile and color” should have been recorded for 98 samples. Again, I don’t know anything about hair analysis, but this information should be available as a supplementary file. The rest of the “Hair Analysis” section mentions “select hairs.” It isn’t clear if this is referring to a few individual hairs taken from each sample or if only some of the samples “were examined for a variety of microscopic features such as: medulla, pigmentation, cortical fusi, ovoid bodies, cuticle, and root and tip characteristics.” Hello, you’re trying to prove you have Sasquatch hair. Being thorough matters! Hair from every sample needs to be examined and photographed. I know it’s a lot of work, but you’re trying to convince me that at least some of these samples came from Bigfoot. I’m sorry, but going the extra mile isn’t optional for you.

DNA Isolation

This section looks fairly standard, and the precautions taken to rule out DNA contamination from the individuals collecting and handling the samples seemed sufficient.

Gender and Short Tandem Repeat (STR) Analysis

Most of this section seems standard, but since I am not an expert in STR analyses, I would defer to someone more experienced if they felt there were additional details that needed to be included in this section. In addition to the PowerPlex® 16 kit, this section mentions the AmpFLSTR Yfiler kit for 17 Y-STR loci. As far as I can tell, no Y-chromosome STR results are presented in the paper. The authors should either add results from this analysis or delete the Y-STR information from the methods.

Electron Microscopy

I have no experience with electron microscopy and therefore, no basis for evaluating whether the detail given here is sufficient. However, this is another section that mentions “selected samples,” but with no criteria given for how samples were chosen. I think this is also the only mention of Sample 26 being “obtained from a tissue sample collected after the shooting of an unknown hominid.” What? Someone is trying to shoot Sasquatch? Perhaps there should be some discussion about appropriate methods of sample collection.

Mitochondrial DNA Sequencing

This section mentions species identification as one of the goals, but it is unclear if any primers other than those corresponding to human DNA were used. The authors should clarify what attempts were made to identify species other than human.

Nuclear DNA Analysis

This section mentions the loci that were sequenced and that SeqWright designed proprietary primers for several of these loci. It is customary to provide primer sequences, and although some are listed in Supplemental Data 12, they are not all provided, which would make it more difficult if other researchers wanted to use the same protocol to analyze other potential Sasquatch samples.

Whole Human Genome SNP analysis

Since the samples were sent away for the SNP testing, the limited amount of information presented here is reasonable. I would have liked DNA concentrations to be given somewhere in the manuscript, either here or when results are presented.

Whole Genome Sequencing

Again, the limited information presented is suitable given that the samples were sent away for whole genome sequencing. I find the sentence, “Custom scripts were used to extract reads showing good alignment to specific chromosomes of the reference genome” to be problematic. The “specific chromosomes” are not listed and there is no rationale for why they were chosen. The parameters for the BLAST searches seem reasonable, but this section also states that the tree view was used to generate “phylogeny trees.” A tree view of search hits is a quick visualization, not a phylogenetic analysis, and relying on it demonstrates a clear lack of familiarity with the programs that are considered suitable for that purpose.

Supplementary Sequence Information

I was initially somewhat excited about the raw sequence data. Having sequences in GenBank would have been better, and I’m not convinced that GenBank personnel wouldn’t have worked something out to get the sequences submitted. Needless to say, I was disappointed at how little sequence was included. Supplementary File 3 includes two myosin sequences, seven TAP sequences, and seven amelogenin sequences. The other supplementary DNA sequences are for the contigs generated from Samples 26, 31, and 140. Considering the large number of samples and how many sequences were reportedly generated, the small number of sequences provided leads me to suspect that independent analysis would not suggest the sequence was from Sasquatch.

Examination of Novel Hair Samples

So, let’s look at the hair analysis that is presented in the paper, most of which is summed up in Figure 5. We have macroscopic pictures of several hairs from Sample 33, microscopic pictures of the medulla and internal structure of Sample 26, the spade-shaped root of Sample 18, and the cuticle pattern of Sample 26. As a non-expert, I’m not sure why these particular samples were chosen or why the same sample couldn’t be used to show everything. The authors state that, “Most of the submitted hairs were not microscopically consistent with any of the hairs from the reference collection of common animal hairs…” and I’m again left wondering how many hairs were identifiable and what they were. It really isn’t good enough to make that statement without presenting the data that supports that conclusion.

The last paragraph of the main text that addresses the hair samples describes which samples were considered appropriate for DNA testing. This paragraph again mentions that DNA testing was done on hairs that could not be identified, and we have the recurring lack of information regarding which samples were ruled out. This paragraph also implies that samples underwent either nuclear DNA testing or mitochondrial DNA testing, but results presented in the manuscript suggest that some samples underwent both tests. Again, a section of Table 1 that clearly indicates what tests were performed for each sample would clarify this.


Collection and Classification of Hominin Samples, Prevention of DNA Contamination by Forensic Methodologies

The main text goes on to discuss collection, classification, and measures taken to prevent contamination. As with hair analysis, I am not experienced in forensic science, but the measures they took seem reasonable to me. However, the possibility of contamination will probably always be in the back of my mind because samples were generally collected under less than ideal conditions. For example, Sample 8 was hair taken from a dumpster, Sample 9 was hair caught on chicken wire of a “peacock cage where peacocks were killed and plucked,” (thanks for that image, by the way), and there were numerous examples of hair caught on trees, fences and hair traps. Other than hair, Sample 35 was a large toenail found in a dry creekbed, Sample 36 was saliva from a mauled camera, and Samples 137 and 140 were blood samples from a cage where a rabbit was killed and a damaged downspout, respectively. The potential for pre-collection contamination seems high, but I would defer to the opinion of someone skilled in DNA analysis from these kinds of samples as to whether proper protocols were followed to rule out that kind of contamination. I will also leave technical issues with DNA isolation from hair samples, etc., to someone more qualified to address them. For the purposes of my review, I will assume appropriate DNA isolation protocols were used.

Determination of DNA Quality

The “Determination of DNA Quality” section includes a picture of extracted DNA run on an agarose gel (Figure 7). I have several problems with this figure. First, the only labels are the wells, and there is nothing explaining which sample numbers correlate with lanes 1-24. Second, I have no idea what percentage gel this is or what ladder they used. Third, the gel is not run out very far and it looks as though the DNA may still be in the wells. I understand the DNA will be large, but pictures I usually see have run the gel far enough to at least get the DNA out of the wells. In addition to the gel, I would have liked to see some quantitative information regarding yield. Quantitation by nanodrop was performed for at least some samples, but I didn’t see anything presented regarding typical yields from the DNA extractions.

The next paragraph discusses histology of tissue found for Sample 26. Again, I have no expertise in this area, so I am accepting the conclusion that “the histology was deemed inconsistent with human skin.” However, I would like to see the authors address a couple of questions: (1) Did the histological findings suggest another species? and (2) Are there conditions under which human tissue might present as seen here? Table 1 provides little information about the specific circumstances under which this sample was obtained, and it was received anonymously by the person who submitted it (understandable since whoever it is may be the person who shot it). This means there is very little known about the conditions it was exposed to before being received for the study.

Screening of the Hominin DNA Samples

This section describes preliminary screening of the samples, and most of the results are discussed in greater detail later. The mitochondrial DNA results were 100% human, while many samples gave incomplete PowerPlex® 16 profiles. The primary conclusion from the screening was that mitochondrial, but not nuclear, DNA was consistent with human. However, I think the conclusion that the nuclear DNA “did not conform to human DNA” is not well supported, and will elaborate more in subsequent sections.

Testing and Results

Mitochondrial complete genome and HV region sequencing results

The next data presented are mitochondrial DNA haplotypes (Table 2 and Supplemental Data 2). This seems pretty straightforward, but haplotypes are given for Samples 4 and 71, which are not found in Table 1, and data for these samples are presented in Supplemental Data 2. This inconsistency needs to be addressed since data shouldn’t be presented for samples that don’t exist in the study. In Table 2, Sample 18 is labeled with an asterisk, but I couldn’t figure out what this meant. There needs to be a footnote explaining this notation. In Supplemental Data 2, it isn’t clear if the changes are relative to the Cambridge Reference Sequence (CRS) or to the haplotype given. It also isn’t clear why some changes are listed in red font. The authors should clarify these points so the reader isn’t left guessing.

The mtDNA data suggest the samples are human in origin, but the authors need to revise the statement, “no mitochondrial DNA homology with apes, Neanderthal or Denisova cave sequences were found.” What they intend to say is that none of the samples showed greater similarity to mtDNA from these other species, i.e., all the sequences obtained from “Sasquatch” samples were most similar to other human sequences. The current wording, however, unintentionally implies that human mtDNA has no homology with that of apes, Neanderthals, or Denisova hominins, which clearly doesn’t make sense. The authors need to revise this to avoid giving the impression that they have no understanding of evolution whatsoever.

The authors identified 16 mitochondrial haplotypes from the samples (predominantly European and Middle Eastern) and use this to argue these hominins may have migrated before the migration across the Bering land bridge. I don’t know much about the Solutrean Theory and the evidence for or against it, but I’m not convinced early migration is the best explanation for the diversity of mtDNA haplotypes. As I understand the situation, our data thus far is conflicting: hair that seems non-human and completely human mtDNA haplotypes. My inclination would be to trust the DNA evidence and see if there is an explanation for why human hair might not look like human hair.

Mitochondrial DNA sequences from three samples (26, 31, and 140) were used to generate phylogenetic trees. I have seen some less than stellar trees before, but these are some of the worst I have seen. The font is so small the labels aren’t legible. There is no information provided about the other sequences used and how they were chosen, the method used to generate the tree (distance) and accompanying parameters, or measures of support for any of the branches presented on the figure. Based on what I can see by zooming in, this is a distance tree that includes only human mtDNA sequences and may be rooted with the sequence from the study sample. Without an outgroup, this tree doesn’t show that the sample sequence is more related to human mtDNA sequences than to mtDNA from other species, only which of the various human sequences are the closest matches. These trees were supposedly generated with sequences from FamilyTree DNA, a company that specializes in human DNA testing for ancestry/genealogical purposes, which explains why there are no non-human sequences. In general, people aren’t concerned with establishing that they are more related to humans than other primates (there may be exceptions), so a tree generated relative to the submitted sample makes more sense because the question is different. For ancestry purposes, you are looking to see what sequences in the database are the closest matches to your sequence of interest (the one submitted by the customer who paid somewhere between $49 and $837) rather than looking to see where the submitted sample fits into an already established phylogeny (such as how a potential Sasquatch sequence is related to primate sequences). Phylogenetic trees have to be generated with attention to the question they are addressing. Failure to do this often results in inappropriate choices regarding the sequences, alignment parameters, and the method of tree generation with its associated parameters. 
Ultimately, this can lead to generation of a tree that doesn’t mean anything in the context of the study, as seen here. To show that the sample sequences are more related to human mtDNA than to mtDNA from other species, at least one non-human mtDNA sequence needs to be included in each tree. Ideally, several other primate sequences would be included. The BLAST results alone are probably sufficient to establish that the mtDNA is human, so the authors should consider whether the trees are even necessary. If they choose to retain the trees, they need to be redone to be relevant to this study.
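To make the outgroup point concrete, here is a minimal sketch in Python. It is not how the authors built their trees and it is not a phylogenetic analysis; it only ranks sequences by simple pairwise distance. The sequences are short invented strings, and the function and variable names are my own. With a human-only panel, the ranking can only say which human sequence is closest. Only when a non-human sequence is in the panel can the sample be shown to sit closer to human than to something else.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def rank_by_distance(sample: str, panel: dict) -> list:
    """Return (name, distance) pairs sorted from closest to farthest."""
    pairs = [(name, p_distance(sample, seq)) for name, seq in panel.items()]
    return sorted(pairs, key=lambda pair: pair[1])


# Invented toy sequences for illustration only; not data from the paper.
sample = "ACGTACGTAC"
human_only = {"human_a": "ACGTACGTAC", "human_b": "ACGTACGTAT"}
with_outgroup = dict(human_only, outgroup="ACGAACTTAT")

# Human-only panel: tells you the closest human match, nothing more.
print(rank_by_distance(sample, human_only))

# Panel with an outgroup: now "closer to human than to X" is testable.
print(rank_by_distance(sample, with_outgroup))
```

A real analysis would use full-length aligned mtDNA, several primate outgroups, an appropriate substitution model, and branch support values.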

Nuclear DNA Analysis


I wasn’t familiar with amelogenin and how it tends to be used, but I don’t have to be to see some problems with how the data from this are presented. As I’ve stated before and will keep coming back to, there should be somewhere (Table 1 is a good place, but another table could be generated) that clearly states what tests were performed on each sample. Table 3 presents the results of amelogenin STR analysis. There is again a problem with having data here for samples that are not found in Table 1 (Samples 4, 5, 7, 13, 19, and 30). There doesn’t seem to be any information for Samples 39-129 and 131-168. I initially assumed that these excluded samples simply did not provide enough DNA for analysis, but then I looked at Table 4, which includes many samples not found in Table 3. The authors should clarify how samples were chosen for each test. Table 4 also includes data for Samples 71 and 72, which are not found in Table 1, and there are two sets of results shown for Sample 106. By my count, some amelogenin sequence data should be available for 37 samples, and should be available for 12 if we limit it to just those that gave sequence annotated as “Unknown,” but S3 includes only seven: AmelX for Sample 26, AmelX and AmelY for Sample 35, and two Amel sequences each for Samples 43 and 44 (these are not labeled X or Y). I find this somewhat confusing since Table 4 shows that the PCR or sequencing failed for AmelX for all four of these samples. Where did these sequences come from?
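This kind of bookkeeping problem is easy to catch before submission. As a rough sketch, assuming the sample IDs from each table have been typed into plain lists, a few lines of Python will report every ID that appears in a results table but not in the master table. The ID lists below are placeholders I made up, not a transcription of Table 1 or Table 3, and the function names are mine.

```python
import re


def sample_key(sample_id: str) -> tuple:
    """Sort key so that '39b' sorts after '39' and before '40'."""
    match = re.fullmatch(r"(\d+)([a-z]*)", sample_id)
    if match is None:
        raise ValueError(f"unrecognized sample ID: {sample_id!r}")
    return (int(match.group(1)), match.group(2))


def missing_from_master(master_ids, table_ids) -> list:
    """IDs that a results table reports but the master table never lists."""
    return sorted(set(table_ids) - set(master_ids), key=sample_key)


# Placeholder ID lists for illustration; not taken from the manuscript.
master_table = ["1", "2", "3", "26", "39a"]
results_table = ["2", "4", "26", "39b", "3"]

print(missing_from_master(master_table, results_table))
```

Running the same check for every results table against the master table would flag each orphaned sample in one pass.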

I conducted a fairly minimal search using the Google machine and found a paper on validation of the PowerPlex® 16 BIO, which is a kit optimized for different analysis systems but uses the same primers as the PowerPlex® 16 kit. This paper mentions cross-species hybridization of DNA from some primates with the amelogenin primers, and that PCR products from some mammals migrate faster than the human amelogenin X allele (Greenspoon et al., 2004). While not conclusive, this provides some possible explanations for the observed amelogenin results.

Supporters of this work will claim that I’m only pointing out minor organizational details, while the fact that the sequences found don’t match anything in the databases is the important point. First, if I were peer-reviewing this manuscript, inconsistencies in data presentation are certainly something I would point out so that the authors could correct errors prior to publication. There shouldn’t be any ambiguity about what tests were performed for each sample and what the results are. Second, just because a sequence doesn’t match what’s in the databases does not mean it is a novel sequence that has any value. There are a number of PCR and sequencing artifacts that could lead to this type of problem, and even when measures are taken to rule some of these out, it is more likely that something went wrong during the process than that the result is genuine sequence that can now be called “Sasquatch” sequence. With sequence that doesn’t match anything, you can really only draw conclusions about what it isn’t, not about what it is.

Promega PowerPlex® 16

Table 5 shows results from the PowerPlex® 16 kit, which is designed for amplifying short tandem repeats simultaneously from 16 loci of human DNA. I wasn’t familiar with this kit, but scrolling through a manual I found, again thanks to Google, I noticed a sentence marked with a big “!” that said, “Failure to vortex the PCR amplification mix sufficiently can result in poor amplification or locus-to-locus imbalance.” Hmmm… The authors state, “All samples that yielded results for PowerPlex® 16 gave only partial profiles with random dropout of alleles, off ladder alleles, and/or allele frequencies for some markers inconsistent with those found in the human population.” This certainly provides a potential explanation, and although there were controls, the study samples were probably not in ideal condition to begin with, which could have made them more susceptible to something like less than complete mixing. In addition, validation of the PowerPlex® 16 and PowerPlex® 16 BIO kits found that insufficient DNA and DNA from samples exposed to higher temperatures (80°C for PowerPlex® 16, 56 and 80°C for PowerPlex® 16 BIO) for prolonged periods of time often gave only partial profiles (Krenke et al., 2002; Greenspoon et al., 2004).

It wasn’t entirely clear, but it seemed that Table 3 was presenting PowerPlex® 16 amelogenin results, which means the amelogenin results from Table 3 and Table 5 should match, right? Or not. Table 3 includes 33 samples, with 12 failing to give a result. Table 5 has 13 samples, with Sample 3 listed twice (with different results, the second one is probably intended to be Sample 33), 6 samples showing different results between the two tables (Samples 7, 10, 23, 30, 34, and 130), and one sample found on Table 5 that was not included in Table 3 (Sample 25 may be intended to be Sample 26). The rest of the table reports alleles found at the loci tested, which as the authors state, are partial profiles for all samples. As to the allele frequencies being inconsistent with those found in the human population, I don’t think the small sample size really allows us to draw that conclusion. Some of the alleles in the control reactions are rare alleles (including allele 10 at the D13S317 locus that is marked as being inconsistent with human populations), so I don’t think much weight should be given to this. I also don’t agree with the authors reporting “PP16 Human Frequencies” that are actually frequencies for Caucasian-Americans. Promega lists frequencies for Caucasian-American, African-American, Hispanic-American, and Asian-American populations ( identity/population-statistics/allele-frequencies/#D1S1656), but the authors chose to report the frequency listed for Caucasian-Americans as the “human frequency.” Apparently, Sasquatch aren’t just part human, they’re assumed to be of Caucasian descent. This seems narrow-minded.

Melanocortin 1 Receptor Gene (MC1R)

I was somewhat confused by the presentation of data for the melanocortin 1 receptor (MC1R) gene. The authors state, “Samples 28, 33, 35, and 37 had sufficient DNA extracted and were chosen for MC1R locus sequencing.” This implies that this locus was tested for only four samples, but the next paragraph has, “Samples 28, 35, and others were then sent to SeqWright to have the sequences confirmed with the design of new MC1R primers.” Table 6 includes thirty samples, 19 of which sequenced well enough to give genotype information at several nucleotide positions of the MC1R gene, one sample that gave sequence corresponding to human chromosome 5, and ten samples that resulted in eight unique, unknown sequences. As with previous tables, it includes some samples that are not shown in Table 1 (Sample 4 and Sample 39b). I expected the eight “unknown” sequences to be included in the supplementary file containing raw sequence data, but no MC1R sequences are included in this file. The authors should include these sequences in the supplementary data. It also isn’t clear how polymorphic this locus tends to be in the human population and the extent to which human polymorphisms have been studied, so the context for the findings in this study is not presented clearly.

Myosin 16 Heavy Chain (MYH16)

The authors also sequenced the heavy chain of myosin 16, and found that “all samples that successfully amplified yielded results consistent with human and aligned with the human reference sequence,” but we are given no information about which samples amplified successfully and which failed. Only two myosin sequences are included in the supplementary file (Samples 35 and 37). The authors need to clarify which samples were sequenced successfully and which were included in the “small number of samples” that SeqWright reported Exon 18 sequence for. I also recommend that the authors clarify their use of the word “ape.” Strictly speaking, I’m pretty sure humans are considered apes, so the species used for designing the “ape” primers should be clarified. It is possible the authors intended to convey that primers had been designed that would amplify myosin from any ape, including humans, but this was not at all clear.

Antigen peptide transporter (TAP1)

There are TAP1 sequences from seven samples included in Supplementary Data S3 (Samples 10, 26, 33, 35, 39b, 43, and 44). BLAST searches identify Samples 26, 35, and 39b as human TAP1 (yes, I verified that myself). According to the manuscript, Samples 33 and 44 aligned with each other, and I was able to align the reverse complement of the sequence from Sample 44 with the five-prime end of sequence from Sample 33. This sequence didn’t have any similarity to sequences in the GenBank nr or est databases, although the first 21 nucleotides corresponded to TAP1 from human and other primates and may represent the primer. The three-prime end of the sequence from Sample 33 aligns with human mitochondrial sequences. The manuscript also states that the sequences from Samples 10 and 43 align, but I couldn’t make this happen. The last 18 nucleotides of the sequence from Sample 43 align with TAP1, so again, it’s possible this is primer sequence. Although the last 65 nucleotides of the sequence from Sample 10 have no similarity to database sequences, a BLAST search showed that the first ~180 nucleotides matched sequences from dog. These searches were done using default settings of the megablast blastn program, so the authors should have had no trouble determining the sequence was likely from dog.
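For readers wondering what “align the reverse complement” involves, here is a minimal Python sketch. Two reads from opposite strands of the same DNA will only match once one of them is reversed and complemented. The sequences below are short invented strings, not the reads in Supplementary Data S3, the function names are mine, and exact matching stands in for a real alignment tool that tolerates mismatches.

```python
from typing import Optional, Tuple

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_on_either_strand(query: str, target: str) -> Optional[Tuple[str, int]]:
    """Look for an exact match of query in target, trying both strands.

    Returns (strand, position) for the first match found, or None.
    """
    pos = target.find(query)
    if pos != -1:
        return ("forward", pos)
    pos = target.find(reverse_complement(query))
    if pos != -1:
        return ("reverse", pos)
    return None


# Invented toy reads for illustration only.
target = "ATGGCCTTAGACGT"
query = "CTAAGGCC"

print(reverse_complement(query))
print(find_on_either_strand(query, target))
```

Here the query does not match the target as written, but its reverse complement does, which is the situation described above for Samples 44 and 33.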

Table 7 compares selected results from four of the samples, and there are issues with consistency and a lack of clarity. The table includes a comparison of Exon 3 from amelogenin with no explanation as to why this exon was chosen, and reports “unknown” sequences that align for 43 and 44 even though Table 4 shows that either the PCR or sequencing failed for these two samples. Results from Sample 10 are not included in Table 4 or in Table 2, and should be added for consistency and completeness of the results.

Whole Human Genome SNP analysis

Twenty-four samples were sent for SNP analysis, but which samples were sent is not specified. The authors report poor performance of the samples, but don’t give much information about the starting material, such as concentration of the samples. I appreciate the effort to include intentionally degraded DNA, although I’m not certain the laboratory conditions sufficiently reproduced the environmental and storage conditions that many of the samples underwent. In particular, exposure to sunlight was not included in the intentional degradation protocol. Figure 10 shows a gel with four samples in addition to the degraded human DNA control, and the text mentions that most of the samples showed no evidence of degradation. If we’re to judge by the smearing in the control lane (as the text suggests), Samples 140 and 33 don’t look ideal to me, and since this is 50% of the samples shown on the gel, a gel with all the samples might be more convincing. Based on the degraded DNA control, the authors conclude that even very degraded DNA should show >97% SNP performance. Since none of the samples performed at this level, they seem to conclude the DNA in the samples varies significantly from human. Since at least one sample is likely to be from dog (Sample 10), it would be interesting to run controls from species most likely to be mistaken for Sasquatch to see how they perform in the human SNP analysis. I would also be interested in the opinion of someone who specializes in testing forensic samples. It’s possible that this level of performance is typical of samples collected under conditions similar to those in this study.

Potential DNA Sequence Anomalies

The goal of this section seems to be to show that some samples gave unexpected bands when amplified with the primers for the various loci that were sequenced. This section seems a little misplaced since I would have thought gels of the PCR products would have been run before attempting to sequence. I’m not certain that a picture of a gel is necessary here. It might be better to add a statement regarding which samples gave PCR products of the predicted size and which did not. The gel shown in Figure 11 is for the MC1R locus, and similar to the previous gel picture, is not adequately labeled. It is also somewhat confusing because there are samples for which data are shown in Table 6 that didn’t seem to amplify in Figure 11. The authors should clarify the relationship between the data presented in Table 6 and the bands shown in Figure 11.

Electron Microscopy

I don’t know anything about the use of electron microscopy for characterizing DNA, and while maybe someone got some useful information from this section, I didn’t. The figure accompanying this section (Figure 12) includes potential Sasquatch DNA and degraded human DNA, but doesn’t present a human high-quality DNA control. As far as I know, we would not expect to see any differences at this level between human DNA and DNA from other species, but this is not stated.

Next Generation Whole Genome Sequencing

This is the part highlighted in the title, which needs to be revised since so little is actually presented. I have not personally published a draft genome of anything, but I’ve read several, and I’m pretty sure this isn’t how you’re supposed to do it. It’s possible that there is raw sequence available for three “Sasquatch” genomes, but the genomes haven’t been sequenced and annotated in a meaningful way. I would expect to see statements of how many genes were found, the overall percent identity to the human genome, level of synteny, etc. Basically, does this look like a human genome?

Figures 13, 14, and 15 show the sources of the DNA for Samples 26, 31, and 140, respectively. Figure 13 shows Sample 26, which Table 1 lists as “Skin intact with subcutis and blonde/white with some black hair,” while the caption states that it is “tissue with hair, skin, subcutaneous tissues, and some muscle.” Although my first thought when looking at the picture is, “Ew!” my second thought is that there has to be enough there to be able to determine what species the sample came from. My third thought is that this sample was received anonymously by the submitter (from Sasquatch shooter), which makes this sample more likely to be a hoax since the person who collected it won’t be held accountable. Figure 14 shows Sample 31, which is a paper plate with a piece of sandpaper on it that was apparently used as a food trap. So, we’re getting DNA from what? Saliva? Seems like a long shot. Figure 15 shows a downspout that was apparently chewed by… something. The figure shows blood and plaque/tartar from teeth marks. I would’ve thought the teeth marks might have given some insight into the identity of whatever it is that likes to chew downspouts, but I don’t think this was addressed. Although seeing the sources of the DNA is nice, I don’t find any of it very convincing.

The authors state that they have high-quality full genome sequence (30X coverage) for each of three samples (Samples 26, 31, and 140) with Q30 scores greater than 88, indicating the DNA for each sample was from a single source. Ok, that’s good news as far as I can tell. Looking at the information in Supplementary Data 7, I would like to see some discussion of the “% align” column, especially since I may be interpreting it wrong. It seems this column would be saying that between 0.07 and 2.16% of the clusters that passed the filtering step aligned with the reference genome (hg19). If this is a correct interpretation, it seems like this would argue for the sequence having very little human DNA, and it might not even be appropriate to attempt to align it with the human genome.
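
The arithmetic behind that concern can be sketched as follows. This is only an illustration under stated assumptions: it takes the “% align” column at face value as the share of pass-filter clusters that aligned to hg19 (the interpretation questioned above), it uses an approximate reference size of 3.1 billion bases, and the read count and read length in the example are hypothetical rather than values from Supplementary Data 7.

```python
# Illustrative arithmetic only. HG19_SIZE_BP is an approximation, and the
# read count and read length used below are hypothetical placeholders.
HG19_SIZE_BP = 3.1e9


def mean_coverage(n_reads: float, read_length_bp: float,
                  genome_size_bp: float = HG19_SIZE_BP) -> float:
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def percent_aligned(aligned_clusters: float, pf_clusters: float) -> float:
    """Share of pass-filter clusters that aligned to the reference."""
    if pf_clusters <= 0:
        raise ValueError("pf_clusters must be positive")
    return 100.0 * aligned_clusters / pf_clusters


def aligned_depth(total_coverage: float, pct_aligned: float) -> float:
    """Depth contributed only by the reads that aligned to the reference."""
    return total_coverage * pct_aligned / 100.0


# Hypothetical run: 930 million reads of 100 bp gives roughly 30X overall.
total = mean_coverage(930e6, 100)
print(f"overall coverage: {total:.1f}X")

# The range reported in the "% align" column, read at face value.
for pct in (0.07, 2.16):
    print(f"{pct}% aligned -> {aligned_depth(30, pct):.3f}X of human-aligned depth")
```

If that reading of the column is right, a nominal 30X run would contribute well under 1X of depth that actually aligns to the human reference, which is the reason the column deserves explicit discussion in the manuscript.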

Unlike other draft genome reports I have read, this study gave no general characterization of the entire “Sasquatch” genomes and instead focused on chromosome 11. Why? Because chromosome 11 is the most awesome of the human chromosomes, obviously. No, I’m sure there is some biologically relevant reason to choose this chromosome. Let’s see…Wikipedia says that more than 40% of our olfactory receptor genes are on this chromosome, so I guess this would help test some hypotheses about Sasquatch having a better sense of smell than a human, if such hypotheses exist. What else…oh, there’s porphyria, but I think that’s hypothesized to be associated with vampirism, so not really relevant to this study. Regardless of the potential reason, the authors do not provide a rationale for looking only at chromosome 11, and should have anticipated resentment at claiming to report three genomes if they only looked closely at one chromosome.

Speaking of chromosomes, one thing this paper doesn’t address is the diploid number of Sasquatch chromosomes. I don’t know for sure, but shouldn’t a karyotype have been possible from some of the hair follicles? Or at least from Sample 26 since there was tissue? Maybe it’s just me, but if you’re trying to establish a new biological species, wouldn’t you try to figure out how many chromosomes it has? Just saying.

Back to the analysis of the sequences that seem to align with chromosome 11…the authors generate another set of phylogenetic trees that have the same problems as the ones I mentioned before. These trees were generated in BLAST, which can be used as a first look at the relationships between the query sequences and the sequences brought up in the search, but is not generally used for generating trees that you want to rely on to, say, provide evidence to support the existence of a new species. Figure 16 suggests the unknown sequence is most closely related to sequences from Homo sapiens (presumably human, but that isn’t specified) and Otolemur garnettii (which is actually a galago and not a lemur). Especially since this doesn’t make biological sense without generating some questionable scenarios, I wouldn’t read much of anything into the trees presented in this paper. All phylogenetic analyses need to be redone by someone with expertise in this area.



The first paragraph mentions again the number and type of samples collected, and refers to “a continent-wide team of dedicated collectors.” This implies that collectors were contacted and asked to find and submit new samples for this study, although it isn’t clear whether they submitted previously collected samples or obtained new ones. Figure 3 showing the hair trap and a close-up of Sample 168 has a date on it of 11/26/2011, but the sample was submitted in June 2012 (Table 1). This means the sample was stored for approximately six months prior to submission. The authors need to clarify if collection, handling, and storage guidelines were provided to the collectors, and Table 1 should include collection dates in addition to submission dates, as well as notes regarding collection, handling, and storage conditions.

The second paragraph discusses the screening of hair samples to eliminate hair consistent with human and known wildlife species. I’ll reiterate here that it would benefit the study and increase credibility if Table 1 included all samples submitted and clearly noted those that were excluded because they were determined to belong to a dog, bear, panda, etc. There also needs to be data presented on the hair analysis for each sample. Each sample collected is from a potential Sasquatch and therefore, the samples cannot be combined and treated as though they are from the same individual to establish the existence of Sasquatch. I should have some way to look at any sample number and determine what tests were performed and what the results were, without combing through all the other tables to see whether or not that sample number is included. I still would like to see an expert in hair analysis weigh in on the results and address whether there are environmental (or lab) conditions that might alter human and/or wildlife hair in ways consistent with what was observed in the study. I am currently unconvinced that there isn’t an alternative explanation for the hair analysis results.
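
One way to picture the per-sample lookup requested above is a single record per submitted sample, including the excluded ones. The sketch below is hypothetical throughout: the field names, sample identifiers, and result strings are placeholders chosen for illustration and do not reproduce any entry from the manuscript’s tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SampleRecord:
    """One row per submitted sample (all field names are hypothetical)."""
    sample_id: str
    material: str
    excluded_as: Optional[str] = None  # e.g. a known species, if screened out
    results: Dict[str, str] = field(default_factory=dict)  # test name -> outcome


def build_index(records: List[SampleRecord]) -> Dict[str, SampleRecord]:
    """Index records by sample ID, rejecting duplicate IDs."""
    index: Dict[str, SampleRecord] = {}
    for record in records:
        if record.sample_id in index:
            raise ValueError(f"duplicate sample ID: {record.sample_id}")
        index[record.sample_id] = record
    return index


def summarize(index: Dict[str, SampleRecord], sample_id: str) -> str:
    """Describe every test recorded for one sample, or its exclusion."""
    record = index[sample_id]
    header = f"{record.sample_id} ({record.material})"
    if record.excluded_as is not None:
        return f"{header}: excluded, identified as {record.excluded_as}"
    if not record.results:
        return f"{header}: no test results recorded"
    tests = "; ".join(f"{name}: {outcome}"
                      for name, outcome in sorted(record.results.items()))
    return f"{header}: {tests}"


# Placeholder records, not data from the study.
records = [
    SampleRecord("X1", "hair", results={"mtDNA haplotype": "placeholder",
                                        "hair morphology": "placeholder"}),
    SampleRecord("X2", "hair", excluded_as="dog"),
    SampleRecord("X3", "tissue"),
]
index = build_index(records)
for sample_id in ("X1", "X2", "X3"):
    print(summarize(index, sample_id))
```

Keeping excluded and untested samples in the same structure makes gaps visible: a sample with no recorded results shows up as such instead of silently disappearing from a table.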

The third paragraph reiterates the precautions taken to prevent contamination of the samples by human DNA from collectors, researchers, etc., which I’m actually not as concerned about as the possibility of contamination by a wide variety of DNA during the time that it was, say, stuck to a tree in the wilderness. As far as I can tell, the sequencing methods are designed to give information about human DNA, and it seems there might be some additional tests that would be appropriate for analyzing DNA from an unknown source.

The fourth and fifth paragraphs acknowledge that this study, as well as others, identified human mtDNA haplotypes from “Sasquatch” samples. The authors state that “the possibility of a human male/progenitor female mating cannot be excluded without testing larger numbers of samples.” While this is technically true, I don’t think we can rule out the possibility that the DNA being tested is human DNA.

The sixth paragraph describes some other previous results, including those of a “well-known individual, Zana, and her hybrid son, Khwit.” These samples and one from Siberia all showed human mtDNA haplotypes. This paragraph is not entirely forthcoming about the circumstances of the testing for Zana and Khwit. A rather detailed account of Zana and Khwit can be found in an article online. Yes, I checked PubMed. I didn’t find anything. Zana is supposed to have died in the 1880s or 1890s, and Khwit’s death is listed as 1954. The work carried out at New York University is apparently included in an episode of the National Geographic Channel’s “Is it Real?” entitled “Russian Bigfoot” (Season 2, Episode 13), which aired Nov. 20, 2006. The DNA samples came from skulls that supposedly belonged to Zana and Khwit, and the mtDNA results were consistent with Zana being Khwit’s mother. While the story passed on about Zana describes her as an “Almas” (wildwoman) that was captured and kept in an isolated village in the Caucasus mountains, there isn’t any real evidence that she wasn’t completely human. Seasoned Bigfoot/Sasquatch researchers were probably already familiar with this story, and I’ll admit to not doing a lot of reading on it while writing this review, but rather than Zana being another species, I think the mtDNA results support the hypothesis that she was human, and her children were then 100% human and not hybrids.

The majority of the seventh paragraph is devoted to stating that some samples were tested by other labs and the mtDNA results were consistent with those in this study, i.e., human haplotypes. The authors do need to address a couple of inconsistencies. They claim that results of mtDNA testing for Sample 134 were consistent between the studies, but I didn’t see results given for Sample 134. The authors may be referring to amplification of human cytochrome b sequence from this sample, but this should be clarified since the unpublished paper doesn’t seem to address haplotypes. This unpublished manuscript from Helix Biological Laboratory is referenced as “Supplementary Data 8,” but I think it should be “Supplementary Data 11.”

The next few paragraphs summarize the findings from various DNA analyses, and while I have already mentioned issues I have with interpretation of the results, some are worth mentioning again. First, the authors mention that sample DNA showed novel features not found in “control” samples. However, I am not convinced that the control samples that were intended to show results from degraded DNA were sufficient. The “Whole Human Genome SNP analysis” mentions that samples were “purposely maintained at room temperature in a moist environment for 4 days.” Samples used for validation of the PowerPlex® 16 and PowerPlex® 16 BIO were left for up to three months at 80°C, and often gave only partial profiles. I’m not convinced the “control” samples in this study were comparable to the samples collected and submitted for the project and consequently, I don’t think the authors can convincingly rule out degradation as an explanation for some of the results. The authors also need to be careful when interpreting results from BLAST searches that indicate “no similarity” to sequences in the databases. This doesn’t necessarily mean that the DNA represents sequence from a new hominin. It may be an artifact resulting from any one of the numerous steps in the process. I don’t have the expertise to make much comment on the electron microscopy of the DNA. The reporting of full Sasquatch genomes is significantly underdeveloped and the rationale for choosing to look exclusively at chromosome 11 is not clear. The finding that the DNA seems to contain conserved human genes supports the hypothesis that the DNA is of human origin. The phylogenetic analyses need to be redone, and I wouldn’t draw any conclusions from the trees presented in the manuscript.


I suspect an expert on hair analysis might be able to determine the source of the hair samples and/or provide scenarios under which human or wildlife hair might present with the characteristics observed in this study. The mtDNA was human, but nuclear DNA had “novel structure and sequence.” Any sequence that could not be identified as being similar to something in the databases could warrant further investigation, but in and of itself is not evidence for a new hominin. The “novel mosaic pattern” is probably an artifact, and not a clear indication of hybridization. In the “Breaking Bio” episode discussing this paper, David Winter points out that the pieces are too small to be consistent with the hypothesis of hybridization between a human female and…something. The authors mention an alternative hypothesis that involves Sasquatch being “human in origin, having been isolated in closed breeding populations for thousands of years.” I don’t know whether the number and distribution of mtDNA haplotypes is consistent with that or not. With the number of males included in the study, Y-chromosome typing might provide some insight. The methods section states this was done, but no Y-STR data were presented. This section concludes with, “Nevertheless, the data conclusively proves that the Sasquatch exists as an extant hominin and are a direct maternal descendent of modern humans.”

Um, no. The data do no such thing.

Basically, this study shows that analysis of samples (mostly hair samples) collected under field conditions presents challenges that prevent definitive identification. I don’t know for certain, but I’m guessing a known human sample might show similar results if it had been hanging out in the woods getting exposed to UV light and large temperature fluctuations for any length of time. Here’s the most charitable that I can be: The data do not exclude the possibility that some collected samples came from a previously unidentified species.


I think this section could be a little more specific about exactly what analysis is being carried out on the Sasquatch genomes, especially since the data provided here was pretty minimal. This section mentions testing of hair from a “Siberian Wildman,” but it isn’t clear if the testing is being carried out by this group or by someone else. It’s good to know that a species name has been applied for, although it probably would have been better if they had done that earlier.

I hope the above demonstrates that the manuscript requires significant revision before it would be acceptable for publication. The most significant revisions include a clear presentation of test results from each sample (this can be included in supplementary files), correction of inconsistencies between the tables, new phylogenetic analyses using appropriate methods, and more conservative interpretation of results so that the conclusions are not overstated.


Greenspan, SA et al., 2004. Validation and implementation of the PowerPlex® 16 BIO system STR multiplex for forensic casework. J Forensic Sci. 49(1): 71-80.

Krenke, BE et al., 2002. Validation of a 16-locus fluorescent multiplex system. J Forensic Sci. 47(4):773-785.

One Response to “Catching Up With Dr. Ketchum’s Bigfoot Research”
  1. jayvay says:

    Reblogged this on JayVay and commented:

    We have another doctor in the house! Who knew the way to rope scientists into paranormal research was to wait until they became tenured professors? Let’s hope Dr. Tarr provides us with more of these helpful gems in the months to come….
