Evaluating Records, Sources, and Evidence


Elizabeth E. Bullard
July 25, 2017


This original article may not be reproduced in any manner without the express written consent of the author.


A lot of people think that genealogical research involves dropping a bunch of names into family tree software, loading the data online, and magically being able to show a direct line of descent from Odin (okay... maybe I'm exaggerating just a little bit). Genealogical research is much more than a family tree full of names, though. When you conduct genealogical research, your intention should be to seek information (by searching records) that answers questions about your people... who they were, where they lived, how they lived, what they did, and what legacy they might have left behind. The process of genealogical research seeks facts about specific life events in order to answer specific questions about specific people. The records that you search in pursuit of those facts are what you must evaluate and then cite properly in order to have as complete and accurate a genealogical record as possible. There are many factors to consider when evaluating records, sources, and evidence.

When you're seeking a specific record, I suppose that relevance comes into play first. Let's say that I am looking for John Bullard in 18th-century baptismal records for a particular church in the Boston area and I find half a dozen children by that name (or with similar names; I have to consider every reasonable variation, after all) within a span of a few years. How do I know which is my John Bullard, or whether any of them is my John Bullard? Relevance. I can't know if any of them is my John Bullard without additional supporting information. I need his birth date, his birth place, his parents' names. Without at least some of this additional information, I cannot decide which record is relevant to my research.

Category. Records fall into two categories: original and derivative. Original records tend to be the more accurate of the two, though we must remember that any record can contain errors. Generally, original records were written close to when an event occurred by someone with official and direct knowledge of the event. These records are the very first made of any event. An original record contains information that did not come from anything already spoken or written; a derivative record contains information that is repeated, reproduced, transcribed, abstracted, or summarized from something already spoken or written.

Another consideration is the format of the records that you're searching. I believe that it is safe to say that most genealogists consider any method that accurately preserves the image of an original record as essentially the same thing as an original record. Transcribed, extracted, or abstracted records are another story altogether. In those cases, you have to remember that the data is only as good as the person who wrote it. Typographical errors are a regular problem and, often, the further removed the copy is from the actual document, the more errors there are likely to be.

Three quick definitions:
1. A transcription contains all information in an original document, copied word for word*.
2. An extract is made from very large or very long documents and, while everything that is copied is copied word for word*, there will be some information (sometimes quite a lot of information) on the original document that is not included in the extract.
3. An abstract contains almost no exact copying, if any, but instead only describes what is contained in the original document.

*Even when something is being copied "word for word", errors can be made. It is possible that a transcriber cannot read handwriting, for example. "Word for word" also is not synonymous with "accurate". There were many perceptual and typographical problems with enumerators of census records, for example, because they often spelled things phonetically or simply outright misspelled their data.

Class. Primary information is provided by someone closely associated with an event, usually at or near the time of the event. When Susan records the birth of her child, she is a primary source of information on an original record. Secondary information is recorded later than the event or is recorded by a person who was not associated with the event. When Buddy, a census enumerator, arrives at Susan's house and records her age, he is recording secondary birth information on an original record. Susan is the secondary source of information because she didn't record her own birth. Yes, technically you were present for your own birth, but can you remember it and attest to witnessing that it happened? No. Therefore, you rely on the primary birth information provided by your mother and, when you repeat it later, you are providing secondary birth information.

Usually, primary information is found in original records. This is not always the case, however. When a physician or coroner records the death of a patient, he is a primary source of information on an original record... or is he? Okay, here's where things get squirrelly. Yes, the physician or coroner is a primary source on an original record... but only as far as what he knows or witnesses (the death, cause of death, disposition, etc.). That physician or coroner is going to have to rely on someone else to provide vital information for that death record to be completed and recorded. The reason that we so often find errors in names, birth dates/ages, birth places, etc. on death records is that an informant must be relied upon to provide information about the deceased and that informant often gets it very, very wrong. Generally, an informant is a secondary source of information on an original record, unless the informant is a parent. This issue often means that death records are far less reliable than other records.

Secondary information usually is found in derivative records. You may remember that I mentioned Buddy the census enumerator, who is diligently completing an original census record by compiling secondary information. When he speaks with Susan, however, he discovers that she has a child and she provides him with that child's age. Bam! Buddy now has recorded primary birth information because... you guessed it... Susan was an eyewitness at her child's birth. The problem with my little scenario here is that we researchers don't have any way to be certain about who gave what information to Buddy the census enumerator. Therefore, all census information is considered secondary source information and my example is not a good one. Hopefully, you see the distinction that I was trying to make, though. For obvious reasons, genealogists prefer primary sources, but don't be afraid of secondary sources. Mistakes and deliberately misleading statements are commonplace, even with primary sources, meaning that secondary sources are not always less accurate than primary sources. Trust nothing implicitly.

To sum it up, primary source information usually is found in original records, but it is possible that not all information in an original record is from a primary source. Most derivative records and many printed records contain secondary source information, but not all information on derivative and printed records is secondary. A death record can be both a primary source and a secondary source. Determining the identity of an informant might help you to evaluate the trustworthiness of the information provided on a death record.
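For those who like to see it laid out, here is a minimal Python sketch of the idea that a single record can hold both classes of information. Everything in it is hypothetical and made up for illustration: the `Statement` type, the `information_class` function, and the sample death record. The `witnessed` flag stands in for a judgment that you, the researcher, have to make about whether the person supplying the information had firsthand knowledge of the event.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """One piece of information found on a record (hypothetical model)."""
    fact: str          # what the statement is about, e.g. "birth date"
    provided_by: str   # who supplied the information
    witnessed: bool    # researcher's judgment: firsthand knowledge of the event?


def information_class(statement):
    """Label a statement as primary or secondary information."""
    return "primary" if statement.witnessed else "secondary"


# A made-up death record: the physician witnessed the death,
# but the birth details came from an informant who did not witness the birth.
death_record = [
    Statement("cause of death", "physician", witnessed=True),
    Statement("birth date", "informant (neighbor)", witnessed=False),
]

for s in death_record:
    print(f"{s.fact}: {information_class(s)} (from {s.provided_by})")
```

The point is only that the label belongs to each piece of information, not to the record as a whole.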

Evidence. Each piece of evidence is evaluated individually. Proof is the accumulation of acceptable evidence. Clear and convincing evidence means that any reasonable person would reach the same conclusion. There are two types of evidence to be evaluated. Direct evidence provides a fact that is straightforward. My birth record provides direct statements about when and where I was born and about my parentage, for example. Indirect evidence supports a fact by a preponderance of the evidence. Indirect evidence requires much more evidence in order to prove a fact (just like circumstantial evidence in a court of law). If the 1860 census lists Joe's age as 30, for example, we can infer that he was born in about 1830. If the 1870 census lists Joe's age as 40, we have further evidence that suggests that he was born in about 1830. Hopefully, we can put all of our indirect evidence together when we're done and say with confidence that Joe was born in about 1830, based upon a preponderance of evidence. The trouble comes when every record is different and there is no preponderance of evidence. It happens. Sometimes there simply is not sufficient evidence, no matter how hard we look. Some facts simply are lost to history. That's a difficult pill to swallow for a genealogical researcher, but it's the truth. We simply cannot know everything about everyone. As documents and monuments rapidly decay, we scramble to save what data we can from them, but it cannot all be saved. We look. We look again. We look twenty times more. Don't beat yourself up about it. Sometimes, the real evidence is just gone. In that case, we cite our sources and we make our best guess and we move on. It's the best that we can do.
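The census arithmetic for Joe can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `estimate_birth_year` is made up, the 1880 entry is a hypothetical record added to show a disagreement, and counting how many records agree is a rough stand-in for weighing evidence, not a formal standard of proof.

```python
from collections import Counter


def estimate_birth_year(census_entries):
    """Return (year, share) for the most commonly implied birth year.

    census_entries is a list of (census_year, reported_age) pairs.
    share is the fraction of entries that imply that year.
    """
    # Each census entry implies a birth year of roughly census year minus age.
    implied = [census_year - age for census_year, age in census_entries]
    year, hits = Counter(implied).most_common(1)[0]
    return year, hits / len(implied)


# Joe's 1860 and 1870 entries from the text, plus a hypothetical 1880 entry
# that disagrees with the other two.
joe = [(1860, 30), (1870, 40), (1880, 52)]
year, share = estimate_birth_year(joe)
print(f"born about {year} ({share:.0%} of records agree)")
# prints: born about 1830 (67% of records agree)
```

When every record implies a different year, the share drops toward one record out of many, which is the "no preponderance" situation described above.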

So, there you have it. This is the basis of genealogical research and of evaluating records. Ideally, we will want to find original records with primary source information that provides direct evidence to prove facts. When we have those kinds of records, life is sweet. Unfortunately, those records often are not available. Other times, the records cannot be believed because they contradict every other known fact about an individual. In these instances, we must accumulate evidence by seeking out other sources and evaluating them. Perhaps instead we end up with a stack of derivative records with secondary source information that provides indirect evidence - all of which (cumulatively) point to a single fact. Or, more likely, we have some combination of category, class, and evidence. Yay! One fact down, 20 million to go. Isn't this fun?