Beating OCR Errors: How to Search Old Newspapers When Names Are Misspelled
Master wildcard operators, address-based searching, and keyword proximity to find ancestors hidden by bad microfilm scans and optical character recognition errors.
Digitized historical newspapers are one of the greatest tools in genealogy, but they have a massive technical flaw: OCR (Optical Character Recognition).
When billions of old newspaper pages were digitized, humans did not type out the text. Instead, computers scanned old, scratched microfilm of newspapers that had often been printed on worn-out presses or wrinkled paper. The OCR software does its best to "read" the blurry shapes and convert them into searchable text, but it frequently fails.
For example, a computer scanning a faded 19th-century newspaper might read the name "Smith" as "Srnith," "8mith," or even "Sntith." If you type "Smith" into the search bar, the search engine will skip that article, and may confidently report zero results, although your ancestor's name is printed right on the front page.
To break through this digital brick wall, you have to stop searching like a normal user and start thinking like a database expert. Here are four strategies to outsmart OCR errors.
If a specific letter in your ancestor's name is constantly misread by scanners, remove that letter from the equation entirely.
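The Strategy: Use a wildcard symbol to stand in for the garbled letters in the search bar. On most sites an asterisk (*) matches any run of characters and a question mark (?) matches a single character; check the site's search help for its exact rules.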
The Breakthrough: If your ancestor's last name was "Schumacher," the "u" and the "a" might be smudged in the original ink. Instead of searching the full name, search Sch*m*cher. This tells the database to find any word that starts with "Sch," has an "m" in the middle, and ends with "cher." It can pull up the exact article even when the OCR garbled the letters in between.
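To see what a wildcard search is doing behind the scenes, here is a minimal Python sketch. It assumes the common convention that * matches any run of letters or digits and ? matches exactly one; the function name wildcard_to_regex and the sample sentence are made up for illustration, and real archives may apply different rules.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a search-bar wildcard pattern into a compiled regex.

    Assumed convention (not every archive follows it):
    '*' matches zero or more letters/digits, '?' matches exactly one.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))  # treat every other character literally
    # Word boundaries keep the match to a single whole word.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

# Hypothetical OCR output in which the "u" and the "a" were misread.
ocr_text = "Mr. Schnmacher and Mrs. Schumocher called on Herr Schmidt."

matcher = wildcard_to_regex("Sch*m*cher")
print(matcher.findall(ocr_text))  # ['Schnmacher', 'Schumocher']
```

Both garbled spellings are found, while "Schmidt" is ignored because it does not end in "cher."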
When a name is hopelessly garbled by a bad microfilm scan, target numbers instead. OCR software is generally much better at reading digits than it is at reading complex, blurry serif fonts.
The Strategy: Pull your ancestor's exact street address from our Census & Population Collections and search the newspaper for the house number and street name in quotation marks (e.g., "405 Elm").
The Breakthrough: Local newspapers frequently published the addresses of residents hosting parties, recovering from illnesses, or entertaining out-of-town guests. Finding a society column that reads, "A gathering was held at 405 Elm Street," will lead you right to your ancestor's story, even if the computer completely butchered their surname.
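The address approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page ids, the sample text, and the find_address helper are all hypothetical, and a real archive runs the quoted-phrase search for you.

```python
import re

def find_address(pages: dict[str, str], house_number: str, street: str) -> list[str]:
    """Return ids of pages whose OCR text contains the house number
    immediately followed by the street name."""
    pattern = re.compile(
        r"\b" + re.escape(house_number) + r"\s+" + re.escape(street) + r"\b",
        re.IGNORECASE,
    )
    return [page_id for page_id, text in pages.items() if pattern.search(text)]

# Hypothetical OCR text; note the surname is garbled on the first page.
pages = {
    "1912-06-03-p4": "Mrs. J. Srnith entertained at 405 Elm Street on Tuesday.",
    "1912-06-04-p1": "Fire damaged the store at 1405 Elm avenue.",
    "1912-06-05-p2": "Mr. Smith of Oak street is recovering.",
}

print(find_address(pages, "405", "Elm"))  # ['1912-06-03-p4']
```

The word boundary before the house number is what keeps "1405 Elm" from being mistaken for "405 Elm."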
Sometimes, an OCR error splits a first and last name apart, or inserts random punctuation between them. A standard "Exact Phrase" search will miss these completely.
The Strategy: Use proximity operators to search for a first name appearing near a spouse's name, rather than searching for their exact full names.
The Breakthrough: Instead of searching exactly for "John and Mary Smith," use advanced search settings to look for the word "John" within 5 to 10 words of "Mary." If the OCR misread their last name as "Srnith," the proximity search will still catch the two distinct first names huddled closely together in the same paragraph.
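For readers curious how a proximity search works, the following Python sketch checks whether two words fall within a set number of words of each other. The function name within_n_words and the sample sentence are assumptions for illustration; each archive implements its own proximity operator and syntax.

```python
import re

def within_n_words(text: str, first: str, second: str, max_gap: int = 10) -> bool:
    """True if `first` and `second` appear within `max_gap` words of each other."""
    # Split into lowercase words, discarding stray punctuation the OCR may have inserted.
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9']+", text)]
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_gap
        for i in first_positions
        for j in second_positions
    )

# Hypothetical OCR output: extra punctuation and a misread surname.
text = "Mr. John, and Mrs. Mary Srnith entertained relatives from Ohio."

print("John and Mary Smith" in text)                     # False: exact phrase fails
print(within_n_words(text, "John", "Mary", max_gap=5))   # True: proximity succeeds
```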
If your ancestor's name was printed on a crease in the newspaper, it might be permanently unreadable to the computer. You need to search for the people standing next to them.
The Strategy: Stop searching for your ancestor and start searching for the people they interacted with: the attending physician, the local pastor, the defense attorney, or a known business partner.
The Breakthrough: If you know your ancestor was married on a specific date, but their name is illegible in the wedding announcement, search for the name of the minister who performed the ceremony. Searching for the minister's name on that specific date can drop you directly onto the exact page you need, allowing you to read the announcement with your own eyes.
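The same idea can be expressed as a simple filter: match on the date and the associate's name, and ignore the ancestor's name entirely. The sketch below is a rough illustration; the record layout, page ids, names, and the pages_naming helper are all invented for the example.

```python
from datetime import date

def pages_naming(records: list[dict], associate: str, on: date) -> list[str]:
    """Return ids of pages published on `on` whose OCR text names the associate."""
    return [
        r["page_id"]
        for r in records
        if r["date"] == on and associate.lower() in r["text"].lower()
    ]

# Hypothetical records; the couple's names are garbled, the minister's is not.
records = [
    {"page_id": "gazette-1903-05-14-p3", "date": date(1903, 5, 14),
     "text": "United in marriage by Rev. Thomas Holloway: Mr. J0hn Srnith and Miss Ada Cole."},
    {"page_id": "gazette-1903-05-14-p1", "date": date(1903, 5, 14),
     "text": "The council met Thursday to discuss the new bridge."},
    {"page_id": "gazette-1903-06-02-p2", "date": date(1903, 6, 2),
     "text": "Rev. Thomas Holloway will preach on Sunday."},
]

print(pages_naming(records, "Holloway", date(1903, 5, 14)))  # ['gazette-1903-05-14-p3']
```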
Ready to outsmart the algorithm?
Do not let bad microfilm or scanning errors erase your family's history. Dive into our massive database of historical newspapers, broadsides, and print media to uncover the lost headlines and local news stories about your ancestors.