E-mail List Archives
Re: Fixing OCR issues in PDF with Adobe Acrobat Pro
From: Karen McCall
Date: May 15, 2021 7:07AM
- Next message: Jennison Mark Asuncion: "the 10th GAAD is May 20"
- Previous message: Philip Kiff: "Re: Fixing OCR issues in PDF with Adobe Acrobat Pro"
- Next message in Thread: Philip Kiff: "Re: Fixing OCR issues in PDF with Adobe Acrobat Pro"
- Previous message in Thread: Philip Kiff: "Re: Fixing OCR issues in PDF with Adobe Acrobat Pro"
- View all messages in this Thread
You might be able to use the Edit PDF tools IF you haven't tagged the document yet. If you have, using this tool will destroy all tags, either on that page or in the whole document, and you have to know what is wrong in the text before you can fix it. The Edit PDF view may show you the correct spelling and spacing even though the underlying OCR text is wrong. I never recommend using this tool but offer it as an option if you can get it to do what you want it to do.
The problem with using Actual Text for large pieces of content is that Text-to-Speech tools have to use a different reading mode for images of text and sometimes lose the ability to follow along with highlighting. The same goes for something like ZoomText Fusion: you lose the ability of JAWS to highlight where you are reading. I recommend against using the Actual Text attribute for large pieces of text and entire documents.
I use ABBYY FineReader for any PDF document that I need to OCR. The latest version can even add form controls to the PDF (you have to do some remediation in the Tags Tree in Acrobat, but it is minor). Others use OmniPage Pro. Either can be purchased on sale for a reasonable price, and neither is a subscription.
FineReader has two ways of dealing with scanned documents:
As soon as you open a scanned document the OCR is done and you can resave the document as a searchable PDF without looking at any suspects or issues of spacing between words and characters. I use this when I want to just read a PDF that isn't tagged or is a scan because I can also send the document to Word.
The other tool in FineReader (and OmniPage Pro) is the ability to create an OCR project, open the PDF and access their text editor. I can use JAWS in the text editor so I can hear when words aren't correct, when there are no spaces between words, or when there are spaces between characters. There is a sort of Styles pane where you can add structure; text, images and tables are identified in the document; and I can find and replace optional hyphens. I had a really horrible scan of a book with handwritten notes, doodles and diagrams in the margins and around the text, and within a few hours I had a PDF document that was readable with my screen reader.
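For text that has already been exported from an OCR tool, the find-and-replace step for optional hyphens can be approximated in a script. This is only a minimal sketch, and it assumes the optional hyphens arrive in the exported text as the Unicode soft hyphen character (U+00AD); that may not hold for every tool or export format. The function name `strip_optional_hyphens` is made up for illustration.

```python
from typing import Tuple

# Assumption: "optional hyphens" are exported as Unicode SOFT HYPHEN (U+00AD).
SOFT_HYPHEN = "\u00ad"


def strip_optional_hyphens(text: str) -> Tuple[str, int]:
    """Return the text with soft hyphens removed, plus the number removed."""
    removed = text.count(SOFT_HYPHEN)
    return text.replace(SOFT_HYPHEN, ""), removed


if __name__ == "__main__":
    sample = "reme\u00addiation of a scan\u00adned page"
    cleaned, removed = strip_optional_hyphens(sample)
    print(cleaned)
    print("soft hyphens removed:", removed)
```

Ordinary hyphens (as in "stand-alone") are left untouched, since only the soft hyphen character is replaced.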
I never use the Acrobat OCR for the reason mentioned: I can't rely on it telling "the truth" about what it found and what it missed. I have ended up spending time remediating a scanned PDF only to find that words were wrong, some paragraphs had no spaces between words, and others had spaces between the characters in words. I save time by using one of the stand-alone OCR tools.
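The two spacing failures described above (letters separated by spaces, and words run together with no spaces) can be roughly screened for in text extracted from a PDF before time is spent on tagging. What follows is a heuristic sketch, not a substitute for listening to the text with a screen reader: the function name, the regular expression and the 25-letter threshold are all assumptions chosen for illustration, and the check will both miss real errors and flag some legitimate text.

```python
import re

# Three or more single letters separated by single spaces, e.g. "s c a n n e d".
SPACED_LETTERS = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")

# Hypothetical threshold: tokens with more letters than this are treated as
# possible run-together words. Adjust for the language and subject matter.
MAX_WORD_LENGTH = 25


def find_spacing_suspects(text, max_word_length=MAX_WORD_LENGTH):
    """Return a list of (reason, snippet) pairs for likely OCR spacing errors."""
    suspects = []
    for match in SPACED_LETTERS.finditer(text):
        suspects.append(("spaced-out letters", match.group()))
    for token in text.split():
        letters_only = re.sub(r"[^A-Za-z]", "", token)
        if len(letters_only) > max_word_length:
            suspects.append(("possible run-together words", token))
    return suspects


if __name__ == "__main__":
    page = "The s c a n n e d page hadnospacesbetweenwordsinthisparagraph at all."
    for reason, snippet in find_spacing_suspects(page):
        print(reason + ": " + snippet)
```

A page that produces many suspects is a sign the OCR layer needs to be redone or corrected before any tagging work begins.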
Cheers, Karen