WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Fixing PDF OCR Errors

From: chagnon@pubcom.com
Date: Aug 15, 2018 12:43PM


Adobe has some good tutorials on this.
http://blogs.adobe.com/acrolaw/2016/03/correcting-ocr-errors/

Keep in mind that you're working with two items:
1) the actual scanned image, which is a graphic of the text, and
2) Acrobat's interpretation of the graphical text as live, editable text.

You need to correct the second item, Acrobat's interpretation of the graphical
text. Acrobat will show you its OCR suspects, the places where it wasn't sure
of its interpretation, but it takes a bit more work to correct text that
Acrobat thinks it recognized correctly, such as zeros in place of capital
letter O's.
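
Because Acrobat won't list these as suspects, one way to find them is to search the recognized text yourself. Here is a minimal sketch in Python, assuming the recognized text has already been copied or exported out of the PDF into a plain string. The names find_zero_suspects and sample are made up for illustration and are not part of Acrobat. It only flags candidates for a human to review; codes like "A10" will also show up, and the actual correction still has to be made in Acrobat.

```python
import re

# A token is a run of letters and digits; punctuation and spaces split tokens.
TOKEN = re.compile(r"[A-Za-z0-9]+")
# A zero directly before or after a letter is a likely misread capital O.
ZERO_NEXT_TO_LETTER = re.compile(r"[A-Za-z]0|0[A-Za-z]")


def find_zero_suspects(text):
    """Return (line_number, token) pairs for tokens with a 0 touching a letter."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for token in TOKEN.findall(line):
            if ZERO_NEXT_TO_LETTER.search(token):
                hits.append((line_number, token))
    return hits


# Hypothetical sample text standing in for the OCR output of a scanned memo.
sample = "MEM0RANDUM\nDated August 10, 2018\nT0: All staff"
print(find_zero_suspects(sample))  # [(1, 'MEM0RANDUM'), (3, 'T0')]
```

Pure numbers such as "10" and "2018" are left alone because the zero has no letter beside it.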

-Bevi

- - -
Bevi Chagnon, founder/CEO | <EMAIL REMOVED>
- - -
PubCom: Technologists for Accessible Design + Publishing
consulting . training . development . design . sec. 508 services
Upcoming classes at www.PubCom.com/classes
- - -
Latest blog-newsletter - Accessibility Tips at www.PubCom.com/blog

-----Original Message-----
From: WebAIM-Forum < <EMAIL REMOVED> > On Behalf Of
Joseph Sherman
Sent: Wednesday, August 15, 2018 1:09 PM
To: 'WebAIM Discussion List' < <EMAIL REMOVED> >
Subject: [WebAIM] Fixing PDF OCR Errors

What's the easiest way to fix PDF OCR errors? For example, I have a signed
one-page legal memo that was scanned after it was signed. I ran OCR and Acrobat
says there are no OCR suspects. But in a couple of places the letter O was
recognized as the number 0.

I tried going into the tag tree and changing the actual text and alt text for
that content, which seemed to work when I read it with JAWS. Is there a
different way I should be doing this?


Joseph
