WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: PDF and searchable text for scanned documents

From: Jackson, Derek J
Date: Sep 29, 2020 10:20AM


Hi Steve,

Thanks! I am glad I am not alone. The document was originally just a scan from a Canon scanner, saved as a PDF. We asked someone to remediate the PDF and got this result back, but I don't know what tool they used. It seems like any decent OCR tool would produce a better result. I am not relying on Acrobat's Accessibility Checker either, but I was a little surprised that it did not catch this, and the same goes for PAC3. That is what led me to wonder whether this is actually out of conformance with PDF/UA, or whether it is an example of a document that conforms to the guidelines but demonstrates an instance where the guidelines might fall short. Could someone say this type of document adheres to PDF/UA guidelines?

Best!
Derek

—


On 9/29/20, 11:31 AM, "WebAIM-Forum on behalf of Steve Green" < <EMAIL REMOVED> on behalf of <EMAIL REMOVED> > wrote:

I have encountered this several times, but I do not know what causes it. We use the axesPDF QuickFix tool to view and modify the mapping between the glyphs and the underlying Unicode characters, but we usually only need to fix one or two incorrect mappings. I guess you could go through all the mappings for all the fonts and replace the Unicode characters with the ones you want, but that sounds like a lot of work. There may be other ways to do it more efficiently.
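As a rough illustration of how bad glyph-to-Unicode mappings can sometimes be spotted outside a dedicated tool, here is a minimal Python sketch. It assumes the text has already been extracted from the PDF by some other means (copy and paste, or an extraction tool of your choice). The function names are hypothetical and not part of axesPDF QuickFix or Acrobat. It only flags characters that rarely belong in real text (private-use, unassigned, surrogate and control code points, plus U+FFFD), so it will not catch glyphs mapped to valid but wrong letters.

```python
import unicodedata

# Unicode general categories that rarely belong in real document text:
# Co = private use, Cn = unassigned, Cs = surrogate, Cc = control.
SUSPECT_CATEGORIES = {"Co", "Cn", "Cs", "Cc"}
ALLOWED_CONTROLS = {"\n", "\r", "\t"}  # ordinary line breaks and tabs
REPLACEMENT_CHAR = "\ufffd"  # often emitted when a mapping is missing


def suspicious_characters(text):
    """Return (index, code point, category) for each suspect character.

    Hypothetical helper: `text` is assumed to be text already extracted
    from the PDF.
    """
    findings = []
    for index, char in enumerate(text):
        if char in ALLOWED_CONTROLS:
            continue
        category = unicodedata.category(char)
        if char == REPLACEMENT_CHAR or category in SUSPECT_CATEGORIES:
            findings.append((index, f"U+{ord(char):04X}", category))
    return findings


def suspect_ratio(text):
    """Fraction of non-whitespace characters that look wrongly mapped."""
    visible = [c for c in text if not c.isspace()]
    if not visible:
        return 0.0
    return len(suspicious_characters("".join(visible))) / len(visible)
```

A high ratio on a page suggests the font's mappings are broken wholesale rather than in one or two places, which may help decide whether fixing individual mappings is worthwhile or whether re-running OCR is the better route.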

Remember that Acrobat's Accessibility Check is only doing a very small number of very simple tests. Passing the test tells you almost nothing about the document's accessibility, other than it is probably not as terrible as it might have been.

What application was the document authored in?

Steve Green
Managing Director
Test Partners Ltd