WebAIM - Web Accessibility In Mind

E-mail List Archives

PDF and searchable text for scanned documents

for

From: Jackson, Derek J
Date: Sep 29, 2020 8:23AM


Hello,

I have a remediated scanned document and it passes Adobe's Accessibility Check and PAC3. However the underlying text does not correspond to the visible text. For example the content container for a paragraph contains text like " =X's6- H -R, $E F I A*'a" that corresponds to an area on the PDF that is unrelated to the paragraph. However all of the paragraph tags use the "Actual Text" field to provide the actual text of the paragraph. The consequence is that a screen reader will read the paragraph correctly but the document is not searchable, and copy and paste is not practical. So I am wondering if this is an instance where we have a document that meets the accessibility requirements but still it is not functionally accessible or is there something in PDF/UA that addresses this issue? I have looked through the PDF/UA spec and am not seeing anything but I readily admit that some of the technical jargon and details are beyond me.

Thanks for the continued help!
Derek

—

Derek Jackson

Digital Accessibility Developer | Digital Accessibility Services
Harvard University Information Technology
1430 Massachusetts Ave, 4th Floor
Cambridge, MA 02138

he/him/his