WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Semantics for Indicating Accessible Version of Files


From: Cliff Tyllick
Date: Jan 28, 2009 10:15AM

Yeah, but if I already have an accessible version of the original electronic document that was used to print the copies that were signed, why can't I just scan the signed copy (or even just its signature page) and marry that image with the original text layer?

Then there would be no concerns with the accuracy of OCR. I could just tag the signatures and any other new marks with appropriate alt text.

In other words, perhaps an option under OCR could be "Copy text layer from..."

Or the main function could be "Add text layer..." and the two options beneath it could be "Run OCR" and "Copy from..."

It sounds like you're saying there wouldn't be a significant barrier to making this possible. If it were possible, then the abilities of the software would meet the needs of the workplace. No one would have to review the scanned + OCR version of a 150-page contract to ensure that the OCR hadn't mistaken a one for an "ell" or a zero for an "oh" anywhere within it.
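
For anyone who does have both the original text and the OCR output on hand, the review described above can at least be narrowed down by a character-level comparison. What follows is only a minimal sketch using Python's standard `difflib` module. The function name `ocr_mismatches` and the sample strings are made up for illustration, and it assumes both versions have already been extracted as plain text, which no particular PDF tool is claimed to do here.

```python
from difflib import SequenceMatcher


def ocr_mismatches(original: str, ocr: str) -> list[tuple[int, str, str]]:
    """Return (offset in original, original text, OCR text) for each difference."""
    # autojunk=False keeps difflib from ignoring frequent characters in long texts.
    matcher = SequenceMatcher(None, original, ocr, autojunk=False)
    return [
        (i1, original[i1:i2], ocr[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sample: the OCR read a one as an "ell" and a zero as an "oh".
original = "Payment of 100 dollars is due in 10 days."
ocr_text = "Payment of l00 dollars is due in 1O days."

for offset, expected, found in ocr_mismatches(original, ocr_text):
    print(f"offset {offset}: expected {expected!r}, OCR read {found!r}")
```

A comparison like this only points at where the two texts differ. Someone would still have to decide which version is right, which is exactly the work that copying the text layer from the original would avoid.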

>>> "Moore, Michael" < <EMAIL REMOVED> > 1/28/2009 10:33 AM >>>
You know what would be a really great answer? To be able to marry the
image layer of one PDF with the text layer of another. That way the
accessible text layer could be added to the signed image layer, and we
could post just one file. But I'm not a programmer, so I have no idea
what it would entail to make that possible.



The scenario that you describe is essentially what happens when you use
Adobe Acrobat Professional to create an accessible document from a
scanned version. You can run OCR over the original document and generate
a tagged PDF. It is usually necessary to perform a bit of cleanup on the
converted document because the OCR is not 100% accurate. Images,
signatures, and other items will also need to be tagged as graphics or
artifacts. If tagged as graphics, they will require alternative text.
Artifacts act like background images in HTML.