WebAIM - Web Accessibility In Mind

E-mail List Archives

Does pdfGoHTML not recognize Actual Text?

From: Jonathan Metz
Date: Jul 18, 2013 8:46AM


I feel like an idiot for not having run pdfGoHTML sooner. At least that
way I’d have known that the first page was somehow getting hidden from
everything! I ran it and it wasn’t picking up the two culprit paragraphs.
Redoing the OCR proved successful, to a point.

It appears as though pdfGoHTML isn’t picking up the “Actual Text” of
the tag. I tried multiple approaches. If I use Actual Text at all, the
content is completely hidden in the conversion. I’ve gone so far as to
test what would happen if I just made some of the text an image and
applied the Actual Text. However, if I use Alternate text, it shows up
in the conversion.

At this point I’m just trying to work out whether the problem is in
pdfGoHTML or whether the file is still screwy. I tried to see if there
was a feature request option on Callas, but I couldn’t figure out where
to look for that regarding free software. It would be cool if it replaced
the erroneous content that is tagged with Actual Text with the actual
text that’s supposed to be read. Of course, it might already do that, and
I’ve just got a bad file regardless.

Any thoughts?

Jonathan