WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Does pdfGoHTML not recognize Actual Text?

From: Duff Johnson
Date: Jul 18, 2013 8:57AM


> I feel like an idiot for not having done pdfGoHTML sooner.

:)

> At least that
> way I’d know that the first page was somehow getting hidden from
> everything!

It's great for a "quick sanity check" on the file.

> I ran it and it wasn’t picking up the two culprit paragraphs.
> Redoing the OCR proved successful, to a point.

> It appears as though pdfGoHTML isn’t picking up the “Actual Text” of
> the tag. I tried multiple approaches. If I use Actual Text at all, the
> content is completely hidden. I’ve gone so far as to test what would
> happen if I just made some of the text an image and applied the Actual
> Text. However, if I use Alternate text, it shows up in the conversion.

What happened when you used Acrobat to export the file to HTML, as Olaf had suggested? Compare that result with pdfGoHTML's output; that should help show whether the problem is in the tool or in the file.
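
If eyeballing the two HTML files gets tedious, the comparison can be scripted. What follows is only a rough sketch using Python's standard library, not anything that ships with Acrobat or pdfGoHTML: the helper names (`visible_text`, `phrase_report`) and the sample HTML strings are made up for illustration, and it assumes you have already saved both exports and know a phrase that should appear in the problem paragraphs.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes and alt attributes, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        # Alternate text often ends up in an alt attribute, so count it as text too.
        for name, value in attrs:
            if name == "alt" and value:
                self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the document's text with runs of whitespace collapsed to one space."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def phrase_report(phrase, exports):
    """Map each export name to True/False: does its text contain the phrase?"""
    needle = " ".join(phrase.split()).lower()
    return {name: needle in visible_text(html).lower()
            for name, html in exports.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins; in practice, read the two saved export files instead.
    exports = {
        "acrobat": "<html><body><p>Annual Report 2013</p><p>Intro</p></body></html>",
        "pdfgohtml": "<html><body><p>Intro</p></body></html>",
    }
    print(phrase_report("Annual Report 2013", exports))
```

A phrase missing from both exports points at the file; a phrase missing from only one points at that tool. Text that is split across inline tags may not match, so pick a phrase from plain running text.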

> At this point I’m just trying to deduce if pdfGoHTML is having this
> problem, or if the file is still screwy.

Try the test I mentioned above.

Duff.