E-mail List Archives
Re: Does pdfGoHTML not recognize Actual Text?
From: Duff Johnson
Date: Jul 18, 2013 8:57AM
- Next message: Olaf Drümmer: "Re: Does pdfGoHTML not recognize Actual Text?"
- Previous message: Jonathan Metz: "Does pdfGoHTML not recognize Actual Text?"
- Next message in Thread: Olaf Drümmer: "Re: Does pdfGoHTML not recognize Actual Text?"
- Previous message in Thread: Jonathan Metz: "Does pdfGoHTML not recognize Actual Text?"
- View all messages in this Thread
> I feel like an idiot for not having done pdfGoHTML sooner.
:)
> At least that
> way I'd know that the first page was somehow getting hidden from
> everything!
It's great for a "quick sanity check" on the file.
> I ran it and it wasn't picking up the two culprit paragraphs.
> Redoing the OCR proved successful, to a point.
> It appears as though pdfGoHTML isn't picking up the Actual Text of
> the tag. I tried multiple approaches. If I use Actual Text at all, the
> content is completely hidden. I've gone so far as to test what would
> happen if I just made some of the text an image and applied the Actual
> Text. However, if I use Alternate text, it shows up in the conversion.
What happened when you used Acrobat to export the file to HTML, as Olaf had suggested? Compare that result to pdfGoHTML...
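If eyeballing the two exports is tedious, the comparison can be roughed out in a few lines of Python. This is only a sketch using the standard library; the function names are made up for illustration, and it assumes both tools produce ordinary HTML whose visible text can be compared chunk by chunk. It does not look at `alt` attributes, only at text content.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text chunks of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace so layout differences do not show up as changes.
        text = " ".join(data.split())
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the list of non-empty text chunks found in the HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def missing_from(reference_html, candidate_html):
    """Return text chunks present in the reference but absent from the candidate."""
    ref = visible_text(reference_html)
    cand = visible_text(candidate_html)
    # ndiff prefixes lines found only in the first sequence with "- ".
    return [line[2:] for line in difflib.ndiff(ref, cand) if line.startswith("- ")]
```

To use it, read the Acrobat export and the pdfGoHTML export into strings and call `missing_from(acrobat_html, pdfgohtml_html)`. Any paragraph that shows up in the result is in Acrobat's output but not in pdfGoHTML's, which points at the converter; if the paragraph is missing from both, the file itself is the more likely suspect.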
> At this point I'm just trying to deduce if pdfGoHTML is having this
> problem, or if the file is still screwy.
Try the test I mentioned above.
Duff.