WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

Number of posts in this thread: 7 (In chronological order)

From: Brandon Keith Biggs
Date: Sun, Mar 31 2019 8:49AM
Subject: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

Hello,
Does anyone know of an HTML/CSS-to-PDF engine that produces a properly
tagged PDF?
One would think all the engines would do this, but the sample PDFs on the
demo pages for:
WeasyPrint <https://weasyprint.org/samples/> and
Prince <https://www.princexml.com/samples/>
are not tagged.
Thanks,

Brandon Keith Biggs <http://brandonkeithbiggs.com/>

From: Philip Kiff
Date: Sun, Mar 31 2019 11:55AM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

I'm not sure if there are any open source engines that currently do
this. I'd like to know if there are.

There are a couple of companies that sell products that do this. One
example is iText:
https://itextpdf.com/

I have not personally used this product, but I have seen it used
successfully (along with careful scripting) to convert simple CSS + HTML
into a (mostly) properly tagged PDF/UA.

Phil.


From: Philip Kiff
Date: Sun, Mar 31 2019 12:06PM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

I believe OpenPDF is an open source fork of iText, started a few years
ago. I don't know how well it converts HTML to tagged PDF, or how cleanly
it does it, but it may be worth investigating:
https://github.com/LibrePDF/OpenPDF/


From: Detlev Fischer
Date: Mon, Apr 01 2019 8:45AM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

We use PDFlib https://www.pdflib.com/ - not sure how good it is, though.
Detlev

--
Detlev Fischer
Testkreis
Werderstr. 34, 20144 Hamburg

Mobil +49 (0)157 57 57 57 45

http://www.testkreis.de
Consulting, testing, and training for accessible websites

From: chagnon
Date: Mon, Apr 01 2019 9:13AM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

First, PDFlib makes excellent PDF tools.
Second, no automated tool can make a perfectly accessible PDF. A trained human still needs to review and test the PDF.

Any automated tool can tag content in a PDF.
The problems arise when the tags need to be assessed: are they the correct tags for the content (such as all P tags without any heading tags)? Are they in a logical reading order? Only a human can verify these items.
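As a rough illustration of where automation stops, here is a minimal Python sketch of the kind of check a tool can run on its own. It assumes the tag names have already been extracted from the PDF into a flat list in reading order; the function name `heading_issues` and that input format are hypothetical, not part of any PDF library. It can flag a file tagged entirely with P, or a skipped heading level, but it cannot say which paragraph should have been a heading or whether the reading order makes sense.

```python
import re

HEADING = re.compile(r"H([1-6])")

def heading_issues(tags):
    """Return warnings for a flat list of structure tag names, e.g. ["H1", "P"].

    Assumption: `tags` was extracted from the PDF by some other tool.
    """
    levels = []
    for tag in tags:
        match = HEADING.fullmatch(tag)
        if match:
            levels.append(int(match.group(1)))

    if not levels:
        # The "all P tags without any heading tags" case.
        return ["no heading tags found"]

    issues = []
    previous = levels[0]
    for level in levels[1:]:
        # Going deeper by more than one level (H1 then H3) is suspicious.
        if level > previous + 1:
            issues.append(f"heading level skips from H{previous} to H{level}")
        previous = level
    return issues


print(heading_issues(["P", "P", "P"]))
print(heading_issues(["H1", "P", "H3", "P"]))
```

A clean result from a check like this only means the headings are structurally plausible. Whether they are the correct tags for the content is still a question for a human reviewer.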

Artificial intelligence isn't very developed or intelligent at this stage of the industry, and every automated tool uses AI to autotag a file.

--Bevi Chagnon
— — —
Bevi Chagnon, founder/CEO | = EMAIL ADDRESS REMOVED =
— — —
PubCom: Technologists for Accessible Design + Publishing
consulting | training | development | design | sec. 508 services
Upcoming classes at www.PubCom.com/classes
— — —
Latest blog-newsletter – Accessibility Tips at www.PubCom.com/blog

From: Brandon Keith Biggs
Date: Mon, Apr 01 2019 9:19AM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

Hello,
If the HTML is properly tagged, why is there any question about the PDF?
Isn't it almost a direct conversion? I've never programmed PDF, but from
what I've seen with my screen reader, PDF tags and HTML tags are almost
identical.
Thanks,

Brandon Keith Biggs <http://brandonkeithbiggs.com/>



From: chagnon
Date: Mon, Apr 01 2019 10:53AM
Subject: Re: HTML/CSS-to-PDF-engines that Produce Tagged PDFs

Brandon wrote:
"PDF tags and HTML tags are almost identical."

There are significant differences between HTML and PDF tags. Items like lists, tables, footnotes, indexes, tables of contents, and many others either don't exist in HTML at all or are vastly different.
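To make the list case concrete, here is a small Python sketch of how one HTML list might map onto PDF structure elements. The element names (L, LI, Lbl, LBody, Note, TOC, TOCI, Index) are standard structure types from the PDF 1.7 specification; everything else, including the tuple representation and the function name `list_to_pdf_structure`, is a hypothetical illustration and not the API of any converter mentioned in this thread.

```python
# PDF 1.7 standard structure types that have no dedicated HTML element.
PDF_ONLY_TYPES = ["Lbl", "LBody", "Note", "TOC", "TOCI", "Index"]

def list_to_pdf_structure(items, ordered=False):
    """Build a nested (tag, children) tuple for a simple one-level list.

    Assumption: `items` holds the plain text of each HTML <li>.
    HTML leaves the bullet or number to CSS; in a tagged PDF the label
    is real page content and gets its own Lbl element.
    """
    children = []
    for number, text in enumerate(items, start=1):
        label = f"{number}." if ordered else "\u2022"
        children.append(("LI", [("Lbl", label), ("LBody", text)]))
    return ("L", children)


print(list_to_pdf_structure(["Apples", "Pears"], ordered=True))
```

Each single `<li>` becomes three structure elements here (LI, Lbl, LBody), which is one reason a converter cannot simply rename HTML tags one for one.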

Bottom line: a PDF isn't anything like an HTML webpage, and neither are PDF tags.

—Bevi