WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Does PDF.JS render accessible PDF experience?

Number of posts in this thread: 14 (In chronological order)

From: Birkir R. Gunnarsson
Date: Tue, Jun 18 2024 5:51AM
Subject: Does PDF.JS render accessible PDF experience?

I'm working with a group that is considering using this solution for PDF
files:
https://github.com/mozilla/pdf.js#online-demo


I'm curious if anyone has ever worked with it, i.e. fed it an accessible
(fully tagged) PDF file and verified that it renders an accessible screen
reader user experience.
I've seen a couple of examples that look less than accessible, but I found
those in the wild and have not verified the PDF files themselves.
Ultimately I'll have to build this and try it out, but if someone already
has experience with it, I always appreciate knowing what to expect. ;)
Thanks
-Birkir
--
Work hard. Have fun. Make history.

From: Jon Metz
Date: Tue, Jun 18 2024 11:09AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Hi Birkir!

PDF.js parses a PDF using the HTML5 Canvas element. It does not look at the tag structure of the PDF (probably for security purposes). I believe that if you also download the PDF from the browser, it is the same as 'printing' it from the browser, so it's not the same file that was properly tagged.

Here's an article that goes into better detail about how pdf.js works:

https://pdfjs.express/blog/how-pdf-js-works

HTH!
Jon Metz

From: Duff Johnson
Date: Tue, Jun 18 2024 11:18AM
Subject: Re: Does PDF.JS render accessible PDF experience?

> On Jun 18, 2024, at 13:09, Jon Metz < = EMAIL ADDRESS REMOVED = > wrote:
>
> Hi Birkir!
>
> PDF.js parses a PDF using the HTML5 Canvas element. It does not look at the tag structure of the PDF (probably for security purposes).

That's interesting; I'm not aware of any security reason to ignore PDF tags.

If it's true that PDF.js ignores tags then, ipso facto, PDF.js cannot be used where accessibility is required.

Duff.


From: Jon Metz
Date: Tue, Jun 18 2024 12:10PM
Subject: Re: Does PDF.JS render accessible PDF experience?

Well, I don't know if it's for security reasons. I just assumed "security" was why Adobe limits access to the structure in their Reader app. Or any useful third-party plug-in, for that matter.

But you'd probably know better than I would! :)

-Jon

From: Duff Johnson
Date: Tue, Jun 18 2024 12:27PM
Subject: Re: Does PDF.JS render accessible PDF experience?

> On Jun 18, 2024, at 14:10, Jon Metz < = EMAIL ADDRESS REMOVED = > wrote:
>
> Well, I don't know if it's for security reasons. I just assumed "security" was why Adobe limits access to the structure in their Reader app.

So far as I know Adobe doesn't limit access to the structure in any way…

Duff.


From: Malthe Jepsen
Date: Wed, Jun 19 2024 3:15AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Hi,
According to this Bugzilla page, pdf.js does indeed support tagged PDF. It's the default viewer in Firefox, and after a cursory inspection of a tagged PDF in Firefox with a screen reader, it seems to render well.
https://bugzilla.mozilla.org/show_bug.cgi?id=861157

Best
Malthe
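
For anyone who wants to go beyond a cursory inspection, here is a minimal sketch of one way to check what structure is being picked up. It assumes pdf.js's `getStructTree()` method on a page object, which, as far as I know, resolves to a tree of nodes that each carry a `role` and `children`. The interfaces and `sampleTree` below are local, hypothetical stand-ins (not taken from this thread) so that the snippet runs without the library; in real use the tree would come from the viewer. The sketch only counts structure roles, which gives a rough indication of whether headings, tables, and figures from the tagged PDF are visible to the viewer at all.

```typescript
// Local, simplified mirror of the node shapes returned by getStructTree().
// Declared here so the sketch runs without pdfjs-dist installed.
interface StructTreeContent {
  type: string; // e.g. "content" or "object"
  id?: string;
}

interface StructTreeNode {
  role: string; // e.g. "Root", "H1", "P", "Table", "Figure"
  children: Array<StructTreeNode | StructTreeContent>;
}

// Distinguish structure elements from leaf content references.
function isStructNode(
  n: StructTreeNode | StructTreeContent
): n is StructTreeNode {
  return "role" in n && "children" in n;
}

// Count how often each structure role occurs in a node and its descendants.
function countRoles(
  node: StructTreeNode,
  counts: Record<string, number> = {}
): Record<string, number> {
  counts[node.role] = (counts[node.role] || 0) + 1;
  for (const child of node.children) {
    if (isStructNode(child)) {
      countRoles(child, counts);
    }
  }
  return counts;
}

// Hypothetical sample standing in for the result of page.getStructTree().
const sampleTree: StructTreeNode = {
  role: "Root",
  children: [
    { role: "H1", children: [{ type: "content", id: "mc0" }] },
    { role: "P", children: [{ type: "content", id: "mc1" }] },
    { role: "P", children: [{ type: "content", id: "mc2" }] },
  ],
};

const roleCounts = countRoles(sampleTree);
console.log(roleCounts);
```

A tree with nothing under `Root`, or no tree at all, would suggest that the file is untagged or that the tags are not being read. It says nothing about how a given screen reader presents the result, so it complements rather than replaces testing with assistive technology.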

From: Philip Kiff
Date: Wed, Jun 19 2024 7:56AM
Subject: Re: Does PDF.JS render accessible PDF experience?

I think Malthe is right that pdf.js generally tries to support tagged PDF.

My understanding is that the Mozilla team have been actively working on
improving that support over the past year or two. For example, I know
that last year they worked on trying to maintain the tag order and
structures after you "edit" (more properly annotate) a PDF through the
Firefox browser. And their current issue queue includes a variety of
issues that relate to proper rendering of PDF features required for
accessible PDFs - for example, proper processing of merged table header
cells in tables with more than one row header
(https://github.com/mozilla/pdf.js/issues/18090).

The level of tag support in pdf.js is an active, moving target. But my
impression is that folks who use screen readers still usually prefer to
open PDFs in external PDF reader software?

It would be interesting to see someone do a full comparison of the level
of accessibility support provided by Firefox vs Chrome vs Acrobat.

Phil.

Philip Kiff
D4K Communications


From: Steve Green
Date: Wed, Jun 19 2024 8:30AM
Subject: Re: Does PDF.JS render accessible PDF experience?

It's funny you should mention editing PDFs in Firefox, because Mozilla sent out an email this morning with the subject line "All you need to edit your PDFs". It talks about how you can use Firefox to add text and "you can add images with alt text for accessibility".

I upgraded to the latest version of Firefox and tried it, and it's nothing short of disastrous. The fact that the "Try it now" and "Read more" buttons in the email don't work makes me wonder if it's a draft that they sent out by accident.

I edited a PDF we had already made accessible and found the following:

• Editing in Firefox removed all the tags.
• It added all the new content in the Comments panel.
• You can't set the size of a text frame and the text doesn't wrap, so it flows off screen as you type. However, when you save the document, the text frame gets moved to the extreme left and the font size gets reduced to (almost) fit the page width.
• When you add an image, you are prompted to add Alternate Text. However, as far as I can tell, you never see it again – it's not in the Properties or anywhere else. NVDA can't even find any images in the document.

That said, Firefox is by far the best browser for reading PDFs with a screen reader because it recognises all the tags (as far as I can tell). By contrast, Chrome and Edge are terrible. They ignore the Tags panel and do not expose any semantics at all. They guess what the headings are, apparently based on font size, but every heading is level 2.

I am on an email forum for screen reader users, and there is absolutely no awareness of the level of accessibility support provided by different PDF readers. The topic comes up pretty much every week, and it's clear that people just use whatever the default reader turns out to be, and it's almost always a browser because they do everything they can to hijack the PDF file association. Almost no one makes a conscious choice to use a particular application. Worse still, there are all sorts of old wives’ tales about which reader application is best, almost all of which are wrong.

Steve Green
Managing Director
Test Partners Ltd

From: Nick Bromley
Date: Wed, Jun 19 2024 8:58AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Hi Steve,

I'm interested in the forum for screen reader users you mentioned - is this a closed group/by invite only, or is there any joining info you could share?


Kind regards,

Nick
----
Director & Accessibility Consultant
Red Kite Digital Accessibility Ltd


From: Philip Kiff
Date: Wed, Jun 19 2024 9:09AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Ha! Your mileage definitely will vary if you try to "edit" a PDF using
Firefox!

I wish Mozilla wouldn't even use the word "edit" in their marketing
materials since it's not really an editor. You're not really editing the
PDF file, so much as adding annotations on top of it, which is why they
can end up in the comments. Having said that, I tested a version of it a
few months ago and it did allow an image to be added (to the top layer
of the page) and it did allow the alternative text to be passed through
to a screen reader, and it inserted the figure tag at the end of the tag
tree for that page. I'm surprised it works at all, given the complexity
of the PDF file structure, and I'm interested to see how/if it continues
to develop.

Thanks for the additional insight into screen reader experience and
awareness. The tendency of browsers to take over PDF rendering has been
frustrating for all users, I think, and it has created new challenges to
achieving an accessible PDF experience. So many different rendering
experiences and different tag and form support.

Phil.


From: Steve Green
Date: Wed, Jun 19 2024 9:25AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Hi Nick,

I just replied to you off-list.

Steve

From: Steve Green
Date: Wed, Jun 19 2024 9:39AM
Subject: Re: Does PDF.JS render accessible PDF experience?

I'd love to know how you managed to add an image with alt text. I was expecting to be able to do that, but I can't find a way to do so. When I autotag the document after adding an image, the image itself is not visible in the Tags or Content panel. Instead, it appears as a Stamp-OBJR inside an <Annot> tag.

Steve

From: Duff Johnson
Date: Wed, Jun 19 2024 9:55AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Hi Steve,

Thanks for this great report.

Related: this recent article on pdfa.org by our CTO, Peter Wyatt, provides a comparison of how browsers handle PDF's Fragment Identifiers feature… and their support for basic PDF navigation features in general.

https://pdfa.org/pdf-fragment-identifiers/

Sadly, although all browsers support some Fragment Identifiers, none support the ability to target a structure element (tag).

Duff.
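
To make the idea concrete, here is a minimal sketch of how links with PDF fragment identifiers might be built. The parameter names `page`, `nameddest`, and `structelem` are written from my recollection of the PDF fragment identifier syntax, so treat them as assumptions and check them against the article linked above. The helper name `withPdfFragment` and the example URL are hypothetical. Whether a given viewer honours any of these is exactly the open question; the sketch only shows the link syntax.

```typescript
// The three kinds of target this sketch covers (an illustrative subset).
type PdfTarget =
  | { kind: "page"; page: number }
  | { kind: "nameddest"; name: string }
  | { kind: "structelem"; id: string };

// Return pdfUrl with any existing fragment replaced by one for the target.
function withPdfFragment(pdfUrl: string, target: PdfTarget): string {
  const hash = pdfUrl.indexOf("#");
  const base = hash === -1 ? pdfUrl : pdfUrl.slice(0, hash);

  if (target.kind === "page") {
    return base + "#page=" + String(target.page);
  }
  if (target.kind === "nameddest") {
    return base + "#nameddest=" + encodeURIComponent(target.name);
  }
  // Targeting a structure element (tag) by its ID.
  return base + "#structelem=" + encodeURIComponent(target.id);
}

// Hypothetical document URL, used only for illustration.
const reportUrl = "https://example.org/report.pdf";

console.log(withPdfFragment(reportUrl, { kind: "page", page: 3 }));
console.log(withPdfFragment(reportUrl, { kind: "nameddest", name: "Chapter 2" }));
console.log(withPdfFragment(reportUrl, { kind: "structelem", id: "table-1" }));
```

A link of the third kind is the one that would let a web page point directly at a tagged heading or table, which is why the lack of browser support matters for accessibility.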


From: Philip Kiff
Date: Wed, Jun 19 2024 11:32AM
Subject: Re: Does PDF.JS render accessible PDF experience?

Mmmm... re-testing now, and I don't seem to be able to reproduce what I
said I did. I guess it must have been a personal hallucination (or
misremembering after too many duplicate edits). It looks like you're
right and that none of the Firefox edits get transferred over to the tag
tree or regular content containers: everything remains in annotations.
Which I assume is why the alternative text doesn't appear anywhere,
since there's no provision for it in annotations.

Drats.

Phil.
