WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Is it True: No PDFs generated from LaTeX are Accessibly Tagged?

Number of posts in this thread: 2 (In chronological order)

From: Brandon Keith Biggs
Date: Sun, Apr 22 2018 4:12AM
Subject: Is it True: No PDFs generated from LaTeX are Accessibly Tagged?

Hello,

I think I just stumbled over one of the most egregious causes of
inaccessible PDFs on the web and I hope I'm missing something:

Pdflatex does not generate accessible PDFs.

This means that every PDF generated using the LaTeX compilers is not
properly tagged. It also means that by default, all the PDFs from pandoc
are not properly tagged. This has staggering implications in the academic
community where most PDFs are created with LaTeX.

I really hope I am wrong, but I have been using both pandoc and pdflatex
directly for the last day, trying to get one heading to show. It seems to
be impossible.
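
For anyone who wants to check their own files, here is a rough sketch of a first-pass test for tagging. It only looks for two markers in the raw bytes of the file: a /StructTreeRoot entry and /Marked true inside /MarkInfo. The function names are made up for illustration. The check is a heuristic under stated assumptions: it can miss markers stored inside compressed object streams, and finding them says nothing about whether the tags are actually correct.

```python
import re
import sys


def tagging_hints(data: bytes) -> dict:
    """Report raw-byte hints that a PDF declares a structure tree.

    Heuristic only: markers inside compressed object streams are not
    visible to a plain byte search, so a False here is not proof that
    the file is untagged.
    """
    return {
        "struct_tree_root": b"/StructTreeRoot" in data,
        "marked": re.search(rb"/Marked\s+true", data) is not None,
    }


def looks_tagged(data: bytes) -> bool:
    """True only when both markers appear in the raw bytes."""
    hints = tagging_hints(data)
    return hints["struct_tree_root"] and hints["marked"]


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_tags.py paper1.pdf paper2.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, "declares tagging:", looks_tagged(fh.read()))
```

A file that fails this check may still be tagged, and a file that passes still needs to be tested with a screen reader or a dedicated accessibility checker.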

Before I go and make issues on the different LaTeX distribution sites, is
there anyone who knows more about this? If it is true that there is no way
to create accessibly tagged PDFs from pdflatex, I would love either a guide
that describes the code required for proper tagging of PDFs or someone
knowledgeable to comment on the issues that I open.

Here are a few sources that make me fear this problem has not been
addressed yet:

https://umij.wordpress.com/2016/08/11/the-sad-state-of-pdf-accessibility-of-latex-documents/



https://tex.stackexchange.com/questions/261537/a-guide-on-how-to-produce-accessible-pdf-files

http://tug.org/pipermail/accessibility/2016q4/000005.html

https://chi2014.acm.org/authors/generate-a-tagged-pdf#LaTeX



A source that shows the staggering problem this has on the academic
community:

https://www.cs.cmu.edu/~jbigham/pubs/pdfs/2015/accessibleconferences.pdf



Just to speak to the implications this has had on my life:

Over the last 4 months I have been doing a literature review. I have saved
64 articles in PDF. 21 were completely unreadable, so I had to OCR them
with Kurzweil 1000 (a $1000 piece of technology, and I still couldn't read
any tables or math). Out of the others, only 4 were properly tagged.

None of the PDFs were scanned; they all had the text there, but
itjustlookedlikethis.

This means that if any blind person ever wishes to read academic papers,
they are required to have an OCR program on their computer, just to read
PDFs that were probably generated with pdflatex.

The worst part is, people just don't realize how terrible this PDF problem
is. Even my research group and wife, who know better, don't quite
understand that even if their LaTeX is perfect, or their Markdown is
perfect, their only conversion tool is broken. How can it be broken?
Everyone uses it!

If this single tool gets fixed, the ratio of inaccessible to
accessible PDFs being produced will flip overnight.

Thanks,


Brandon Keith Biggs <http://brandonkeithbiggs.com/>

From: Jonathan Cohn
Date: Sun, Apr 22 2018 5:05PM
Subject: Re: Is it True: No PDFs generated from LaTeX are Accessibly Tagged?

Google seems to indicate that there was a LaTeX revision, worked on about a year ago, to include accessibility metadata in the PDF. I don't know a lot about LaTeX, so take a look at:
https://github.com/NREL/latex_editing/blame/master/article/accessibilityMeta.sty
I would be interested in knowing whether this is the answer, and whether this is just a file that can be dropped in an appropriate place or whether a full build of the LaTeX tools would be required.

Best wishes,

Jonathan Cohn


