WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Save Excel Table as Tagged PDF?

From: Philip Kiff
Date: Nov 8, 2019 5:00PM


I think what may be happening in the case of Excel is that Microsoft is
updating some versions fairly regularly while leaving other versions to
stagnate. Those more frequent changes sometimes break parts of the
Acrobat PDF conversion engine, depending on which version you happen to
be running.

I'm using "Version 1910" of Microsoft Excel for Office 365, and I'm on
the "Monthly" update channel, and I have machines running both 32-bit
and 64-bit versions, with the specific version numbers:
16.0.12130.202032 (32-bit)
16.0.12130.20272 (64-bit)

And I'm running the latest version of Adobe Acrobat Pro DC:
2019.021.20049

It may be that older versions of Microsoft Excel 2016, or versions that
are not on as frequent update channels, are still not producing TH's at
all.

According to Microsoft, there were changes to some PDF accessibility
functions in this channel just over a week ago, but it is hard to know
if these changes relate to differences I'm seeing in my PDF output:
https://docs.microsoft.com/en-us/officeupdates/monthly-channel-2019

I seem to get a relatively clean PDF when I export using the Microsoft
converter (via Export -> Create PDF/XPS Document). TH's are correct in
both column and row headers of a simple table with a single header row
and first column headers. It passes the built-in Adobe checker cleanly.
There are some PDF/UA errors: the table gridlines are marked as paths in
span tags instead of being artifacted. The header row and column do not
have their "scope" set, but at least they are TH's. All in all, it seems
to me like pretty good PDF output compared to what I remember from Excel
2013 a couple of years ago.
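
As a rough sketch of the check described above (TH cells present, but no
Scope set), the following assumes the tag tree has already been dumped into
nested Python dicts. The "tag", "attrs", and "kids" keys are a hypothetical
layout chosen for illustration, not something Excel or Acrobat produces, and
the sample table is made up to mirror the description.

```python
# Hypothetical in-memory model of a PDF tag tree: each node is a dict with
# "tag" (structure type), optional "attrs" (attribute dict), and "kids".

def walk(node):
    """Yield the node and every descendant, depth first."""
    yield node
    for kid in node.get("kids", []):
        yield from walk(kid)


def audit_table(table):
    """Return a list of human-readable problems found in one Table node."""
    problems = []
    cells = [n for n in walk(table) if n["tag"] in ("TH", "TD")]
    headers = [c for c in cells if c["tag"] == "TH"]
    if not cells:
        problems.append("table has no TH or TD cells")
    elif not headers:
        problems.append("table has no TH cells")
    for th in headers:
        # PDF table header cells take a Scope of Row, Column, or Both.
        if th.get("attrs", {}).get("Scope") not in ("Row", "Column", "Both"):
            problems.append("TH without a valid Scope attribute")
    return problems


# Assumed sample: header row plus first-column header, no Scope anywhere.
sample = {"tag": "Table", "kids": [
    {"tag": "TR", "kids": [{"tag": "TH", "kids": []}, {"tag": "TH", "kids": []}]},
    {"tag": "TR", "kids": [{"tag": "TH", "kids": []}, {"tag": "TD", "kids": []}]},
]}

if __name__ == "__main__":
    for problem in audit_table(sample):
        print(problem)
```

A table with only TR and TD cells, like the one Joseph describes below, would
instead be reported as having no TH cells.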

When I save as PDF using my Acrobat Professional DC ribbon plugin, I
actually get a broken table. The table tag is empty, and the rows are
instead nested under "<UnknownNodeType>" tags. I'm sure this is a bug in
Acrobat. Those same tags clearly come through as the THead and TBody
tags in the Word export, so Microsoft probably changed the underlying
XML for that, and Acrobat is no longer converting it correctly.
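
The broken structure described above could be spotted with a sketch like
the one below. It again assumes a hypothetical dict-based dump of the tag
tree (the "tag" and "kids" keys are illustrative only), and the exact
position of the "UnknownNodeType" nodes in the sample is a guess. It flags
any TR whose direct parent is not Table, THead, TBody, or TFoot.

```python
# Structure types that may legitimately contain TR rows in a tagged PDF.
ROW_PARENTS = {"Table", "THead", "TBody", "TFoot"}


def misplaced_rows(node, parent_tag=None):
    """Return the parent tag of every TR that sits under an unexpected parent."""
    found = []
    if node["tag"] == "TR" and parent_tag not in ROW_PARENTS:
        found.append(parent_tag)
    for kid in node.get("kids", []):
        found.extend(misplaced_rows(kid, node["tag"]))
    return found


# Assumed shape of the broken output: an empty Table tag, with the rows
# hanging off an unrecognised node instead.
broken = {"tag": "Sect", "kids": [
    {"tag": "Table", "kids": []},
    {"tag": "UnknownNodeType", "kids": [
        {"tag": "TR", "kids": []},
        {"tag": "TR", "kids": []},
    ]},
]}

if __name__ == "__main__":
    for parent in misplaced_rows(broken):
        print("TR found under unexpected parent:", parent)
```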

Phil.

--
Philip Kiff
D4K Communications

On 2019-11-08 17:53, L Snider wrote:
> Oh and for me on the Mac, every way is a problem...I am on Insiders Fast
> Build, and it got worse about 4 months ago...I don't know where the
> mainstream version is with Mac, but I am hoping it gets better (although it
> happened on PC fast insider too)...
>
> What version of Acrobat DC is working? Mine is the latest.
>
> Cheers
>
> Lisa
>
> On Fri, Nov 8, 2019 at 10:24 AM Joseph Sherman < <EMAIL REMOVED> >
> wrote:
>
>> I cannot replicate your success. I created a simple 3x3 table, applied
>> table style with header row and export to PDF/XPS. I get no THead or TBody
>> tags, just all TR and TD cells.
>>
>>
>> Joseph
>>
>>
>>