WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Untagged PDF doc with table structure

From: L Snider
Date: Feb 19, 2015 6:23AM


That is a very good example, thanks. I have a few of my own, but can't use
them in public, so this is perfect!

Yes, I have found the newest version of Word to be much better; 2003 was
messy. At least some progress is being made.

Thanks again!

Lisa

On Wed, Feb 18, 2015 at 4:48 PM, Chagnon | PubCom < <EMAIL REMOVED> >
wrote:

> Lisa,
> Here's an excellent example of a flawed tag tree reading order, which then
> creates an out-of-whack structure.
> Surprisingly, it's from the US Access Board itself:
> http://www.regulations.gov/#!documentDetail;D=ATBCB-2015-0002-0001 (view the
> Content section and look for the PDF there).
> This is the text of the new ICT draft for Sec. 508. You'll notice in the tag
> tree that the figures are all stacked at the top of the tag tree, yet they
> appear in the back portion of the draft on pages 186-192.
> This error creates the following reading order:
> 1. The agency's seal/logo on page 1.
> 2. 9 illustrations on pages 186 through 192.
> 3. The title of the document (tagged with a P tag) on page 1.
> 4. The remaining pages of the document.
> This error occurs because they used an older version of MS Word, which does
> this to all graphics: it stacks them at the top of the tag tree, or at the
> end of the tag tree, or anywhere it feels like throughout the entire
> document, regardless of how someone anchors the graphics in the Word
> document itself. Word 2013, on the other hand, doesn't make this error and
> places the graphics correctly in the PDF tag tree.
> It also doesn't help that they used Acrobat 10 to create the PDF from Word.
> --Bevi
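
The reading-order problem described above can be illustrated with a small sketch. It assumes a screen reader follows the tag tree top to bottom (a depth-first walk) and models the tree with a hypothetical `Tag` class; the tags, labels, and page numbers are a simplified stand-in for the draft discussed above, not data extracted from the actual PDF. Flagging backward page jumps is only a rough heuristic for human review, not a definitive test.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF tag tree."""
    kind: str                      # e.g. "Document", "Figure", "P"
    page: int                      # page where the tagged content appears
    label: str = ""
    children: list[Tag] = field(default_factory=list)


def reading_order(tag):
    """Yield tags in tag-tree order: parent first, then children top to bottom."""
    yield tag
    for child in tag.children:
        yield from reading_order(child)


def backward_page_jumps(root):
    """Return (previous, current) pairs where tag order moves to an earlier page."""
    ordered = [t for t in reading_order(root) if t.kind != "Document"]
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b.page < a.page]


# Simplified model: figures stacked at the top of the tag tree.
root = Tag("Document", 1, children=[
    Tag("Figure", 1, "agency seal"),
    Tag("Figure", 186, "illustration"),
    Tag("Figure", 192, "illustration"),
    Tag("P", 1, "document title"),
    Tag("P", 2, "body text"),
])

jumps = backward_page_jumps(root)
for previous, current in jumps:
    print(f"page {previous.page} ({previous.label}) -> "
          f"page {current.page} ({current.label})")
```

A jump from the back of the document to page 1 is the kind of signal worth checking by hand; a legitimate layout can also produce backward jumps, so a person still has to judge whether the order is logical.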
>
> -----Original Message-----
> From: <EMAIL REMOVED>
> [mailto: <EMAIL REMOVED> ] On Behalf Of L Snider
> Sent: Wednesday, February 18, 2015 3:18 PM
> To: WebAIM Discussion List
> Subject: Re: [WebAIM] Untagged PDF doc with table structure
>
> Ah okay, I see where you were going now, thanks. Yes, it is like WCAG...You
> can do it and it can still be inaccessible :)
>
> Even after all these years, most of it is still manual. Funny how things
> have changed and some things are still the same.
>
> I am loving XI Pro, because you only do the full report by default. None of
> the messiness of previous versions.
>
> Cheers
>
> Lisa
>
> On Wed, Feb 18, 2015 at 2:08 PM, Chagnon | PubCom < <EMAIL REMOVED> >
> wrote:
>
> > Yes, always run the full report in Acrobat checker and don't waste
> > your time with the other options.
> >
> > The Acrobat checker tells you if the PDF is tagged, but not if they're
> > the right tags.
> > It tells you if anything is untagged, which quite often is sidebar
> > boxes, captions, and other pieces that were left out of the tag tree.
> > Tells if any graphics are missing Alt-Text.
> > Language and file name options are also flagged if missing.
> > And sometimes it can detect when the structure might be off, such as
> > headings that appear out of order as heading 3, heading 1, heading 6.
> >
> > But even with the full report from Acrobat, you're still not getting
> > all the information you need. One reason: software can't interpret if
> > those are the right tags and if they're in the correct, logical
> > reading order. Only humans can assess that!
> >
> > --Bevi Chagnon
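
As a rough illustration of the heading-order check mentioned above, here is a minimal sketch. The rule it applies is an assumption made for this example (the first heading should be level 1, and a heading may go at most one level deeper than the one before it); it is not Acrobat's documented logic, and `heading_order_problems` is a hypothetical helper name.

```python
def heading_order_problems(levels):
    """Return indexes of headings that break a simple nesting rule.

    Assumed rule (not Acrobat's documented behavior): the first heading
    should be level 1, and each heading may be at most one level deeper
    than the previous heading. Moving back up to a shallower level is fine.
    """
    problems = []
    previous = 0  # treat the start of the document as "level 0"
    for index, level in enumerate(levels):
        if level > previous + 1:
            problems.append(index)
        previous = level
    return problems


# The sequence from the message above: heading 3, heading 1, heading 6.
print(heading_order_problems([3, 1, 6]))
# A conventionally nested sequence for comparison.
print(heading_order_problems([1, 2, 3, 2, 1]))
```

A check like this only catches level sequences that look off. It cannot tell whether a heading tag was the right tag for that text in the first place, which is the part that still needs a human.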
> >
> >
> > -----Original Message-----
> > From: <EMAIL REMOVED>
> > [mailto: <EMAIL REMOVED> ] On Behalf Of L Snider
> > Sent: Wednesday, February 18, 2015 2:51 PM
> > To: WebAIM Discussion List
> > Subject: Re: [WebAIM] Untagged PDF doc with table structure
> >
> > Hi Bevi,
> >
> > One question on this:
> > 1. Run Acrobat's accessibility checker. This looks at only about 20%
> > of the document's features, so don't depend on it for a full check.
> >
> > This is the full report and check, right? If so, what else would you
> > check?
> >
> > Cheers
> >
> > Lisa
> >
> > On Wed, Feb 18, 2015 at 1:41 PM, Chagnon | PubCom < <EMAIL REMOVED> >
> > wrote:
> >
> > > Lynn, I too have a strong programming background in HTML, as well as
> > > SGML, XML, and many other markup languages. So tags plus reading
> > > order create the document's structure in my mind! In theory, I don't
> > > believe a PDF can have any structure, good or bad, without tags. All
> > > PDFs have a page architecture, but that's not the same thing as
> > > structure.
> > >
> > > Lynn asked: "if so how would I recognise it if I were to examine
> > > the document's building blocks"
> > >
> > > You have to examine it from several viewpoints in Acrobat Pro. I
> > > teach my students this method:
> > > 1. Run Acrobat's accessibility checker. This looks at only about 20%
> > > of the document's features, so don't depend on it for a full check.
> > >
> > > 2. Run down the tag tree, top-to-bottom. I call this the tag reading
> > > order.
> > > For sighted users, they can arrow down from tag to tag and also see
> > > on the page which item is highlighted for each tag. They'll see very
> > > quickly that the figures weren't read at the correct place in the
> > > tag tree, or that the second half of body text was read first, then
> > > the heading 1, then the remaining body text.
> > >
> > > For screen reader users, this is what your software is using. But
> > > it's more difficult to tell if the document is correct. Were you
> > > able to hear and figure out what was read? Did it make sense (not
> > > the content itself, but the order in which you heard it)? Screen
> > > readers also can't tell sometimes if it's tagged correctly. Example:
> > > Adobe InDesign has a tragic flaw. When a sidebar (boxed text that's
> > > secondary to the main story) is exported to PDF, the conversion
> > > isn't correct. All of the text is jumbled together; paragraphs are
> > > lost, including any headings, bulleted lists, tables, figures, etc.
> > > So a screen reader user just hears the text run on, blah blah blah, but
> > > never knows if they're hearing one paragraph, multiple paragraphs,
> > > headings, or any other parts of a document. My screen reader testers
> > > often miss these problems; they just can't tell if they're missing
> > > something or if it's incorrect.
> > >
> > > 3. Run down the "real" reading order. This is the Order panel in
> > > Acrobat.
> > > Often overlooked by many in accessible documentation, this is the
> > > original reading order that's still used by many assistive
> > > technologies, including braille printers and keyboards. I've never
> > > had any of my screen reader testers review this because their
> > > software has a hard time voicing it in a way that makes sense to
> > > them. But they can see this reading order another way: View / Zoom /
> > > Reflow. This utility rejiggers the visual layout on the screen to
> > > mimic the real reading order. Columns are removed, everything is
> > > sequential and linear, top to bottom. So if the first item read by a
> > > screen reader happens to be the photo caption, not heading 1, then
> > > you have a reading order problem.
> > >
> > > 4. After that, the usual review of tags, tables, alt-text, etc.
> > > takes place.
> > >
> > > --Bevi Chagnon
> > >
> > > -----Original Message-----
> > > From: <EMAIL REMOVED>
> > > [mailto: <EMAIL REMOVED> ] On Behalf Of Lynn
> > > Holdsworth
> > > Sent: Wednesday, February 18, 2015 1:46 PM
> > > To: WebAIM Discussion List
> > > Subject: Re: [WebAIM] Untagged PDF doc with table structure
> > >
> > > Hi Bevi,
> > >
> > > Thanks for taking the time to write such a comprehensive response.
> > >
> > > From creating HTML pages for about half a lifetime, I'd define tags
> > > and structure pretty much the way you do.
> > >
> > > But I inferred from this thread, and from talking with someone who
> > > knows a lot more about PDF than I do, that it's possible to have
> > > structure without tags in a PDF document. Is this correct, and if so
> > > how would I recognise it if I were to examine the document's
> > > building blocks?
> > >
> > > Best, Lynn
> > >
> > > On 18/02/2015, Chagnon | PubCom < <EMAIL REMOVED> > wrote:
> > > > Lynn wrote: "in PDF docs, what's the difference between tags and
> > > > structure?"
> > > >
> > > > This is one of the toughest concepts we teachers have to explain!
> > > > I'd love to hear how others describe it. Here's my take:
> > > >
> > > > Tags are labels. Code labels, specifically, that are read by
> > > > Assistive Technologies and are not usually visible to sighted
> > > > users unless they have Acrobat Pro. They let AT users know what's
> > > > a heading 2, a list of bullets, tables, and other parts of the
> > > > documents. Tags also do a lot of work for us, such as assisting us
> > > > in creating bookmarks and tables of contents, creating navigation
> > > > systems, and holding the Alt-text on graphics (Alt-Text is an
> > > > attribute on the figure tag and doesn't stand alone on its own).
> > > >
> > > > Structure is the sequence of how the document's pieces will be
> > > > read, or in other words, the sequence in which the tagged items are
> > > > read.
> > > > Call it reading order or tag reading order. The structure of some
> > > > documents can also have nesting qualities, such as all the pieces
> > > > of a chapter, and all the chapters in a book.
> > > >
> > > > An example: If Heading 1 designates a chapter title, then all the
> > > > paragraphs, bullets, tables, and heading 2 items within that
> > > > chapter will be nested inside the main heading 1 tag. This allows
> > > > AT software to figure out, hopefully, what goes with what; that
> > > > all the tags nested within Heading 1 form a chapter.
> > > >
> > > > Structure is created when you have tags (the right tag labels) and
> > > > a reading order (a logical reading order). It is possible that a
> > > > tagged and structured document might not be fully accessible
> > > > because the tags aren't accurate enough or the reading order is out
> > > > of whack.
> > > >
> > > > Example number 1: In older versions of MS Word, figures would be
> > > > placed in very odd places of the reading order when it was
> > > > exported to a PDF. If paragraph 1 stated "see figure 5", figure 5
> > > > itself might end up at the very end of the reading order, not near
> > > > paragraph 1 where it was referenced. A sighted person sees figure
> > > > 5 next to the paragraph, but a screen reader user doesn't hear it
> > > > voiced until the last page, and maybe that's page 360 of a long
> > > > government document. So the document is tagged and structured, but
> > > > it's a faulty structure because the reading order is incorrect.
> > > >
> > > > Example number 2: Graphic designers who use desktop publishing
> > > > programs like Adobe InDesign and QuarkXPress create very complex
> > > > visual layouts. Visually, things aren't designed in a traditional
> > > > top-down, left-right pattern but instead could be scattered all over
> > > > the physical page. Here's an example of a 2-page magazine spread:
> > > > http://fc02.deviantart.net/fs71/i/2010/082/e/c/Magazine_Layout_Design_1_by_BreakTheRecords.jpg
> > > > (This is just a random sample I pulled up on the Internet, so it is
> > > > only a graphic of a 2-page spread, no live text or Alt-text.)
> > > >
> > > > Note that the article title (or heading 1) appears on page 2, and the
> > > > body text of the story starts on page 1. Backwards! And then there
> > > > are 2 quotes at the top of page 1, so obviously the designer wants
> > > > us to read those at the beginning of the story, also. And here's a
> > > > similar example:
> > > > https://m1.behance.net/rendition/modules/12455236/disp/322ee0c042b2949607393d8b1f24ad96.jpg
> > > >
> > > > Whew! Getting a tagged, logical reading order from this type of
> > > > publication isn't easy!
> > > >
> > > > Summary:
> > > > Structure equals tagged content placed in a logical reading order.
> > > >
> > > > Well, that's my attempt. Would love to hear how others describe
> > > > the concepts.
> > > >
> > > > --Bevi Chagnon
> > > >
> > > > -----Original Message-----
> > > > From: <EMAIL REMOVED>
> > > > [mailto: <EMAIL REMOVED> ] On Behalf Of Lynn
> > > > Holdsworth
> > > > Sent: Wednesday, February 18, 2015 12:11 PM
> > > > To: WebAIM Discussion List
> > > > Subject: Re: [WebAIM] Untagged PDF doc with table structure
> > > >
> > > > Thanks so much everyone for weighing in - I've found this a very
> > > > useful thread indeed.
> > > >
> > > > One more question: in PDF docs, what's the difference between tags
> > > > and structure? Ryan mentioned that the doc may include structure
> > > > but not be tagged, and I don't understand the difference.
> > > >
> > > > And thanks Duff for the LinkedIn group suggestions. I'll join at
> > > > least the first one.
> > > >
> > > > Really hoping that Adobe is working on ironing out the
> > > > accessibility glitches in the DownLoad Assistant, as I'd
> > > > appreciate the chance to learn about and use what seems like a
> > > > great bunch of accessibility features in Acrobat.
> > > >
> > > > Best, Lynn
> > > >
> > > > On 18/02/2015, Andrew Kirkpatrick < <EMAIL REMOVED> > wrote:
> > > >> Bim,
> > > >> I was talking about both Acrobat and Reader in my reply, sorry if
> > > >> that wasn't clear. It is the same process for both.
> > > >> AWK
> > > >>
> > > >> -----Original Message-----
> > > >> From: <EMAIL REMOVED>
> > > >> [mailto: <EMAIL REMOVED> ] On Behalf Of Bim
> > > >> Egan
> > > >> Sent: Wednesday, February 18, 2015 7:13 AM
> > > >> To: 'WebAIM Discussion List'
> > > >> Subject: Re: [WebAIM] Untagged PDF doc with table structure
> > > >>
> > > > >> Lynn didn't seem to be talking about using Acrobat though. She
> > > > >> described the experience of many screen reader users in finding a
> > > > >> table in an untagged PDF when opened in Reader, and she asked why
> > > > >> this could happen. Her message said that the Acrobat installation
> > > > >> wasn't accessible.
> > > >>
> > > >> Bim
> > > >>
> > > >> -----Original Message-----
> > > >> From: <EMAIL REMOVED>
> > > >> [mailto: <EMAIL REMOVED> ] On Behalf Of Andrew
> > > >> Kirkpatrick
> > > >> Sent: 18 February 2015 14:36
> > > >> To: WebAIM Discussion List
> > > >> Subject: Re: [WebAIM] Untagged PDF doc with table structure
> > > >>
> > > > >> Jon is correct. When Acrobat opens an untagged document and
> > > > >> there is a client running that is using the accessibility API data,
> > > > >> Acrobat (or Reader) will add tags to the document. The result is the
> > > > >> same as if an author used the "add tags" feature in Acrobat. You get
> > > > >> Acrobat's best interpretation of what the tags should be. That
> > > > >> will sometimes result in headings, well-formed tables, lists, and
> > > > >> other structures. Authors who use this feature in Acrobat know that
> > > > >> you generally need to fix some of the tags.
> > > >>
> > > >>
> > > >>
> > > >> The result is that the document is tagged temporarily and
> > > >> assistive technologies recognize and use the information.
> > > >>
> > > >>
> > > >>
> > > >> The dialogs that you see when opening PDF documents give you some
> > > >> information about what is going on. To understand better, here's
> > > >> my explanation.
> > > >>
> > > >>
> > > >>
> > > > >> In Acrobat or Reader preferences there is a "Reading" category.
> > > > >> There is a checkbox that is labeled "Confirm before tagging
> > > > >> documents". If this is checked, then every time that Reader
> > > > >> intends to tag an untagged document the "Reading an untagged
> > > > >> document with assistive technology" dialog pops up and the user
> > > > >> needs to confirm that this is what they'd like to do. If the
> > > > >> user selects cancel then the document won't be tagged and the
> > > > >> reading experience will be essentially non-existent.
> > > >>
> > > >>
> > > >>
> > > >> If you elect to allow the tagging, there are other options as
> > > >> mentioned in one of the replies. I recommend using the "infer
> > > >> reading order from document" option.
> > > >>
> > > >>
> > > >>
> > > > >> There are other settings related to large documents and
> > > > >> auto-tagging. Autotagging takes time, so if you open a very dense
> > > > >> 600 page manual you may find that Reader takes a long time to do the
> > > > >> tagging. It can, and we are always looking to improve the
> > > > >> efficiency of this process.
> > > > >> The option for the user is to indicate whether the autotagging
> > > > >> should occur only on visible pages, on all pages in the document,
> > > > >> or on all pages except if the document is "large". The user gets
> > > > >> to define what large means - a user might find that their system
> > > > >> is slow at this so sets the limit at 25 pages, or might set it
> > > > >> higher if their system handles this process quickly. The down
> > > > >> side of only tagging a few pages at a time is that if there are
> > > > >> recognized structures on pages that haven't been tagged yet (e.g.
> > > > >> a heading on page 51) the user can't use screen reader heading
> > > > >> navigation to jump to it because the tags don't exist until the
> > > > >> page is in view in the reader.
> > > >>
> > > >>
> > > >>
> > > >> Hope this helps,
> > > >>
> > > >> AWK
> > > >>
> > > >>
> > > >>
> > > >> -----Original Message-----
> > > >> From: <EMAIL REMOVED>
> > > >> [mailto: <EMAIL REMOVED> ] On Behalf Of Lynn
> > > >> Holdsworth
> > > >> Sent: Wednesday, February 18, 2015 4:36 AM
> > > >> To: WebAIM Discussion List
> > > >> Subject: [WebAIM] Untagged PDF doc with table structure
> > > >>
> > > >>
> > > >>
> > > >> Hi all,
> > > >>
> > > >>
> > > >>
> > > >> Apologies if PDF accessibility is off topic. If so is there a
> > > >> list that covers this?
> > > >>
> > > >>
> > > >>
> > > >> But if not ...
> > > >>
> > > >>
> > > >>
> > > > >> I open a PDF document, and Adobe Reader alerts me that it's
> > > > >> untagged.
> > > >>
> > > >>
> > > >>
> > > > >> So I begin to peruse it using JAWS, and come across a table whose
> > > > >> structure is robust enough for me to move around it using the
> > > > >> JAWS table keystrokes.
> > > >>
> > > >>
> > > >>
> > > > >> Does this mean there *are* tags in the document after all? Or has
> > > > >> Adobe Reader used heuristics to add tags to improve the doc's
> > > > >> accessibility, since my settings flag up that I'm using a
> > > > >> screenreader?
> > > >>
> > > >>
> > > >>
> > > > >> I tried to download a trial version of Acrobat Pro so as to
> > > > >> examine the document structure, but the download assistant seems
> > > > >> inaccessible.
> > > >>
> > > >>
> > > >>
> > > >> Thanks as always, Lynn
> > > >>
> > > >> > > > >>
> > > >> > > > >> > > > >> <EMAIL REMOVED> <mailto: <EMAIL REMOVED> >
> > > >> > > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >>
> > > > > > > > > > > > list messages to <EMAIL REMOVED>
> > > >
> > > > > > > > > > > > list messages to <EMAIL REMOVED>
> > > >
> > > > > > > > > list messages to <EMAIL REMOVED>
> > >
> > > > > > > > > list messages to <EMAIL REMOVED>
> > >
> > > > > > list messages to <EMAIL REMOVED>
> >
> > > > > > list messages to <EMAIL REMOVED>
> >
> > > messages to <EMAIL REMOVED>
>
> > > >