WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: HTML vs. PDF - which takes less time and resources?

Number of posts in this thread: 8 (In chronological order)

From: Rabab Gomaa
Date: Thu, May 09 2013 12:15PM
Subject: HTML vs. PDF - which takes less time and resources?

Hello,

I am wondering if there are members proficient in producing accessible PDFs in the group to answer my question or guide me to useful resources about the topic.

I am comparing HTML to PDF.
Do you think tagging PDF takes more or less time and resources compared to tagging HTML?
To put the question differently: would delivering an accessible primary version be quicker in PDF or in HTML, given that the original file exists in Word format, with 20 pages full of sections, subsections, complex data tables and complex images?

Thank you,
Rabab Gomaa

From: Denis Boudreau
Date: Thu, May 09 2013 12:41PM
Subject: Re: HTML vs. PDF - which takes less time and resources?

Hello Rabab,

It might be a little outdated now, but since I know you can read French, I would point you to a document we created long ago while writing the Quebec accessibility standards: <http://www.tresor.gouv.qc.ca/fileadmin/PDF/ressources_informationnelles/AccessibiliteWeb/guide_pdf_html.pdf>. It's a document I wrote back in early 2010 that was apparently re-edited last year by someone else. It benchmarks HTML and PDF against a series of 19 criteria, such as accessibility potential, universality of format, availability of expertise, reliability of methodology, simplicity, costs, etc. I think some of your answers could be found in that document.

Page 6 talks about qualified resources. Page 10 among others talks about cost of producing accessible HTML and PDF. This data was gathered through various testing we conducted at the time.

The main intent was to compare HTML and PDF and the incentives to choose one or the other. I'll admit to having a bias towards HTML at the time, but quite frankly, I wouldn't be so sure anymore. I think the most important factor to consider is the quality of your source and your conversion process - if both of them are top notch, then PDFs can be very accessible indeed. Maybe not as much as HTML of course, but accessible enough to be a meaningful solution all the same.

Hope this helps!

/Denis





From: Olaf Drümmer
Date: Thu, May 09 2013 2:00PM
Subject: Re: HTML vs. PDF - which takes less time and resources?

Hi Rabab,

I would like to share a post here that I have just submitted to a related question on the BCAB mailing list. It addresses to a certain degree the general part of your question, not the specific part (regarding your 20 page non-trivial word document), so take it with a grain of salt...:


Hi,

before you try to get your question answered, please ask a few more questions:

[1] what is your starting point? What is the quality of the material if it already exists, or what will it be once it gets created? Is it created specifically for the purpose of distributing it in an accessible fashion? Is it really just targeting people with no or limited vision, or do other types of disability also play a role? Is it for older people (often not so much inclined to deal with complex IT) or younger people (who don't mind learning/acquiring yet another piece of technology)? Will the content be consumed during work time at a desktop computer, or maybe also frequently on a mobile device or a laptop? Does it have to work on possibly outdated equipment? Is it really just for people with disabilities, or should it also be decently accessible to people without disabilities? Does it have to be absolutely the same for everyone, maybe out of legal considerations (how do you ensure that a conversion process from an inaccessible presentation form to a more accessible presentation form does not change or drop some of the content; what if in your life insurance contract or your medication formation a zero is dropped)?

[2] what type of content is it? how is it going to be consumed by the target user group? Is it a lot of content but more for consultational reading, where you need to find stuff quickly and easily, but will only read limited portions of content? Then a lot speaks for an HTML approach along the lines of Wikipedia or similar. Is it intended to be read in one go? Then maybe EPUB is an option? Is it really a possibly large set of documents with more internal structure and presentational variation, e.g. tables, graphics - then I'd claim a well tagged PDF is the way to go.

[3] From my point of view, it is less important what format you use, but much more important whether you prepare content adequately. Failing to mark up headings or tables properly will leave you with inaccessible content regardless of which format you choose. So the important question is how likely it is that, given your starting point (quality of existing material, skill level of the people involved, tools, budget and time available, etc.), a certain format will turn out more or less accessible.

Just to give some examples: a magazine publisher will typically work with programs like Adobe InDesign. From InDesign (using recent versions of the program) it is pretty feasible and economic to create accessible PDF or EPUB, but much harder to get HTML right or even export to Word well. If you are in a corporate or government environment, often Word is used, and as long as you want to map one Word document to one piece of content provided, your options essentially are to just distribute it as a Word file or as PDF (using Word and Adobe Acrobat you can create well tagged PDF right away). If you have an XML or HTML based content repository, maybe a web based editorial content management system, then HTML might be the easiest way forward. In all cases though - and this is more often overlooked than not - if the source content you have is not well structured, you will most probably either get garbage regardless of the format, or you will have to invest a lot to get it right.

[4] New and not so new tools and assistive technology
While it can be challenging for people with disabilities to catch up with technological developments we have to take into account the fact that tools are getting better (and often less expensive). Just to mention a few recent developments:
- the free NVDA screen reader for Windows now handles tagged PDF very well; for example, in the latest release, extended and much improved table navigation features were added; so this helps with accessing well tagged PDF files
- Amazon at the beginning of May 2013 released a Kindle app for iOS that is really quite accessible - this helps with reading EPUBs (note: while Amazon favors their own MOBI-derived proprietary format, you can send your EPUB by email to your Kindle account, and it will be converted such that it can be read on your Kindle device or app)
- iBooks on iOS (iPod, iPad, iPhone) has been very accessible for EPUB for a while, PDF unfortunately is lagging behind
- the free callas pdfGoHTML plug-in for Adobe Acrobat on Mac or Windows converts tagged PDF into HTML and opens the HTML in the default browser, using either user-defined CSS styles or one of a couple of built-in styles for low vision or dyslexic users. Where Adobe Reader's reflow has very unfortunate limitations, callas pdfGoHTML ultimately offers a content reflow mechanism, running in your favorite browser, where just about any aspect of the content presentation can be adjusted to the liking of the user.

Of course, a user may prefer to continue working with a pdftotext tool from 2001 or JAWS version 5 or Lynx, and these are valid choices. But one has to ask whether others should make substantial investments to accommodate the idiosyncrasies of such essentially outdated technology, especially given that some more capable recent options tend to be available more or less free of charge.

Olaf






From: Ryan E. Benson
Date: Thu, May 09 2013 5:50PM
Subject: Re: HTML vs. PDF - which takes less time and resources?

Rabab,

I pretty much agree with what Olaf said.

Rabab said:
> Do you think tagging PDF takes more or less time and resources compared
> to tagging HTML?

[RB] It all depends on what you need to do. If you need to develop the
layout, a PDF might be faster. I favor HTML over PDF personally. Many of
the steps needed to make a PDF accessible have to be done with a mouse, so
for somebody who doesn't use a mouse a lot, this adds a fair amount of time.

Rabab said:
> The question in a different way, delivering an accessible primary version
> would be much quicker in PDF or HTML providing that the original file exist
> in word format with 20 pages full of sections , subsections, complex data
> tables and complex images.

Sections are usually done with various heading levels. If you apply the
Heading styles correctly, both the built-in Microsoft Save as PDF and
Acrobat's save add-in carry the headings over to the PDF. PDFs have tags,
just like HTML (list:
http://alistapart.com/d/pdf_accessibility/PDFtags.html), and some are like
HTML5 tags. Complex tables are a bear in both formats. For complex images,
what do you need to display with them? If you need to draw attention to
individual parts of an image, this is probably slightly easier in a PDF.
There are tools that make converting from Word to PDF or remediating PDFs
easier, such as the CommonLook products.

--
Ryan E. Benson



From: Paul J. Adam
Date: Thu, May 09 2013 6:12PM
Subject: Re: HTML vs. PDF - which takes less time and resources?

I think PDF takes longer to make accessible. Manually tagging elements for accessibility in PDF takes much longer than in HTML.

You can paste accessible content into Dreamweaver from Word and it will preserve the headings and table structure. You'll have to tag the table row/column headers and Dreamweaver will prompt you to enter the alt text for each image.
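To illustrate the two manual steps mentioned here (table headers and image alt text), the following is a minimal Python sketch, not a definitive tool, that scans HTML pasted from Word for images with no alt attribute and for header cells with neither scope nor id. It uses only the standard library's html.parser; the names TableAndImageAudit, audit and SAMPLE, and the sample markup itself, are hypothetical and made up for this example.

```python
from html.parser import HTMLParser


class TableAndImageAudit(HTMLParser):
    """Collects images lacking an alt attribute and <th> cells lacking scope/id."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []    # src values of <img> tags with no alt attribute
        self.headers_missing_scope = 0  # number of <th> cells with neither scope nor id

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An empty alt="" is legitimate for decorative images, so only a
        # completely absent alt attribute is reported.
        if tag == "img" and "alt" not in attributes:
            self.images_missing_alt.append(attributes.get("src", "(no src)"))
        elif tag == "th" and "scope" not in attributes and "id" not in attributes:
            self.headers_missing_scope += 1


def audit(html_text):
    """Return (list of image sources missing alt, count of unscoped header cells)."""
    parser = TableAndImageAudit()
    parser.feed(html_text)
    parser.close()
    return parser.images_missing_alt, parser.headers_missing_scope


# Hypothetical fragment standing in for content pasted from a Word document.
SAMPLE = """
<h2>Quarterly results</h2>
<img src="chart.png">
<img src="logo.png" alt="Company logo">
<table>
  <tr><th scope="col">Region</th><th>Sales</th></tr>
  <tr><th scope="row">North</th><td>120</td></tr>
</table>
"""

missing_alt, unscoped = audit(SAMPLE)
print("Images without alt:", missing_alt)
print("Header cells without scope or id:", unscoped)
```

A check like this only finds missing attributes; whether the alt text is meaningful or the header associations are right for a complex table still needs a human review.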

Also if you're concerned about universal accessibility and want the content to work on iPhone/iPad/Mac then PDFs are not an option. HTML is really the only universally accessible format currently due to lack of access to tagged content in PDFs on Apple's platforms.

Paul J. Adam
Accessibility Evangelist
www.deque.com


From: Chagnon | PubCom
Date: Thu, May 09 2013 10:19PM
Subject: Re: HTML vs. PDF - which takes less time and resources?

Rabab wrote: "I am comparing HTML to PDF. Do you think tagging PDF takes
more or less time and resources compared to tagging HTML?
The question in a different way, delivering an accessible primary version
would be much quicker in PDF or HTML providing that the original file exist
in word format with 20 pages full of sections , subsections, complex data
tables and complex images."

It's difficult to compare the 2 formats, HTML and PDF, as they are quite
different from each other. And with PDFs, you should be doing most of the
accessibility work in the source program such as Word, not moving around
tags in Acrobat or correcting the reading order. Remediating a PDF is
mind-numbing, slow, and tedious.

The 20-page document you describe could be easy to format and make
accessible in Word (and in its exported PDF), or it could be difficult.
Depends upon whether you've done any of the Word-to-PDF no-nos, namely using
text boxes and certain forms of wrapped text around graphics that aren't
accessible in either Word or PDF.

And then there's that wonderful "feature" where all of the PDF's graphics
will be dropped at the top of the tag tree when exported from Word. Not sure
who should be shamed for this, Microsoft or Adobe, but it's a
senseless flaw.

On the other hand, footnotes, cross-references, and active tables of contents
can be relatively easy to create in Word/PDF compared with HTML coding.

HTML does give more coding options for certain things, especially with
tables. About all you can do in Word/PDF is designate header rows and
columns, and of course, construct it correctly.

But the key words in your question, to me, are "delivering an accessible
primary version." Given today's technology, I think primary versions should
be HTML whenever possible as they provide the best accessibility features,
such as scaling text, reflow, skipnav, and navigation options, that aren't
as robust or even available in either Word or PDF.

Word and PDF should be used only when a more structured or visually designed
document is required.

--Bevi Chagnon

- PubCom.com - Trainers, Consultants, Designers, and Developers.
- Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
Accessibility.
- It's our 32nd year!

From: Andrew Kirkpatrick
Date: Fri, May 10 2013 9:04AM
Subject: Re: HTML vs. PDF - which takes less time and resources?

A few comments...

> It's difficult to compare the 2 formats, HTML and PDF, as they are quite different from each other. And with PDFs, you should be doing most of the accessibility work in the source program such as Word, not moving around tags in Acrobat or correcting the reading order. Remediating a PDF is mind-numbing, slow, and tedious.

[AWK] I won't deny that this is sometimes true, but remediating PDF is not always as you describe. In some cases it is no more difficult than repairing equivalent HTML documents. Both take work and, depending on the level of complexity, it can be tough.

> And then there's that wonderful "feature" where all of the PDF's graphics will be dropped at the top of the tag tree when exported from Word. Not sure who should be shamed for this, Microsoft or Adobe, but it's a senseless flaw.

[AWK] http://support.microsoft.com/kb/2701086

> HTML does give more coding options for certain things, especially with tables. About all you can do in Word/PDF is designate header rows and columns, and of course, construct it correctly.

[AWK] That's all you can do in Word, but in PDF you can define row/column headers, use scope or headers/id, and handle spanned cells.

AWK

From: Alastair Campbell
Date: Mon, May 13 2013 4:29AM
Subject: Re: HTML vs. PDF - which takes less time and resources?

It is a difficult comparison. I have generally found it more likely that
organisations will consistently create accessible HTML websites than accessible
PDFs, but it very much depends on the authoring process in each case.

I wrote more about it here:
http://alastairc.ac/2011/01/pdf-vs-html-for-organisations/

Cheers,

-Alastair