WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: PDF Help Desk

From: Metzessible
Date: Sep 11, 2017 7:30AM


Hi Phil,

Thanks for reaching out. Unfortunately, I've only got stupid, selfish
reasons for using Github. Apologies this is so long, but I don't have time
to edit it to make it shorter.

To be honest, Github wasn't the natural environment I would have preferred
either, but I honestly didn't have much time to make something different.
When we moved to Western Massachusetts from Washington, DC, I sold all my
furniture, telling my wife I'd make new furniture for us if she bought me a
$3000 Sawstop. That was a year ago, stuff is still in boxes, and my wife is
becoming less patient. Further, I didn't like the existing platforms.

I made videos before joining TPG that took me forever. I just wanted to
know how to use video and captioning software, so I made them to learn
those applications. The videos are helpful, but Acrobat and Word have
changed so much since then, they're a bit outdated. I can change them or
edit as I see fit, but those bookshelves aren't going to make themselves.
Also, video is only good if the person is well known as an authority on the
subject. I specialize in a niche subject that most people tend to ignore.
If you follow me on Twitter, you'd know I mostly complain about stuff and
don't really use it to guide people to be better PDF craftspeople.

Also, those videos on YouTube are long and technical and most likely raise
questions instead of putting an end to confusion altogether. Which means
that blogs, listservs, or wikis wouldn't work either. Sure, you can have
discussions via comments or responses, but it's hard to track the context
of things. Eventually information becomes terribly scattered and makes
Twitter's threading system look appealing in comparison.

The other problem specifically with wikis is that they are only as good as
the number of citations they provide. As I said here and elsewhere, there
are a lot of resources out there. Often the advice conflicts, but
sometimes it's just a personal choice to use sketchy information
aggregated by a trusted third party. A perfect example is relying on the
recommendations for remediating PDFs from WCAG. They rely on specific
proprietary technology to make a PDF instead of actually explaining how
WCAG theoretically applies to this type of web content. They also lead one
to believe that making accessible PDFs can only be accomplished by using
Word. For example, I made a PDF/UA document from a terribly low resolution
scanned image so a visually impaired buddy from SSB Bart could drink beers
with us one day. That information is not in WCAG.

I wanted a place for experts to share advice for what works for them. I
also wanted a single place I can store my own advice for things I come
across in the field. The focus is less on the authority of a random
reference and more on the experience of someone who has done it before.
Accessibility specifications generally can be applied to something that is
testable. So if advice doesn't work exactly as it's intended per a comment
in Github, we can discuss it further.

I like Github's project management tools. They're not great, but they might
work well for this scenario. When I'm bored again, I can post examples of
things that have worked for me in the Code section, and people can discuss
those in the Issues section. I wouldn't talk about the readme file, because
that thing is just to tell people what the heck this thing is.

The thing about PDF is that no matter what the error is, the solution is
generally the same. This is a bold claim to make, but
everything is based on the file format specification. The hard part is
determining which problem is occurring in the first place. A 300-page PDF
might seem like it's totally corrupt, but it could just be a parsing error.
Adjusting a tag here or there might fix the whole thing.

If we look at this resource like a QA/QC system for the file format itself,
then we're tracking bugs on the process itself. Therefore logically people
would be raising issues on the process of remediating the file format. At
face value, everyone believes their problem is the first time it's
happened. Chances are it's happened before and people just gave up before
actually fixing it. Or worse, they fixed it and didn't tell anyone because
there wasn't a place to share that information. Or even worse, posted it to
StackExchange where it was promptly closed as an unrelated question because
all the answers tell the original poster to Google the answer (which is how
I usually find out that others have the same question as me - because I
Googled it).

Treat this resource like an unauthorized bug tracking system for a file
format. The file format itself is an open standard, which means that no
single entity is responsible for making changes to it. A community of
remediation experts has as much right to claim it knows the only way to
make it accessible as a company like Microsoft or Adobe does. When you
raise an issue, raise it in the Issues section as though it existed in the
Code area already. I'll be the editor of this system, managed by the
community, who will decide when I need to do something. And when they aren't
getting what they want, they can fork this repo and make their own
community of deviant PDF experts. It could be like WHATWG, but hopefully
with less drama.

Anyway, if you feel that a Wiki is the best place for this content, that
can be accomplished too. But I'm not a fan of that sort of collaboration
myself. I think Github could be a good place to track the annoyances of
working in this ridiculous file format for now.

Hope this answers your question.

Jon

On Sun, Sep 10, 2017 at 5:47 PM Philip Kiff < <EMAIL REMOVED> > wrote:

> I love the idea, Jon. I was looking for just such a resource when I
> started remediating PDFs in earnest over the past couple years, and
> would have loved to find an active, open community of PDF remediation
> professionals.
>
> I've given some thought to how such a community might develop, and
> Github isn't the first place that comes to mind - it seems like an
> unintuitive place to locate such a resource. I've used github for years,
> but I'm not even sure how you imagine someone will get started
> contributing - create an issue and then comment on it? Are you looking
> for pull requests to your readme?
>
> If the eventual goal is to create a wiki, then why not create a wiki
> directly? Maybe I just haven't kept up on how different communities are
> using Github as a tool.
>
> Phil.
>
> On 2017-09-09 12:47 PM, Metzessible wrote:
> > Hi there,
> >
> > I got bored last night and started a new repo on my github to provide a way
> > for people to help other people make more accessible PDFs. You can find it
> > here (in its beautiful, currently unpopulated state):
> >
> > https://github.com/metzessible/PDFHelpDesk
> >
> > I'm aware there's a lot of information out there on how to make PDFs
> > accessible, but the resources tend to vary on what's important or relevant
> > in terms of document accessibility. It also seems like there's conflicting
> > information out there from experts on how to handle PDF in the first place.
> > I'm also aware that many accessibility experts simply advise doing
> > something else instead. While that's great and all, PDFs are still being
> > created resulting in a lot of terrible documents out there.
> >
> > Further, now that the ISO has released ISO 32000-2 (
> > https://www.pdfa.org/publication/iso-32000-2-pdf-2-0/), the methods that we
> > use to tag PDFs are going to change eventually. This is bound to cause more
> > confusion, frustration, and gnashing of teeth. My hope is to create a
> > helpful resource for people and perhaps help create a consensus for what an
> > accessible PDF is supposed to look like. It'll also be an inevitable place
> > for me to post examples of things I come across in the wild and how to go
> > about fixing them.
> >
> > Hopefully this will become a decent place for those stuck remediating
> > documents with little to no guidance, since ignoring it for so long hasn't
> > really made the problem of inaccessible PDFs go away. In any case,
> > hopefully I'll figure out how to actually use Github a bit better to make
> > the most of it. I'm open to suggestions, and apologies in advance for the
> > snark in the readme.
> >
> > Cheers,
> > Jon Metz