E-mail List Archives

Number of posts in this thread: 12 (In chronological order)

From: Duff Johnson
Date: Jun 10, 2013 9:23PM
Subject: Proposed: a TN tag to join TH and TD?

Some of us who are thinking about PDF 2.0 are thinking about tables.

Specifically, we note that many tables include cells that are (usually) empty, but also serve no real purpose in the table except to keep the rectangular arrangement of cells intact.

Yes, simpler tables are best for accessibility purposes, but like them or not, complex tables are pretty unavoidable. "Dead" cells happen in tables; it's a fact of life, but what's the right way to recognize this fact? I know that in practice there are lots of empty TH and TD cells out there - but that's not necessarily ideal.

In the current PDF 1.7, there's no consistent markup for this use case.

One solution we are thinking about is a new cell type to join TH and TD. We're thinking it's called TN, for "no-op".

The most common use of TN would be at the empty "corner" of a table (a very common case) or (less commonly) in the middle of a TH row or column, or where one might otherwise expect a <TD> cell filling a gap. TN differs from an empty TD because TN, by definition, never contains anything of significance.

TN cells have no table function except to fill gaps between other cells. An AT processor wouldn't be expected to report on or inquire within a TN tag. If queried, the cell would report as "intentionally unused" or equivalent.

(There is some question as to whether the use of TN cells in TH rows or columns may create ambiguity in the table structure, and force the use of explicitly linked headers).

A TN cell should not contain semantically relevant content. Options that are OK are:

- visual indicators that the TN is not just an empty TD cell but rather a no-op cell
- text amounting to "intentionally unused"
- similar stuff

My question: what's the "right" way to handle such cells from the HTML point of view? How has the discussion on this question (if any) proceeded in the world of HTML accessibility? What's the right solution, from your point of view, for this common case?

Or perhaps it doesn't matter - maybe empty TH and TD cells don't cause any real problems?

Thanks,

Duff Johnson

Independent Consultant
ISO 32000 Intl. Project Co-Leader, US Chairman
ISO 14289 US Chairman
AIIM Standards Board Chairman & member, AIIM Board of Directors
PDF Association Vice-Chairman

p +1.617.283.4226
e = EMAIL ADDRESS REMOVED =
w http://duff-johnson.com

From: Chagnon | PubCom
Date: Jun 10, 2013 9:46PM
Subject: Re: Proposed: a TN tag to join TH and TD?

Interesting idea, Duff.
But what's "no-op" mean?

Just to clarify, you're describing cells that only exist to keep the table's
column/row structure intact, not to contain data.

You're not talking about cells that:
- have a value of 0 for the data.
- data for that cell isn't available.
- data is n/a, not appropriate for that row/column.

-Bevi Chagnon
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - -
www.PubCom.com - Trainers, Consultants, Designers, Developers.
Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
Accessibility.
New schedule for classes and workshops coming in 2013.

From: Jukka K. Korpela
Date: Jun 11, 2013 1:47AM
Subject: Re: Proposed: a TN tag to join TH and TD?

2013-06-11 6:23, Duff Johnson wrote:

> "Dead" cells
> happen in tables; it's a fact of life, but what's the right way to
> recognize this fact?

It depends on why they are dead.

> In the current PDF 1.7, there's no consistent markup for this use
> case.

The same applies to HTML.

> One solution we are thinking about is a new cell type to join TH and
> TD. We're thinking it's called TN, for "no-op".

You would need to define the meaning differently, in more structural, or
"semantic" terms. Moreover, it's not clear that a new tag would be the
best approach, especially since we don't want old software to choke with
a document just because we have improved its tagging. In HTML at least,
I would strongly recommend proposing a new attribute rather than a new
element. This would let browsers keep doing what they are doing now,
just ignoring the attribute. In HTML5, it could be a "boolean" attribute
(a pure keyword attribute), e.g. <td dummy> or <th dummy>.

> The most common use of TN would be at the empty "corner" of a table
> (a very common case) or (less commonly) in the middle of a TH row or
> column, or where one might otherwise expect a <TD> cell filling a
> gap.

The first case is clear and can easily be explained and defined. There
would be a cell in a table such that other cells in the same row are
column headers and other cells in the column are row headers, leaving no
choice for this cell to be anything but a filler, required to make the
table regular.

On the other hand, in HTML, such a situation could be recognized from
other markup if scope attributes are used for those other cells.
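A minimal sketch of that recognition in Python, assuming a simplified grid model with no row or column spans. The `Cell` class and the `is_corner_filler` function are hypothetical names used only for illustration; they are not part of HTML or of any assistive technology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    tag: str                     # "th" or "td"
    text: str = ""
    scope: Optional[str] = None  # "row", "col", or None when not given


def is_corner_filler(grid):
    """Return True if the top-left cell can only be a structural filler.

    That is the case when the cell is empty, every other cell in the first
    row is a column header, and every other cell in the first column is a
    row header.
    """
    corner = grid[0][0]
    if corner.text.strip():
        return False
    rest_of_row = grid[0][1:]
    rest_of_col = [row[0] for row in grid[1:]]
    if not rest_of_row or not rest_of_col:
        return False
    row_ok = all(c.tag == "th" and c.scope == "col" for c in rest_of_row)
    col_ok = all(c.tag == "th" and c.scope == "row" for c in rest_of_col)
    return row_ok and col_ok


# Illustrative grid: an empty corner, two column headers, two row headers.
EXAMPLE = [
    [Cell("td"), Cell("th", "X", "col"), Cell("th", "Y", "col")],
    [Cell("th", "A", "row"), Cell("td", "1"), Cell("td", "2")],
    [Cell("th", "B", "row"), Cell("td", "3"), Cell("td", "4")],
]
```

The check depends entirely on the other cells carrying explicit scope values; if those are missing, the corner cannot be told apart from an ordinary empty cell.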

But what other cases would there be? I would propose starting with the
hypothesis that all other cases can be handled as normal cells with some
logical content that may physically be displayed and spoken as blank but
isn't really comparable to the dummy cells. On my page about empty cells
in HTML, I suggest various ways to make cells not empty in data content
even though they lack normal data:
http://www.cs.tut.fi/~jkorpela/html/emptycells.html

A common example is to use "N/A" or maybe a dash to indicate that some
data is not available or not applicable. Such cases are different from
the dummy cells, because it is about the data, not the table structure.

Consider a simple example where a table is a matrix that shows distances
between cities, with the same cities appearing as column headers and as
row headers. Then the upper left corner is really a "dummy" cell. The
diagonal elements that indicate distance from a city to itself are
really "not applicable" and could be left blank (or, better, gray, or
with a dash as the content), but they are not dummy in the table
structure - rather, they indicate trivial data and could logically
contain 0. Similarly, the cells on one side of the diagonal could be left
empty, since they contain redundant information (assuming metrics where
distance from A to B equals distance from B to A), but this too is a
feature of the data, not something that should be reflected in table markup.
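The distinction can be sketched in Python as follows. The city names and distances are made up for illustration, and `distance_table` and its role labels ("filler", "colheader", "rowheader", "data") are hypothetical names, not existing markup.

```python
def distance_table(cities, dist):
    """Build rows of (role, text) cells for a city-to-city distance matrix.

    Only the top-left corner is a structural filler. The diagonal holds
    trivial data (0) and every other cell holds ordinary data.
    """
    header = [("filler", "")] + [("colheader", c) for c in cities]
    rows = [header]
    for a in cities:
        row = [("rowheader", a)]
        for b in cities:
            if a == b:
                row.append(("data", "0"))
            else:
                # The metric is assumed symmetric, so accept either order.
                km = dist.get((a, b), dist.get((b, a)))
                row.append(("data", str(km)))
        rows.append(row)
    return rows


# Made-up distances, for illustration only.
DIST = {("A", "B"): 120, ("A", "C"): 200, ("B", "C"): 90}
TABLE = distance_table(["A", "B", "C"], DIST)
```

Whether the diagonal or the redundant half is displayed as blank is then a presentation decision about data cells; the role of those cells stays "data", and only the corner is a filler.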

So I would say that as a rule of thumb, when a table cell would be empty
due to lack of data or because data is not relevant, it should be marked
up as a normal cell, and we should consider whether some character data
should still appear in it. And really dummy cells are probably a special
case that can be handled with existing markup or that does not need any
particular handling.

Yucca

From: Duff Johnson
Date: Jun 11, 2013 7:39AM
Subject: Re: Proposed: a TN tag to join TH and TD?

Thanks for the input! Very interesting.

>> One solution we are thinking about is a new cell type to join TH and
>> TD. We're thinking it's called TN, for "no-op".
>
> You would need to define the meaning differently, in more structural, or
> "semantic" terms.

A TN cell is to be represented as "Intentionally blank".

> Moreover, it's not clear that a new tag would be the
> best approach, especially since we don't want old software to choke with
> a document just because we have improved its tagging.

A very valid point and yet, we're trying to look (for now) at "how should it be done" rather than simply caving to the dead weight of old software. ;)

> In HTML at least,
> I would strongly recommend proposing a new attribute rather than a new
> element. This would let browsers keep doing what they are doing now,
> just ignoring the attribute. In HTML5, it could be a "boolean" attribute
> (a pure keyword attribute), e.g. <td dummy> or <th dummy>.

I have previously made this suggestion myself in internal discussions on this point. The objection was that unlike TH and TD, TN never, on principle, contains real content, and so genuinely isn't the same sort of animal as TH or TD.

>> The most common use of TN would be at the empty "corner" of a table
>> (a very common case) or (less commonly) in the middle of a TH row or
>> column, or where one might otherwise expect a <TD> cell filling a
>> gap.
>
> The first case is clear and can easily be explained and defined. There
> would be a cell in a table such that other cells in the same row are
> column headers and other cells in the column are row headers, leaving no
> choice for this cell to be anything but a filler, required to make the
> table regular.

Precisely so.

> On the other hand, in HTML, such a situation could be recognized from
> other markup if scope attributes are used for those other cells.

Well, that's sort of my question, because it seems likely to me that, in point of fact, scope doesn't help.

The attached screen-shot shows a table in which a single cell in the leftmost column of row headers was "intentionally left blank." I'm not sure how to treat this with existing markup and remain unambiguous.

> But what other cases would there be? I would propose starting with the
> hypothesis that all other cases can be handled as normal cells with some
> logical content that may physically be displayed and spoken as blank but
> isn't really comparable to the dummy cells. On my page about empty cells
> in HTML, I suggest various ways to make cells not empty in data content
> even though they lack normal data:
> http://www.cs.tut.fi/~jkorpela/html/emptycells.html

Thanks for the link - excellent article!

Yes, the "CSS way" (from your article) is interesting in this regard. Of course, in PDF we have no CSS...
>
> So I would say that as a rule of thumb, when a table cell would be empty
> due to lack of data or because data is not relevant, it should be marked
> up as a normal cell, and we should consider whether some character data
> should still appear in it.

Agreed - that's a distinct use-case - not the one I was positing.

> And really dummy cells are probably a special
> case that can be handled with existing markup or that does not need any
> particular handling.

Well, that's what I'm wondering.

- Yes, they are a "special" case, but also a common case.
- It seems they need markup, otherwise (in the case of TH rows and columns) users must be left wondering which TH cells are headers for the TDs that have (nominally) a "dummy" TH as their row/column headers.

It's more likely, of course, that I'm either missing something or making a mountain from a mole-hill...

Duff.

From: Duff Johnson
Date: Jun 11, 2013 7:57AM
Subject: Re: Proposed: a TN tag to join TH and TD?

My apologies - I think the listserv kills attachments. Here's the screen-shot I sent in the previous mail on this subject:

http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png

> Interesting idea, Duff.
> But what's "no-op" mean?

Sorry - blame the developers for that one!

"no-op" = no operation, or "intentionally blank".

> Just to clarify, you're describing cells that only exist to keep the table's
> column/row structure intact, not to contain data.

Exactly so.

> You're not talking about cells that:
> - have a value of 0 for the data.
> - data for that cell isn't available.

Correct, these cases would *not* be TN cases.

> - data is n/a, not appropriate for that row/column.

This one - as described - is more ambiguous. It might well be TN (see the example I sent earlier).

TN is a cell type, not whole rows/columns. Of course, you could have a row of TN cells - uninteresting from a semantic point of view, so "intentionally left blank" seems appropriate.

Duff.

From: Keith Parks
Date: Jun 11, 2013 4:48PM
Subject: Re: Proposed: a TN tag to join TH and TD?

On Jun 11, 2013, at 6:57 AM, Duff Johnson wrote:

> My apologies - I think the listserv kills attachments. Here's the screen-shot I sent in the previous mail on this subject:
>
> http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png

In your example, I would argue that the table is a layout table, not a data table.

PDFs may not have CSS, but there must be some sort of structural language involved (tabs, background blocks) that could build this sort of layout without a table.

I believe I understand the issue you are raising, but maybe you need a better example?

******************************
Keith Parks
Graphic Designer/Web Designer
Student Affairs Communications Services
San Diego State University
San Diego, CA 92182-7444
(619) 594-1046
mailto: = EMAIL ADDRESS REMOVED =
http://www.sa.sdsu.edu/communications

http://kparks.deviantart.com/gallery
----------------------------------------------------------

Putting the "no" in "Innovation" since 1988.

From: Olaf Drümmer
Date: Jun 11, 2013 4:59PM
Subject: Re: Proposed: a TN tag to join TH and TD?

> In your example, I would argue that the table is a layout table, not a data table.

So because it's possibly a layout table, you plan to convince just about everyone in the world who creates tables that they should not build such tables?

Good luck.

And are you also saying that the presence of such an empty cell will make the document more or less inaccessible to people with disabilities?

Why would a person with a disability be unable to cope with the presence of such empty cells, if everybody else can?

Olaf

On 12 Jun 2013, at 00:48, Keith Parks wrote:

>
> On Jun 11, 2013, at 6:57 AM, Duff Johnson wrote:
>
>> My apologies - I think the listserv kills attachments. Here's the screen-shot I sent in the previous mail on this subject:
>>
>> http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png
>
> In your example, I would argue that the table is a layout table, not a data table.
>
> PDFs may not have CSS, but there must be some sort of structural language involved (tabs, background blocks) that could build this sort of layout without a table.
>
> I believe I understand the issue you are raising, but maybe you need a better example?
>
>
> ******************************
> Keith Parks
> Graphic Designer/Web Designer
> Student Affairs Communications Services
> San Diego State University
> San Diego, CA 92182-7444
> (619) 594-1046
> mailto: = EMAIL ADDRESS REMOVED =
> http://www.sa.sdsu.edu/communications
>
> http://kparks.deviantart.com/gallery
> ----------------------------------------------------------
>
> Putting the "no" in "Innovation" since 1988.
>

From: Duff Johnson
Date: Jun 11, 2013 5:05PM
Subject: Re: Proposed: a TN tag to join TH and TD?

>> My apologies - I think the listserv kills attachments. Here's the screen-shot I sent in the previous mail on this subject:
>>
>> http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png
>
> In your example, I would argue that the table is a layout table, not a data table.

It's a data table - there are columns (for session tracks) and rows (for time-slots). I used a small slice for the screenshot.

> PDFs may not have CSS, but there must be some sort of structural language involved (tabs, background blocks) that could build this sort of layout without a table.

This table was created in InDesign using InDesign's table tool. It's a typical situation for a table and that's sort of the point. This is the real world.

> I believe I understand the issue you are raising, but maybe you need a better example?

Ok

Page two of this PDF is a table. It includes two cells in the first column that would qualify as "TN" cells under this suggestion.

http://www.pdfa.org/wp-content/uploads/2013/04/Conference-EU-2013-flyer-final-2013-04-17-rev2.pdf

NOTE: I'm not advocating or defending the way this specific table is tagged - that's not (my) question here. Rather, I'm asking: what's the "right" way, notionally, to deal with this in HTML, if there is one, and if not, to ask: why not?

Duff.

From: Jukka K. Korpela
Date: Jun 12, 2013 3:02PM
Subject: Re: Proposed: a TN tag to join TH and TD?

2013-06-11 16:57, Duff Johnson wrote:

> Here's the screen-shot I sent in the previous mail on this subject:
>
> http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png

It has a meeting schedule, with starting times in one column, topics in
another. A fairly simple normal table - except that it has a row
containing the name of a track. So it acts as a heading for some rows
after it. And there is no real content for the first cell of that row,
though one might conceivably put the redundant starting time there.

Now that I come to think of it, tables often contain such headings.
Often people use the colspan attribute to make the header span all
columns, avoiding the issue discussed here. But there are situations
where that would not be feasible.

So this would be a candidate for the need for new markup. But I'm not
sure how important it would be in practice. Anyway, if you want added
markup, I would still recommend trying a new attribute rather than new
element.

> "no-op" = no operation, or "intentionally blank".

The first one is procedural rather than logical or structural, and
"intentionally blank" does not really say why it is blank.

>> Just to clarify, you're describing cells that only exist to keep the table's
>> column/row structure intact, not to contain data.
>
> Exactly so.

A structural definition might be something like the following, assuming
we propose a new attribute:

The boolean attribute dummy in a td element indicates that the cell
exists only to satisfy the structural requirements on a table in HTML.
The content of the cell should be ignored, and it is normally empty.
A <td dummy> element is normally used when other cells in a row
contain row headers for the table as a whole or part thereof.
Common examples include the very first cell of a table where the first
row and the first column otherwise contain column and row headers
as well as a row that acts as a heading for some subsequent rows
so that the heading primarily relates to one column in those rows
and therefore all but one cells in that row are "dummy cells".
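A minimal sketch of how a consumer might honour such an attribute, using Python's standard `html.parser`. The `dummy` attribute is only the proposal discussed in this thread and is not part of HTML; `DummyAwareReader`, `read_rows`, and the sample schedule are hypothetical and made up for illustration.

```python
from html.parser import HTMLParser


class DummyAwareReader(HTMLParser):
    """Collect cell text row by row, ignoring the content of dummy cells.

    A dummy cell is kept as None so that column positions stay aligned.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            # A boolean attribute arrives as ("dummy", None).
            self._skip = "dummy" in dict(attrs)
            self.rows[-1].append(None if self._skip else "")

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            if not self._skip:
                self.rows[-1][-1] = self.rows[-1][-1].strip()
            self._in_cell = False
            self._skip = False

    def handle_data(self, data):
        if self._in_cell and not self._skip:
            self.rows[-1][-1] += data


def read_rows(markup):
    reader = DummyAwareReader()
    reader.feed(markup)
    reader.close()
    return reader.rows


# Made-up schedule with a track-heading row whose first cell is a filler.
SAMPLE = """
<table>
  <tr><th scope="row">09:00</th><td>Welcome</td></tr>
  <tr><td dummy>ignored</td><th>Track A</th></tr>
  <tr><th scope="row">10:00</th><td>Session one</td></tr>
</table>
"""
ROWS = read_rows(SAMPLE)
```

A parser that does not know the attribute simply sees an ordinary `td`, which is the backwards-compatibility argument for an attribute over a new element.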

Yucca

From: Duff Johnson
Date: Jun 13, 2013 10:43AM
Subject: Re: Proposed: a TN tag to join TH and TD?

>> Here's the screen-shot I sent in the previous mail on this subject:
>>
>> http://duff-johnson.com/wp-content/uploads/2013/06/TN-cell-example.png
>
> It has a meeting schedule, with starting times in one column, topics in
> another. A fairly simple normal table.

<snip>

> Now that I come to think of it, tables often contain such headings.

Exactly my point - and thus, the significance of the question.

> So this would be a candidate for the need for new markup. But I'm not
> sure how important it would be in practice.

Well, if these tables are pretty common - and they are - then at least it's "an issue."

The next question is: how "bad" is the problem, in terms of understanding such tables with AT as they may be constructed today (empty THs and TDs)?

> Anyway, if you want added
> markup, I would still recommend trying a new attribute rather than new
> element.

I take this point - and the backwards compatibility argument is powerful as well.

> A structural definition might be something like the following, assuming
> we propose a new attribute:
>
> The boolean attribute dummy in a td element indicates that the cell
> exists only to satisfy the structural requirements on a table in HTML.
> The content of the cell should be ignored, and it is normally empty.
> A <td dummy> element is normally used when other cells in a row
> contain row headers for the table as a whole or part thereof.
> Common examples include the very first cell of a table where the first
> row and the first column otherwise contain column and row headers
> as well as a row that acts as a heading for some subsequent rows
> so that the heading primarily relates to one column in those rows
> and therefore all but one cells in that row are "dummy cells".

Thanks for this - I'll bring it to the appropriate table.

So back to my question: what's the history on this subject (if any) in the HTML world?

Duff.

From: Jukka K. Korpela
Date: Jun 13, 2013 2:54PM
Subject: Re: Proposed: a TN tag to join TH and TD?

2013-06-13 19:43, Duff Johnson wrote:

> So back to my question: what's the history on this subject (if any) in the HTML world?

The history of HTML tables as a whole is rather simple. Tables were
first drafted in RFC 1942 "HTML Tables" in 1996, when HTML was very
young. A simplified version was incorporated into HTML 3.2, whereas in
HTML 4, the technical content of RFC 1942 was adopted rather directly.
HTML5 proposes to change many details, but the big picture would not
change much.

I cannot find the issue of "dummy" cells, in the sense discussed here,
as appearing in documents about HTML, and I cannot remember it having
been mentioned in discussions about them. Most often, if an empty cell
is seen as a problem, it is seen as presentational problem, e.g. "how do
I make browsers use background color and borders for an empty cell?"

In RFC 1942, the first example of a table has a dummy cell in the upper
left corner, marked up as TH element with empty content. So it was not
seen as a problem:

<TABLE BORDER>
<CAPTION>A test table with merged cells</CAPTION>
<TR><TH ROWSPAN=2><TH COLSPAN=2>Average
<TH ROWSPAN=2>other<BR>category<TH>Misc
<TR><TH>height<TH>weight
<TR><TH ALIGN=LEFT>males<TD>1.9<TD>0.003
<TR><TH ALIGN=LEFT ROWSPAN=2>females<TD>1.7<TD>0.002
</TABLE>

In a sense, such a dummy cell could be seen as a header cell that has
just been left empty. My point is that it could contain some real header
text, for both the rows and the columns. Sometimes, in contexts other
than HTML, such cells have been divided, with a diagonal, so that one
part contains a header for the cells of the first column, the other part
contains a header for the cells of the first row. In the example above,
the first part could contain e.g. the text "Sex", as it describes the
cells "males" and "females"; the other part would be more obscure here.

So in this common special case, the "dummy" cell might be seen as not
dummy at all but a fusion of two headers. There just isn't any way
defined for dealing with it that way. Conceivably, the specification
might be changed to allow <th scope="row column">, but then we would
need some way of writing the two headers into one cell. It would be
possible of course, but I really cannot tell what might be a natural way.

Yucca

From: Duff Johnson
Date: Jun 13, 2013 3:08PM
Subject: Re: Proposed: a TN tag to join TH and TD?

Yucca - fascinating - thanks!

> Conceivably, the specification
> might be changed to allow <th scope="row column">, but then we would
> need some way of writing the two headers into one cell. It would be
> possible of course, but I really cannot tell what might be a natural way.

Interestingly (and not to the Dummy case at all, but nonetheless) in PDF 1.7 (ISO 32000-1 14.8.5.7, Table 349) a TH tag may have a Scope of "Row", "Column", or (drumroll) "Both".

Here's how the Scope attribute is defined in ISO 32000-1:

"A name whose value shall be one of
the following: Row, Column, or Both. This attribute shall only be used
when the structure type of the element is TH. (see Table 337). It shall
reflect whether the header cell applies to the rest of the cells in the row
that contains it, the column that contains it, or both the row and the
column that contain it."
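As a rough sketch of what a consumer might do with these three values, here is a Python lookup over a simplified grid. It assumes no spans and assumes headers sit to the left of or above the cell they describe; ISO 32000-1 does not prescribe this algorithm. `Cell` and `headers_for` are hypothetical names, and the grid reuses values from the RFC 1942 example with a "Sex" label placed in the corner purely for illustration.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    kind: str                    # "TH" or "TD"
    text: str
    scope: Optional[str] = None  # "Row", "Column", or "Both" (TH only)


def headers_for(grid, row, col):
    """Return the header texts that apply to grid[row][col]."""
    found = []
    for c in range(col):         # same row, to the left
        cell = grid[row][c]
        if cell.kind == "TH" and cell.scope in ("Row", "Both"):
            found.append(cell.text)
    for r in range(row):         # same column, above
        cell = grid[r][col]
        if cell.kind == "TH" and cell.scope in ("Column", "Both"):
            found.append(cell.text)
    return found


GRID = [
    [Cell("TH", "Sex", "Both"),
     Cell("TH", "height", "Column"),
     Cell("TH", "weight", "Column")],
    [Cell("TH", "males", "Row"), Cell("TD", "1.9"), Cell("TD", "0.003")],
    [Cell("TH", "females", "Row"), Cell("TD", "1.7"), Cell("TD", "0.002")],
]
```

Under these assumptions a corner TH with Scope "Both" is picked up as a header for the row-header cells below it and for the column-header cells beside it, while the data cells still resolve only to their own row and column headers.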

Now embarrassingly, Scope was only added to PDF in 2003. That said, it wouldn't have happened unless it had been implemented...

;)

Duff.