The figure and figcaption elements are two of the new elements in HTML5. Together they promise a way to mark up, with meaning, the structure and relationship between a piece of content and associated content that acts as a descriptive label. As currently implemented in browsers, however, the semantics of figure and figcaption are practically non-existent.
What the HTML5 specification says
The figure element represents some flow content, optionally with a caption, that is self-contained and is typically referenced as a single unit from the main flow of the document.
The element can thus be used to annotate illustrations, diagrams, photos, code listings, etc, that are referred to from the main content of the document, but that could, without affecting the flow of the document, be moved away from that primary content, e.g. to the side of the page, to dedicated pages, or to an appendix.
The figcaption element represents a caption or legend for the rest of the contents of the figure element, if any.
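A minimal sketch of the structure these definitions describe. The file name, alt text and caption wording are placeholders assumed for illustration, not taken from the specification:

```html
<!-- Hypothetical content: file name, alt text and caption are placeholders. -->
<p>Sales rose steadily over the year, as shown in Figure 1.</p>

<figure>
  <!-- The flow content of the figure: here, a single image. -->
  <img src="sales-chart.png"
       alt="Bar chart of monthly sales, rising from January to December.">
  <!-- The optional caption, placed as the first or last child of the figure. -->
  <figcaption>Figure 1: Monthly sales for the year.</figcaption>
</figure>
```

The paragraph refers to the figure as a single unit, so the figure could be moved to a sidebar or an appendix without breaking the flow of the text.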
Current practical meaning conveyed by elements in the example:
All very interesting but what can I as a developer do now?
For the general use cases, until the semantics of figure and figcaption have been implemented in browsers and AT, it is suggested that you:
- Use a descriptive word at the start of the figcaption content to give users an idea that the content is labelling something, for example “Figure X:” or “Chart Y:”.
- Be consistent in your figcaption labelling within and across pages.
- Place the figcaption (in the code) before the content to be labelled so it is announced prior to the content it is labelling.
- For examples of use with images, refer to HTML5: Techniques for providing useful text alternatives.
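The recommendations above can be sketched as follows. The figure content and caption wording are assumptions made up for illustration; the points to note are the consistent “Figure X:” prefix and the figcaption placed first in the source:

```html
<!-- Hypothetical content throughout; only the labelling pattern matters. -->
<figure>
  <!-- Caption first in the source, so it is read before the content it labels. -->
  <figcaption>Figure 1: Site visits per month.</figcaption>
  <img src="visits.png"
       alt="Line graph of site visits, peaking in July.">
</figure>

<figure>
  <!-- Same "Figure X:" prefix as above, for consistency within the page. -->
  <figcaption>Figure 2: Script used to count the visits.</figcaption>
  <pre><code>function countVisits(log) {
  return log.length;
}</code></pre>
</figure>
```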
The background for these recommendations
How to convey the semantics?
The semantics of figcaption can be conveyed visually by the placement of the figcaption above or below the content it labels, and through its proximity to that content. From observations of how figures etc. are currently marked up, in some cases the figure element semantics will not be indicated visually, though they may be indicated as part of the figcaption text and/or by the addition of a border or background color. Such visual indications do not provide much value for users who cannot make use of them. While proximity provides some indication of a semantic relationship, it alone does not suffice.
The standard method to convey semantics to assistive technology (AT) is by the use of defined roles and relationships provided by accessibility APIs. These roles and relationships are typically mapped to HTML elements by the browser, and AT accesses the information from the API exposed by the browser. A problem arises with figure and figcaption because figure does not have a specified role and, while figcaption can be mapped to a caption role in some accessibility APIs, others do not provide this role. Element names can be passed through accessibility API properties, but this does not confer a defined accessibility semantic on a given element. As a result there is no common definition of what a particular element is and does, which can and does lead to interoperability issues across browsers and AT. This makes it much harder for both users and developers to realize a common user experience across software, devices and platforms.
ARIA to the rescue?
ARIA can help, but does not offer a complete solution:
- It does not include a caption role.
- It does not include a figure role.
- aria-labelledby or aria-describedby may be used to associate figcaption content with figure content, but their use does not provide the role semantics to differentiate the figcaption semantics from the standard labelling methods of the title attribute and, in the case of images, the alt attribute.
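A minimal sketch of the association described above, using aria-describedby. The id value, file name and text are hypothetical. The caption is exposed as a description of the image, but nothing tells the user that the text is a caption:

```html
<figure>
  <!-- Hypothetical id, referenced by the image below. -->
  <figcaption id="fig3-caption">Figure 3: Rainfall by region.</figcaption>
  <!-- alt supplies the text alternative; aria-describedby points at the caption.
       The relationship is exposed, but no caption or figure role is conveyed. -->
  <img src="rainfall-map.png"
       alt="Map shaded by rainfall, darkest in the north-west."
       aria-describedby="fig3-caption">
</figure>
```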
For ARIA to really help, it is suggested that two new roles may need to be added:
- caption: The object contains descriptive information, usually textual, about another user interface element such as a table, chart, or image.
- figure: The object is a container for a user interface element such as a table, chart, or image and a caption which labels the element.
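As a sketch only, this is roughly what author markup might look like if such roles were defined. The role values `figure` and `caption` are assumed names for the two proposed roles, and the content is hypothetical:

```html
<!-- Sketch only: "figure" and "caption" are assumed names for the proposed roles. -->
<figure role="figure">
  <!-- The caption role would identify this text as labelling the container's content. -->
  <figcaption role="caption">Figure 4: Floor plan of the ground floor.</figcaption>
  <img src="floor-plan.png"
       alt="Floor plan showing a hallway with three rooms leading off it.">
</figure>
```

If browsers mapped such roles to the elements by default, the explicit role attributes would be redundant.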
Whether the additional roles are needed depends on what will provide the best user experience. Do users want to be made aware of both structures? Should the figcaption content be associated with the figure or with the content it contains? Should none, one or both of the structures be voiced by AT? Should the caption always be announced before the figure content, or after it, or should this depend on the caption placement (before/after)?
The following scenarios are also available on a test page which has the role information included inline to simulate what would be available to the AT user for each scenario.
Scenario 1: The presence of both figure and caption is announced, and the figure start and end are voiced. The caption is announced before the content. (Simulates the figure being labelled by the figcaption)
Scenario 2: The presence of figure but not caption is announced, and the figure start and end are voiced. The caption content is announced before the figure content. (Simulates the figure being labelled by the figcaption)
Scenario 3: The presence of caption but not figure is announced. The caption content is announced before the figure content. (Simulates the figure content being labelled by the figcaption)
Scenario 4: The presence of caption but not figure is announced. The caption content is announced after the figure content. (Simulates the figure being labelled by the figcaption)
Scenario 5: The presence of neither caption nor figure is announced. The caption content is announced according to its placement in the code (before/after).
Note: Scenario 5 is what users currently experience.
Code example for all scenarios
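A minimal sketch of markup to which the scenarios could apply, assuming a small data table as the figure content. The caption text and table values are hypothetical and are not the listing used on the test page:

```html
<!-- Hypothetical example data; only the figure/figcaption structure matters. -->
<figure>
  <!-- Caption placed before the content. For the scenario in which the caption
       is announced after the content, move it below the table instead. -->
  <figcaption>Average monthly rainfall</figcaption>
  <table>
    <tr><th scope="col">Month</th><th scope="col">Rainfall (mm)</th></tr>
    <tr><td>January</td><td>55</td></tr>
    <tr><td>February</td><td>41</td></tr>
  </table>
</figure>
```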
AT output example Scenario 1
AT output example Scenario 2
AT output example Scenario 3
AT output example Scenario 4
AT output example Scenario 5
What do users want?
I have coded a test page with examples from the scenarios above, simulating what information could be announced and in what order. Please give it a try in your favourite AT and provide comments.
I think the big problem with
Richard: In HTML5, the caption element accepts flow content too – and that is the reason why figcaption allows it as well, one should think.
So, do you see it as a problem that the caption element allows flow content too?
Steve, is it not just a case that these elements are poorly supported at the moment (due to their inherent ‘newness’) rather than being poorly defined or semantically hollow (as currently defined in the spec)?
I would rather not see any extra markup/elements/attributes added to these structures for them to work if it’s just a matter of user agent implementation (or lack thereof).
It would be just too complex and messy for authors to start having to add ARIA etc. to native semantics in order for them to just work.
Steve’s idea is a new default role, I think. Thus you will not – when support is there – have to add it.
Hi Josh, it is not intended that, once implemented in browsers/AT, the ARIA role will have to be added by authors. ARIA is a method to add additional semantics that are not provided via current accessibility APIs. For example, some ARIA roles are mapped to new HTML5 elements in Firefox (e.g. navigation is mapped to the nav element), so authors don’t have to add them.
Initial user feedback indicates that scenario 2 is preferred:
Thanks Steve, I see. FWIW I would just rather see this supper native right now, and I understand that because support in UAs is lacking you come up with inventive solutions. But I think it’s the thin end of the wedge. For example, screen readers like JAWS should be supporting HTML5 today, as we have passed LC1, rather than going down this route IMO.
supper should be support btw dyac
The problem is that JAWS etc. are not being provided with the accessibility information from the browsers via the accessibility APIs. Not having a common defined way to represent stuff to AT is a problem. There is no native way to convey the figure/figcaption semantics in a meaningful way, as the required semantics are not available yet.
How would new ARIA roles bring accessibility support sooner? (I would like to understand this better, thanks)
It’s not just about sooner, it’s about having defined semantics for the new elements, period. It’s easier to define roles in ARIA and have them implemented by browsers, to convey semantics via the current interfaces exposed to AT by accessibility APIs, than to wait for each API to have the required roles added. It also means that authors can use the features as well.
Thanks for clarifying. So defining the roles means:
(a) browser vendors know what to do in the accessibility API
(b) authors can specify @role
Both of which AT can work with?