One of the limitations of the core HTML markup grammar is that its small set of elements makes it
poorly suited to defining rich data structures. A publication may contain hundreds of aside elements,
for example, but there has been no reliable way to distinguish the ones that represent notes from
those that represent sidebars, warnings or alerts.
For sighted readers, the deficiency this causes has been masked by the enhanced visual rendering that CSS style sheets afford (backgrounds, borders and shading are used to convey roles visually). For readers using assistive technologies — which rely on an understanding of the underlying markup in order to facilitate navigation — Web-based technologies, like EPUB, have only had limited accessibility because primary and secondary material was often indistinguishable below the visual surface.
To make ebooks more accessible, you need to consider that many readers will be interacting with the content
in non-visual ways, and for that reason the logical reading order must be defined at the
markup level. To facilitate this discovery, EPUB 3 includes a new epub:type attribute that allows
more precise meanings to be applied to the generic tags, a process called semantic inflection.
Although critical for accessible navigation, creating semantically-rich data has benefits for all readers. The enabling of specialized behaviors, such as the opening of footnotes, is directly predicated on content being properly identified. Rich data also future-proofs content, both by identifying the original authoring intent, in cases where it may be ambiguous, and by making it simpler to archive and reprocess.
Note
Semantic inflection can only be used to define the nature of structural markup. It is not defined for making associations between pieces of your content, a process called semantic enrichment. See the FAQ for more about the availability of semantic enrichment mechanisms in EPUB.
The epub:type attribute can be attached to any element in the body of a document and, by default,
accepts any of the terms defined in the EPUB Structural Semantics Vocabulary.
For example, the section containing the dedication for the work could be identified as follows:
<section epub:type="dedication">
…
</section>
The dedication value used in the above example is not just a random
string, but is a predictable value that reading systems can expect to encounter across publications.
Although, in theory, any semantic could be applied to any element, only certain semantics make sense
to use on any given tag. Marking an aside element as a footnote is appropriate,
for example, but marking a section as a footnote not so much. The Structural Semantics
Vocabulary lists the common element(s) each semantic is intended to be used in conjunction with to
facilitate this process (although exceptions to the rule may arise).
You are not limited to making only one statement in the epub:type
attribute, either. You could, for example, explicitly note whether a dedication
falls in the front or back matter by including a second space-delimited semantic:
<section epub:type="dedication backmatter">
…
</section>
Note that the order of the semantics is not important to their processing. Including more than one semantic can affect styling, however. The following CSS rule, written to match dedications:
section[epub|type='dedication'] {
…
}
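would not match the second example, only the first, because the = operator compares against the
entire attribute value. When using attribute selectors in CSS, you must account for space-separated
values by using the ~= notation. (Either form of selector also requires that the epub prefix be bound
to the EPUB namespace with an @namespace rule in the style sheet.) The following CSS rule would match
the dedication in both the preceding markup examples: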
section[epub|type~='dedication'] {
…
}
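The difference between the two selector operators can be sketched in a few lines of Python. This is only an illustration of the matching logic, not how any particular reading system or CSS engine is implemented; the function names matches_exact and matches_token are hypothetical.

```python
def matches_exact(attr_value: str, term: str) -> bool:
    """Mimic the CSS [attr='term'] operator: the whole value must equal term."""
    return attr_value == term


def matches_token(attr_value: str, term: str) -> bool:
    """Mimic the CSS [attr~='term'] operator: term must be one of the
    whitespace-separated tokens in the attribute value."""
    return term in attr_value.split()


# The epub:type values from the two markup examples above.
first_example = "dedication"
second_example = "dedication backmatter"

for value in (first_example, second_example):
    print(repr(value),
          "exact:", matches_exact(value, "dedication"),
          "token:", matches_token(value, "dedication"))
```

The exact comparison succeeds only for the single-term value, while the token comparison succeeds for both, which is why the ~= form is the safer choice whenever more than one semantic may be present.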
Once a semantic has been defined, the nature of the containing element influences all content
defined in it. For example, although the previous example attached the
backmatter semantic to the element containing
the dedication, all the back matter sections could be grouped into a parent
backmatter section as follows:
<section epub:type="backmatter">
<section epub:type="dedication">
…
</section>
<section epub:type="index">
…
</section>
…
</section>
Since front/body/back matter is more of an ephemeral context in which content is used than
an actual section of content, a better approach is to include this information on the body
tag:
<body epub:type="backmatter">
<section epub:type="dedication">
…
</section>
<section epub:type="index">
…
</section>
…
</body>
When processing elements based on their semantics, applications typically will check the entire ancestor chain to determine the applicable relationships.
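As a rough sketch of what checking the ancestor chain might look like, the following Python collects the epub:type terms found on an element and on each of its ancestors. It uses the standard library's xml.etree.ElementTree; the sample document, its placeholder text and the function name inherited_semantics are assumptions made for illustration, and real reading systems may work quite differently.

```python
import xml.etree.ElementTree as ET

EPUB_NS = "http://www.idpf.org/2007/ops"
XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_TYPE = f"{{{EPUB_NS}}}type"  # ElementTree's {namespace}local-name form

# Hypothetical content document modelled on the body example above.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body epub:type="backmatter">
    <section epub:type="dedication"><p>Placeholder text.</p></section>
    <section epub:type="index"><p>Placeholder text.</p></section>
  </body>
</html>"""


def inherited_semantics(root, target):
    """Return the epub:type terms on target and all of its ancestors,
    nearest element first."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    terms = []
    node = target
    while node is not None:
        terms.extend(node.get(EPUB_TYPE, "").split())
        node = parents.get(node)
    return terms


root = ET.fromstring(SAMPLE)
dedication = root.find(f".//{{{XHTML_NS}}}section")  # first section in the document
print(inherited_semantics(root, dedication))
```

Here the dedication section reports both its own term and the backmatter term carried by the body element, even though the section itself declares only one.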
When using the epub:type attribute in a content document, the epub
namespace must be declared on the element containing the attribute, or on one of its ancestors. The namespace
is typically declared once on the root html element, as in the following example:
<html …
xmlns:epub="http://www.idpf.org/2007/ops">
…
<dl epub:type="glossary">
…
</dl>
…
</html>
The epub:type attribute is not limited to values defined in the EPUB Structural Semantics Vocabulary.
Additional terms may be used, whether defined in an RDF vocabulary or not, so long as a
unique prefix has been defined for them in the epub:prefix attribute. To use
terms from the more expansive Z39.98 Structural Semantics Vocabulary, for example, the prefix
z3998 could be defined as follows:
<html …
xmlns:epub="http://www.idpf.org/2007/ops"
epub:prefix="z3998: http://www.daisy.org/z3998/2012/vocab/structure/#">
…
<section epub:type="frontmatter z3998:published-works">
…
</section>
…
</html>
The URI associated with a prefix is currently only a unique identifier string; it does not have to resolve to a document. It is recommended that only terms from industry-standard vocabularies and controlled lists be used, however, since reading system support for arbitrary values is unlikely (but there is no reason to strip semantics from an internal workflow, for example).
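To make the role of the prefix concrete, here is a minimal Python sketch of how a processor might turn an epub:prefix value into a lookup table and expand a prefixed term into a full identifier. The function names parse_prefixes and expand_term are hypothetical, and the sketch assumes the attribute value is a well-formed series of whitespace-separated "prefix: URI" pairs; it does not validate anything.

```python
def parse_prefixes(prefix_attr: str) -> dict:
    """Split an epub:prefix value into a {prefix: uri} mapping.
    Assumes the value alternates 'name:' tokens and URI tokens."""
    tokens = prefix_attr.split()
    mapping = {}
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        mapping[name.rstrip(":")] = uri
    return mapping


def expand_term(term: str, prefixes: dict) -> str:
    """Expand 'prefix:name' to uri + name. Unprefixed terms (default
    vocabulary) and terms with undeclared prefixes are returned unchanged."""
    prefix, sep, name = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + name
    return term


# Values taken from the markup example above.
prefixes = parse_prefixes(
    "z3998: http://www.daisy.org/z3998/2012/vocab/structure/#")

for term in "frontmatter z3998:published-works".split():
    print(term, "->", expand_term(term, prefixes))
```

Because the URI serves only as an identifier, the expansion is plain string concatenation; nothing is fetched from the network.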
Note
Although the EPUB specification reserves the option to define prefixes for industry-standard vocabularies and controlled lists, none are reserved for content documents at this time.
<body epub:type="cover">
<img class="cover-img" src="cover.jpg" alt="Cover Image"/>
</body>
<section epub:type="preface">
…
</section>
<section epub:type="foreword">
…
</section>
<section epub:type="part">
…
</section>
<section epub:type="chapter">
…
</section>
<p>Lorem ipsum.<a epub:type="noteref" href="#fn01">1</a></p>
<aside id="fn01" epub:type="footnote">
…
</aside>
<p>Lorem ipsum.<a epub:type="noteref" href="#en01">1</a></p>
<aside id="en01" epub:type="rearnote">
…
</aside>
<section epub:type="rearnotes">
<h1>Endnotes</h1>
<section>
<h2>Chapter 1</h2>
<aside id="c01-en01" epub:type="rearnote">
…
</aside>
…
</section>
…
</section>
<p>
…ipsum.<a epub:type="annoref" href="#a01">1</a>
</p>
<aside id="a01" epub:type="annotation">
…
</aside>
<aside epub:type="sidebar">
<h3>Killer Bee Migration</h3>
…
</aside>
<dl epub:type="glossary">
…
</dl>
<section epub:type="bibliography">
…
</section>
<section epub:type="index">
…
</section>
Whether one, both or neither of these methods of semantic enrichment would prevail in HTML5
remained to be seen at the time of the EPUB revision. The EPUB WG chose not to wade into
the issue, but instead provide more basic semantic inflection through the epub:type
attribute, using a model similar to the W3C
role attribute.
As both of these technologies have been endorsed by the W3C, it is expected that they will be made available for use in EPUBs in a coming revision.
Creating meaningful class names for your CSS is certainly encouraged, but reading systems are
neither required nor expected to do anything with the class attribute as far
as semantic processing goes.
Microformats, more generally, are not recommended because they blur the line between content authoring (and styling) and semantic inflection, and because they appropriate elements and attributes for non-standard uses. This latter practice creates problems for accessible processing and rendering of content.