Open Publication Structure (OPS) 2.0

INTERNAL WORKING DRAFT V0.7

December 17, 2006

TABLE OF CONTENTS

<INSERT TOC Here>


1         Overview

1.1        Purpose and Scope

In order for electronic-book technology to achieve widespread success in the marketplace, Reading Systems must have convenient access to a large number and variety of titles.  The Open Publication Structure (OPS) Specification describes a standard for representing the content of electronic publications. Specifically:

·         The specification is intended to give content providers (e.g. publishers, authors, and others who have content to be displayed) and publication tool providers minimal, common guidelines that ensure fidelity, accuracy, accessibility, and adequate presentation of electronic content across various Reading Systems.

·         The specification seeks to reflect established content format standards.

·         The goal of this specification is to define a standard means of content description for use by purveyors of electronic books (publishers, agents, authors, et al.), allowing such content to be provided to multiple Reading Systems and ensuring maximum presentational equivalence across Reading Systems.

Another related specification, the Open Packaging Format (OPF) Specification, defines the mechanism by which the various components of an OPS publication are tied together and provides additional structure and semantics to the electronic publication.  Specifically, OPF:

·         Describes and references all components of the electronic publication (e.g. markup files, images, navigation structures).

·         Provides publication-level metadata.

·         Specifies the linear reading-order of the publication.

·         Provides fallback information for when extensions to OPS are employed.

The OPF specification is kept separate from this OPS markup specification so that the packaging methodology is modular and independent of the content it packages.  This should facilitate the use of the packaging technology by other standards bodies (e.g. DAISY) in non-OPS environments.

A third specification, the OEBPS Container Format (OCF) Specification, defines the standard mechanism by which all components of an electronic publication may be packaged together into a single archive for transmission, delivery and archival purposes.

1.2        Definitions

Content Provider

A publisher, author, or other information provider who provides a publication to one or more Reading Systems in the form described in this specification.

Deprecated

A feature that is permitted, but not recommended, by this specification. Such features may be removed in future revisions. Conformant Reading Systems must support deprecated features.

Inline XML Island

An inline XML island is an XML document fragment that uses a non-preferred vocabulary and exists within an XML document marked up in a preferred vocabulary within an OPS publication.

OCF

The OEBPS Container Format defines a mechanism by which all components of an OPS Publication may be combined into a single file-system entity.

OEBPS

The Open eBook Publication Structure.  Previous versions of this specification (OPS) and its related specification, OPF, were unified into the single OEBPS specification.  For this version, OEBPS was broken into OPS and OPF to aid modular adoption of the specifications.  OEBPS 1.2 was the highest version of the previous unified specification.

OPF

The Open Packaging Format is the sister-standard to this standard. It defines the mechanism by which all components of a published work conforming to this standard along with metadata, reading order and navigational information are packaged into an OPS Publication.

OPF Package

An XML file that describes an OPS Publication, references all the files used by the publication, provides a publication reading order, and provides descriptive information about the files used.  The OPF Package is defined by the OPF specification.

OPS

The Open Publication Structure – this standard.

OPS Content Document

An XHTML document, DTBook document, or Out-of-Line XML Island that conforms to this specification and may legally appear in an OPF Package spine element.

OPS Core Media Type

A MIME media type that all Reading Systems must support.

OPS Publication

A collection of OPS Content Documents, an OPF Package file, and other files, typically in a variety of media types, including structured text and graphics, that constitute a cohesive unit for publication.

Out-of-Line XML Island

An out-of-line XML Island is an XML document that exists within an OPS Publication and is not authored using a preferred vocabulary.  It is an entirely separate, complete, and valid XML document.

Preferred Vocabulary

XML consisting only of OPS-supported XHTML modules and/or DTBook markup.

Reader

A person who reads a publication.

Reading Device

The physical platform (hardware and software) on which publications are rendered.

Reading System

A combination of hardware and/or software that accepts OPS Publications (preferably packaged in an OCF Container) and makes them available to consumers of the content. Great variety is possible in the architecture of Reading Systems. A Reading System may be implemented entirely on one device, or it may be split among several computers. In particular, a Reading Device that is a component of a Reading System need not directly accept OPS Publications, but all Reading Systems must do so. Reading Systems may include additional processing functions, such as compression, indexing, encryption, rights management, and distribution.

XML Document

An XML document is a complete and valid XML document as defined by the XML 1.1 standard (http://www.w3.org/TR/xml11/).

XML Document Fragment

An XML Document Fragment (or document fragment) is defined as an element in an XML Document and all of its content.

XML Island

An Inline XML Island or an Out-of-Line XML Island.

XML Namespaces

XML namespaces (or just namespaces) must conform to the XML Namespaces specification (http://www.w3.org/TR/xml-names11/).

XPointer

The XML Pointer Framework is a W3C specification that defines a method of pointing into specific locations within an XML document, thus identifying document fragments, as defined in XML Pointer Framework (http://www.w3.org/TR/xptr-framework/).

 

1.3           Relationship to Other Specifications

This specification combines subsets and applications of other specifications. Together, these facilitate the construction, organization, presentation, and unambiguous interchange of electronic documents:

1.       XML 1.1 Extensible Markup Language specification (http://www.w3.org/TR/xml11/); and

2.       XML 1.1 namespace specification (http://www.w3.org/TR/xml-names11/); and

3.       Document Object Model (Core) Level 1 (http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html); and

4.       XML Pointer Framework (http://www.w3.org/TR/2003/REC-xptr-framework-20030325/); and

5.       XHTML 1.1 Extensible HyperText Markup Language specification (http://www.w3.org/TR/xhtml11/); and

6.       XHTML 1.1 Modularization (http://www.w3.org/TR/xhtml-modularization/); and

7.       Digital Talking Book (DTB) Specification (http://www.niso.org/standards/resources/Z39-86-2005.html); and

8.       SVG 1.1 Specification (http://www.w3.org/TR/SVG11/); and

9.       CSS2 Cascading Style Sheets language (http://www.w3.org/TR/REC-CSS2); and

10.   Unicode Standard, Version 4.0. Reading, Mass.: Addison-Wesley, 2003, as updated from time to time by the publication of new versions. (See http://www.unicode.org/unicode/standard/versions for the latest version and additional information on versions of the standard and of the Unicode Character Database); and

11.   Particular MIME media types (http://www.ietf.org/rfc/rfc4288.txt and http://www.iana.org/assignments/media-types/index.html); and

12.   the XML style sheet processing instruction (http://www.w3.org/TR/xml-stylesheet); and

13.   Web Content Accessibility Guidelines 1.0 (http://www.w3.org/TR/WCAG10/); and

14.   RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. (http://www.ietf.org/rfc/rfc2119.txt); and

15.   Synchronized Multimedia Integration Language (SMIL 2.1) (http://www.w3.org/TR/2005/REC-SMIL2-20051213/); and

16.   The OPF specification ([Link here]).

1.3.1          Relationship to XML

OPS is based on XML because of its generality and simplicity, and because XML documents are likely to adapt well to future technologies and uses. XML also provides well-defined rules for the syntax of documents, which decreases the cost to implementers and reduces incompatibility across systems. Further, XML is extensible: it is not tied to any particular set of element types, it supports internationalization, and it encourages document markup that can represent a document’s internal parts more directly, making them amenable to automated formatting and other types of computer processing.

·         Reading Systems must be XML processors as defined in XML 1.1. All OPS Content Documents must be valid XML documents according to their respective schemas.

1.3.2          Relationship to XML Namespaces

Reading Systems must process XML namespaces according to the XML Namespaces Recommendation at http://www.w3.org/TR/xml-names11/.

Namespace prefixes distinguish identical names that are drawn from different XML vocabularies. An XML namespace declaration in an XML document associates a namespace prefix with a unique URI. The prefix can then be employed on element or attribute names in the document. Alternatively, a namespace declaration in an XML document may identify a URI as the default namespace, applicable to elements lacking a namespace prefix. The XML namespace prefix is separated from the suffix element or attribute name by a colon.

Example:
xmlns:oeb="http://www.idpf.org/2007/ops"
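
The prefix and default-namespace forms described above are interchangeable as far as a namespace-aware processor is concerned. The following is a minimal sketch, not part of the specification, that uses Python's standard xml.etree.ElementTree module to show both forms expanding to the same element names. The prefix h, the sample documents, and the helper expanded_names are assumptions chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# The same XHTML vocabulary written two ways: once with a default
# namespace, once with an explicit (illustrative) prefix "h".
default_form = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'
prefixed_form = (
    '<h:html xmlns:h="http://www.w3.org/1999/xhtml"><h:body/></h:html>'
)


def expanded_names(xml_text):
    """Return the namespace-expanded name of every element, in document order.

    ElementTree spells an expanded name as "{namespace-uri}local-name".
    """
    return [element.tag for element in ET.fromstring(xml_text).iter()]


names_default = expanded_names(default_form)
names_prefixed = expanded_names(prefixed_form)

# Both spellings resolve to identical expanded names.
print(names_default == names_prefixed)
```

Because comparison happens on the expanded name, the choice of prefix carries no meaning of its own; only the URI it is bound to does.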
 

OPS Content Documents must state their namespace prefix for the OPS namespace in the root element of the document. In addition, the root element of all OPS Content Documents must explicitly specify the namespace of the document. For the XHTML preferred vocabulary, this namespace is “http://www.w3.org/1999/xhtml”. For the Daisy Talking Book preferred vocabulary, this namespace is “http://www.daisy.org/z3986/2005/”.

Example:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:oeb="http://www.idpf.org/2007/ops">

 

If a document’s root element is not in one of the Preferred Vocabularies as identified by the media-type attribute of the item element within the OPF manifest element, then the Reading System must assume the document is an Out-Of-Line XML Island [LINK HERE], and process it accordingly.

Example:

<CustomDocumentType xmlns="http://www.example.com/CustomDocumentType/">
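
As a rough illustration of the rule above, a Reading System might inspect the namespace of a document's root element to decide how to treat it. The sketch below is an assumption-laden simplification: it looks only at the root element's namespace, whereas the specification also ties the decision to the media-type attribute in the OPF manifest, which is not modeled here. The function name classify_root and the returned labels are hypothetical; the namespace URIs are those given in this section.

```python
import xml.etree.ElementTree as ET

# Preferred Vocabulary namespaces as stated in this section of the draft.
PREFERRED_NAMESPACES = {
    "http://www.w3.org/1999/xhtml": "XHTML",
    "http://www.daisy.org/z3986/2005/": "DTBook",
}


def classify_root(xml_text):
    """Label a document by the namespace of its root element (sketch only)."""
    root = ET.fromstring(xml_text)
    # ElementTree writes a namespaced tag as "{uri}local"; a tag with no
    # namespace has no leading brace.
    if root.tag.startswith("{"):
        uri = root.tag[1:].split("}", 1)[0]
    else:
        uri = ""
    return PREFERRED_NAMESPACES.get(uri, "Out-of-Line XML Island")


xhtml_label = classify_root('<html xmlns="http://www.w3.org/1999/xhtml"/>')
island_label = classify_root(
    '<CustomDocumentType xmlns="http://www.example.com/CustomDocumentType/"/>'
)
print(xhtml_label, "|", island_label)
```

A real implementation would combine this check with the manifest's media-type information before choosing a processing path.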

As OPS has additional functionality and validation requirements beyond the preferred document types and XML islands, there are other namespaces associated with OPS, which are used in specific contexts.

All OPS-compliant documents are assumed to have declared the OPS namespace, which provides functionality for Inline XML Islands. [LINK HERE] If the OPS namespace is used in a document, it must be explicitly declared as “http://www.idpf.org/2007/ops”. It is recommended that authors bind the ops prefix to that namespace and not use ops as the prefix for other namespaces.

Example:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:ops="http://www.idpf.org/2007/ops">

1.3.3          XML Namespace Validation

Reading Systems are not required to validate according to XML Namespaces [LINK HERE], as the implementation details for namespace-level validation are unclear and are not supported in a uniform fashion by validation tools.

Reading Systems must validate the existence of the appropriate namespaces, as defined in the Relationship to XML Namespaces section, above.
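
One possible way to check that the required namespaces exist on the root element, without performing full namespace-level validation, is sketched below. It is an illustration under stated assumptions rather than a prescribed algorithm: it relies on the "start-ns" events of Python's xml.etree.ElementTree.iterparse, and the helper names root_namespace_declarations and has_required_namespaces are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

OPS_NS = "http://www.idpf.org/2007/ops"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_namespace_declarations(xml_text):
    """Return {prefix: uri} for namespaces declared on the root element.

    The parser reports "start-ns" events for an element's declarations
    before that element's "start" event, so everything seen before the
    first "start" belongs to the root. The default namespace is reported
    with an empty-string prefix.
    """
    declared = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start":
            break  # root start tag reached; later declarations are not on the root
        prefix, uri = payload
        declared[prefix] = uri
    return declared


def has_required_namespaces(xml_text, document_ns):
    """Check the root declares document_ns as default and binds the OPS namespace."""
    declared = root_namespace_declarations(xml_text)
    return declared.get("") == document_ns and OPS_NS in declared.values()


complete_doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:ops="http://www.idpf.org/2007/ops"><body/></html>'
)
missing_ops_doc = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'

print(has_required_namespaces(complete_doc, XHTML_NS))
print(has_required_namespaces(missing_ops_doc, XHTML_NS))
```

This sketch assumes the document namespace is declared as the default namespace on the root; a document that binds it to a prefix instead would need a slightly different check.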

1.3.4          Relationship to XHTML and DTBook

This specification recognizes the importance of current software tools, legacy data, publication practices, and market conditions, and has therefore incorporated certain XHTML 1.1 Document Type Modules and DTBook as Preferred Vocabularies. This approach allows content providers to exploit current XHTML and DTBook content, tools, and expertise.

To minimize the implementation burden on Reading System implementers (who may be working with devices that have power and display constraints), the Preferred Vocabularies do not include all XHTML 1.1 elements and attributes. Further, the modules selected from the XHTML 1.1 specification were chosen to be consistent with current directions in XHTML.

Any construct deprecated in XHTML 1.1 is either deprecated or omitted from this specification; CSS-based equivalents are provided in most such cases.  Style sheet constructs are also used for new presentational functionality beyond that provided in XHTML.

1.3.5