
Jotting down ideas for the discussions @ TPAC F2F on ?? Web Publications

Ivan Herman edited this page Sep 20, 2016 · 2 revisions

Intro

Trying to jot down the approach/categorization that was outlined in the 'recap' session of the DPUB IG F2F, on the 19th of September, late afternoon. These changes affect both the structure of the UCR document and the work ahead, and (hopefully) this will help in the discussion later today.

The organization of the work could be separated as follows:

Web Publications

A working definition of WP could be:

A Web Publication is a collection of one or more Web resources in standard formats, organized together into a uniquely identifiable grouping.

I believe the unique identifier should be a URL (that makes it a Web Publication) although, internally (e.g., as metadata), it may also have an ISBN, a DOI, etc.

Because this is a URL, we will have to define what is returned by a server if that URL is dereferenced. In the abstract, one could say something like:

The URL should return all information necessary to process, present, display, etc, the Web Publication

There are a number of issues that must be worked out, coming up from the discussions on the UCR issues.

  • Online/offline access: the hows and the where
  • What data should be provided (not necessarily a "must") as part of a Web Publication. Things that did come up:
    • List of the constituent resources, possibly as a hint, that may be necessary for larger publications
    • Table of contents. Note that the same WP may have several renditions, i.e., several possible reading orders, depending on content and user's choice; some may involve sound for "audio books" or text synchronized with other content.
  • Additional metadata should be made available (author, publication dates, etc)
  • What exactly would a server return when a WP reference is dereferenced
  • Relationships to the web manifest spec, and to similar issues that are being looked at elsewhere (e.g., games)
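
To make the list above a bit more concrete, here is a minimal sketch of the kind of information a WP could carry and what a server might hand back when the WP's URL is dereferenced. Everything in it is an assumption made for illustration only: the class name, the field names (`resources`, `reading_orders`, `metadata`), the choice of JSON, and the restriction to http(s) URLs are all hypothetical; none of this has been decided.

```python
import json
from dataclasses import dataclass, field, asdict
from urllib.parse import urlsplit


@dataclass
class WebPublication:
    """Hypothetical description of a Web Publication (all names are illustrative)."""

    url: str  # the unique identifier of the publication
    resources: list[str]  # constituent resources; may be only a hint
    # Several renditions are possible: each entry maps a (hypothetical)
    # rendition name to one possible reading order.
    reading_orders: dict[str, list[str]] = field(default_factory=dict)
    # Additional metadata: author, publication date, ISBN, DOI, etc.
    metadata: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # "One or more Web resources", per the working definition.
        if not self.resources:
            raise ValueError("a Web Publication needs at least one resource")
        # Assumption for this sketch: the identifier is an http(s) URL.
        if urlsplit(self.url).scheme not in ("http", "https"):
            raise ValueError("the identifier is expected to be an http(s) URL")

    def describe(self) -> str:
        """Serialize the description, e.g., as a possible server response."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    wp = WebPublication(
        url="https://example.org/my-book/",
        resources=["index.html", "chapter1.html", "style.css"],
        reading_orders={"default": ["index.html", "chapter1.html"]},
        metadata={"author": "Jane Doe", "isbn": "978-0-00-000000-0"},
    )
    print(wp.describe())
```

Note that the ISBN sits in the metadata, while the URL remains the identifier; whether the list of resources is exhaustive or only a hint is deliberately left open here.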

There is also an orthogonal issue that came up, which may be more related to how a WP would be handled on the Web. If, in the abstract, we talk about a WP Processor, most probably implemented on top of Service Workers, what is the processing model? Is a WP Processor:

  1. A separate application relying on a browser engine (but with its own chrome)
  2. Some sort of an extension or add-on (I am not sure what exactly is the right term these days, and we have to explore that) relying on a browser, but re-using the browser chrome. I.e., if I have, say, a hypothes.is annotation extension added to my browser, I should be able to use it when consuming a WP
  3. There is no separate WP Processor, in fact, because the browser does it all.

(My personal feeling is that No. 1 is of course possible but not very interesting: this is what current ebook readers do already, nothing new there. The ideal would be No. 3, but that may have to be considered a long-term goal; different browser vendors may have different interests, and they may decide that a WP is not a "fundamental" feature that should be on the Web. I guess No. 2 is the realistic model that we should aim for.)

Packaged Web Publications

(Note that I use "Packaged" and not "Portable". It is a general category, not necessarily a specific standard.)

Quite a number of use cases, as well as the current publishing business model (which should not be completely disrupted), show that a packaged version of a WP is necessary. But it should/could be separated from the WP, at least conceptually (although a browser extension may decide to include an extra step of packing or unpacking, that is not defined by any standard).

That being said, the goal of any packaging format should be 100% compatibility with WP. What this means is:

If a PWP is unpacked, its content should be 100% WP compatible.

It may be that a specific PWP format imposes some additional restrictions on a WP, e.g., deciding that specific files must be at specific places within the package. W3C may choose one packaging format, but it should be recognized that it may not cover all the different use cases.
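
The "unpack and get a WP back" requirement can be illustrated with a small round-trip sketch. ZIP is used here purely as a stand-in, because it is available in the Python standard library; this is not a proposal for the packaging format. The function names `pack` and `unpack` and the representation of a WP as a mapping from relative paths to bytes are assumptions made for the example.

```python
import io
import zipfile


def pack(resources: dict[str, bytes]) -> bytes:
    """Pack the resources of a WP (relative path -> content) into one ZIP blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, content in resources.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def unpack(package: bytes) -> dict[str, bytes]:
    """Unpack a ZIP blob back into a mapping of relative path -> content."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


if __name__ == "__main__":
    wp_resources = {
        "index.html": b"<html><body>Chapter list</body></html>",
        "chapters/one.html": b"<html><body>Chapter one</body></html>",
        "css/style.css": b"body { margin: 2em; }",
    }
    # The unpacked package is, byte for byte, the original set of resources.
    assert unpack(pack(wp_resources)) == wp_resources
```

A specific format could add its own restrictions on top of this (for instance, requiring a given file at a given path), but the round trip itself should not lose or alter anything.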

(The obvious question is the relationship to EPUB3.1. To make EPUB a genuine PWP, it would be necessary to update it. This may lead to something one could call EPUB4.)