XML Linking

Thursday, August 23, 2007

The World Wide Web Consortium’s (W3C) XML Linking Working Group is developing specifications to enable more advanced hypertext functionality on the Web: in particular, fine-grained anchors, external annotation, and bi-directional links. This paper examines basic goals and approaches; describes the HTML linking limitations that XML Linking seeks to overcome; and surveys the Working Group’s primary specifications: XPath, XPointer, and XLink. As of this writing, the last two, while well advanced, are not final recommendations, and so are subject to change. Consult the W3C Web site for the latest versions.

Background of XML Linking

The HTML tag set and its hypertext element types, such as A, are useful and popular, yet show difficulties that become apparent at larger scale and with more diverse data. These include practical matters such as divergent implementations and the mixing of formatting with structure (I, TT, HR versus H1, OL, BODY), but also more fundamental limitations:

  • A fixed set of element types, which cannot model novel kinds of information (say, PRICE for a mail-order catalog).
  • Links of only particular kinds (coarse-grained, inline, one-way, and bound up with specific behaviors).

Information modelling

XML permits creating new element types, and trees of them. These can model information structures, moving markup beyond the realm of formatting [Coombs 1987]. This benefits many aspects of document development including hypertext, since documents as structures are processable for more purposes than formatting or printing: retrieval, linguistic and thematic analysis, database interchange, querying, etc. (a list of purposes is less important than the notion of arbitrary processing — [DeRose 1990]). Ordered hierarchies are particularly adept at modelling largely linguistic objects such as documents.
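To make the point about processing beyond formatting concrete, here is a minimal Python sketch that uses a PRICE element for retrieval rather than display. The CATALOG, ITEM, and NAME element types, the currency attribute, the sample data, and the `items_under` helper are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog markup; CATALOG, ITEM, NAME, and PRICE are
# illustrative element types, not taken from any real schema.
CATALOG = """
<CATALOG>
  <ITEM><NAME>Desk lamp</NAME><PRICE currency="USD">24.50</PRICE></ITEM>
  <ITEM><NAME>Bookshelf</NAME><PRICE currency="USD">89.00</PRICE></ITEM>
  <ITEM><NAME>Notebook</NAME><PRICE currency="USD">3.25</PRICE></ITEM>
</CATALOG>
"""

def items_under(xml_text, limit):
    """Return names of items whose PRICE is below limit, in document order."""
    root = ET.fromstring(xml_text)
    names = []
    for item in root.findall("ITEM"):
        price = float(item.findtext("PRICE"))
        if price < limit:
            names.append(item.findtext("NAME"))
    return names

print(items_under(CATALOG, 30.0))
```

Because the markup names what each value is rather than how it should look, the same document could be queried, indexed, or formatted without any change to the data.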

“Generalized”, “generic”, or “descriptive” markup has been discovered several times, apparently independently. Scribe [Reid 1981] is an early formatter based on structure rather than formatting commands. IBM’s GML fed into the SGML standard [ISO 1986]. SGML achieved widespread adoption in high-tech industries and introduced document grammars or schemas (called “Document Type Definitions” or “DTDs”) — HTML, for example, is defined by a DTD. See [Reid 1981] and [Furuta 1992] for more information.

XML [Bray 1998] is a meta-language like SGML, not a specific tag language like HTML. It shares SGML’s abstract model while removing abbreviatory features and simplifying parsing. For a comparative analysis, see [DeRose 1997]. Documents are ordered hierarchies of typed “element” nodes, each type supporting particular named “attributes” (in addition to its hierarchical “content”). Character data and references to non-XML data objects reside at the leaves. Intra-document links can use an ID attribute type. The W3C “XML Information Set” working draft [Cowan 1999] is formalizing the structure, but an approximate example of an XML document structure is:
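The ordered hierarchy, attributes, and ID-based linking described above can be sketched in Python. The REPORT, SECTION, TITLE, P, and REF element types, the `id` and `target` attribute names, and the `outline` and `resolve` helpers are hypothetical. Because `xml.etree.ElementTree` does not read DTDs, treating `id` as an ID-typed attribute is an assumption made by convention here, not something the parser enforces.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
DOC = """
<REPORT>
  <SECTION id="intro"><TITLE>Introduction</TITLE><P>See <REF target="goals"/>.</P></SECTION>
  <SECTION id="goals"><TITLE>Goals</TITLE><P>Fine-grained anchors.</P></SECTION>
</REPORT>
"""

def outline(elem, depth=0):
    """Return indented lines naming each element and its attributes, in document order."""
    attrs = "".join(f' {k}="{v}"' for k, v in elem.attrib.items())
    lines = ["  " * depth + elem.tag + attrs]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

def resolve(root, target):
    """Return the element whose id attribute equals target, or None if absent."""
    for elem in root.iter():
        if elem.get("id") == target:
            return elem
    return None

root = ET.fromstring(DOC)
print("\n".join(outline(root)))

# Follow the intra-document link from REF to the SECTION it names.
ref = root.find(".//REF")
print(resolve(root, ref.get("target")).findtext("TITLE"))
```

The outline shows element nodes nested in document order with their attributes, while the character data sits at the leaves; the `resolve` lookup stands in for what an ID-aware parser would provide directly.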

Read the complete article on the Brown University website.



 