Interactions between DSDL Part 1 and DSDL Part 4

March 31, 2003

MURATA Makoto (International University of Japan)

1. Introduction

Parts 1 and 4 are closely related. This note attempts to clarify their relationship and to study possible approaches to integrating them.

2. Updates on Part 4

1) Expected impacts

I believe that Part 4 provides a good chance to sell DSDL to W3C and then to the rest of the world. Part 4, or MNS, has already attracted some attention from W3C, since MNS allows easy integration of XHTML2 and other XML-based languages (including RDF, MathML, SVG, and XForms). A very interesting example of MNS, which combines XHTML2, MathML, SVG, EGIX, ContactXML, HLink, RDF, and OASIS XMLCharEnt, is available at:
http://www.w3.org/People/mimasa/test/xhtml2/hybrid.xhtml

2) MNS

Although the CD of DSDL Part 4 is simple, it is simple only because it lacks important features. MNS introduces some powerful features for combining schemas: pruning, modes and contexts, lax processing, attribute validation, and covered namespaces.

Most of these features are very useful for combining schemas, and they are used quite heavily in the example above. However, they pose some challenges. First, MNS now does more than just divide documents into validation candidates. An ISO-style specification for MNS would probably require thirty pages. Second, we have to consider carefully which MNS features should belong to which part of DSDL. For example, one could argue that pruning should be moved to Part 1.
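To make "dividing documents into validation candidates" concrete, here is a minimal sketch in Python. It is not the MNS algorithm: it assumes the simplest possible rule, namely that each maximal subtree of elements sharing one namespace becomes a candidate, and it ignores pruning, modes and contexts, lax processing, attribute validation, and covered namespaces altogether. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag such as '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def divide(root):
    """Split a tree into (namespace, element) validation candidates.

    Assumed rule (a simplification, not MNS itself): a child element in a
    different namespace from its parent is cut out and starts a new candidate.
    Tail text after copied elements is dropped for brevity.
    """
    candidates = []

    def copy_section(elem):
        ns = namespace_of(elem.tag)
        section = ET.Element(elem.tag, dict(elem.attrib))
        section.text = elem.text
        for child in elem:
            if namespace_of(child.tag) == ns:
                section.append(copy_section(child))
            else:
                start_candidate(child)
        return section

    def start_candidate(elem):
        index = len(candidates)
        candidates.append(None)  # reserve the slot to keep document order
        candidates[index] = (namespace_of(elem.tag), copy_section(elem))

    start_candidate(root)
    return candidates
```

Each candidate would then be handed to the schema registered for its namespace. The MNS features listed above refine exactly this step, which is why the specification grows well beyond this sketch.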

3. Desiderata

Why do we create DSDL as a multi-part standard rather than a monolithic standard? There are two reasons. First, we would like to divide and conquer the complexity of schemas and validation. Second, and more importantly, we would like to allow and even encourage users to use only those parts that they really need. By doing so, we can make validators lighter, faster, more feasible, and more portable. Therefore, I propose the following desiderata:

Here is another reason for Desideratum 3. To sell DSDL to W3C, we have to develop Part 4 as soon as possible. If we drop this desideratum, we miss a big chance to sell DSDL to W3C.

4. Possible approaches

How should Part 1 and Part 4 interact? There are two obvious approaches, which have been used by existing proposals.

Schemachine and outie use Approach 1. However, they do not have MNS features such as pruning, modes and contexts, lax processing, attribute validation, and covered namespaces. I believe that Part 1 will become too complicated if the primitive for dividing documents has all these features. Thus, Approach 1 fails to satisfy Desideratum 1. Moreover, since this approach introduces all Part 4 features as a Part 1 primitive, it also fails to satisfy Desiderata 2 and 3.

Another approach was taken by RELAX Namespace.

However, RELAX Namespace provides no mechanism for invoking other parts of DSDL (such as simple transformation). If we introduce Part 1 features into MNS, none of the three desiderata will be satisfied.

I would like to propose another approach.

When we want to apply simple transformation to an XML instance and then apply namespace-wise validation, we invoke Part 4 from Part 1. Part 1 first invokes Part 8 for the simple transformation and then invokes Part 4, which typically invokes RELAX NG (Part 2).

When we want to apply simple transformation after we divide a document into validation candidates, we invoke Part 1 from Part 4. Part 1 in turn invokes Part 8 for the required transformation and then invokes Part 2 for validation.

One could argue that Approach 3 complicates the layering of parts of DSDL. However, as long as inputs and outputs of Part 1 validation and Part 4 validation are well defined, I do not think we have any problems.
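The following Python sketch illustrates why mutual invocation is unproblematic once inputs and outputs are well defined. It is a toy model under loud assumptions: documents are plain strings rather than XML, every part shares one hypothetical contract (one document in, one verdict out), and the class and function names are invented for illustration and do not come from any DSDL part.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Part2Schema:
    """Stand-in for a RELAX NG schema: a predicate on one document."""
    accepts: Callable[[str], bool]


@dataclass
class Part1Schema:
    """Stand-in for a Part 1 schema: transform (Part 8), then delegate."""
    transform: Callable[[str], str]
    then: "Schema"


@dataclass
class Part4Schema:
    """Stand-in for a Part 4 schema: divide, then delegate per candidate."""
    divide: Callable[[str], List[str]]
    then: "Schema"


Schema = Union[Part1Schema, Part2Schema, Part4Schema]


def validate(schema: Schema, document: str) -> bool:
    """Assumed shared contract: one document in, one verdict out."""
    if isinstance(schema, Part2Schema):
        return schema.accepts(document)
    if isinstance(schema, Part1Schema):
        return validate(schema.then, schema.transform(document))
    if isinstance(schema, Part4Schema):
        return all(validate(schema.then, c) for c in schema.divide(document))
    raise TypeError(f"unknown schema type: {type(schema).__name__}")


lowercase_only = Part2Schema(str.islower)

# Part 1 invoking Part 4: transform first, then divide and validate.
part1_first = Part1Schema(str.lower, Part4Schema(str.split, lowercase_only))

# Part 4 invoking Part 1: divide first, then transform each candidate.
part4_first = Part4Schema(str.split, Part1Schema(str.lower, lowercase_only))
```

Because every schema kind is validated through the same contract, either part can sit on the outside, and the nesting can go arbitrarily deep without the parts needing to know about each other's internals.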

5. Design of Part 1 and Part 4

First, it should be possible for Part 1 schemas to reference Part 4 schemas (or to contain them directly). Second, it should be possible for Part 4 schemas to reference Part 1 schemas (or to contain them directly).

Note: I first thought that some primitives of Part 1 and Part 6 should be directly usable from Part 4. However, I now think that Part 4 should always reach them through Part 1 schemas.