Associating Schemas with XML documents

Working Draft 5 June 2005

This version:
Working Draft: 5 June 2005
Editor:
Makoto MURATA 

Abstract

This document allows schemas in any schema language to be associated with an XML document by including one or more processing instructions with a target of oasis-schema in the document's prolog. The oasis-schema processing instructions are merely hints, which may be ignored by the validator. Schemas referenced by these processing instructions should not be considered as integral parts of the XML document.

Status of this Document

This is a working draft constructed by the editors. It is not an official committee work product and may not reflect the consensus opinion of the committee.

Table of Contents

1. Introduction
2. Syntax
3. Semantics

Appendixes

A. Example
B. Rationale for the use of processing instructions
C. Acknowledgements
References

1. Introduction

This document introduces the oasis-schema processing instruction as a mechanism for associating XML documents with schemas written in [RELAX NG] (in either the XML syntax or the compact syntax), [Schematron], or [NVDL], among other schema languages.

The oasis-schema processing instructions are merely hints. They do not signify that the document must be validated, or has been validated, or that validation will augment the infoset with default values or create a PSVI [W3C XML Schema Structure]. Even when the document is to be validated, the validator MAY ignore some or all of the oasis-schema processing instructions.

Note

[RELAX NG], [Schematron], and [NVDL] do not affect the infoset.

The structure of this document is as follows. Section 2, “Syntax” describes the syntax of the oasis-schema processing instructions. Section 3, “Semantics” describes their semantics.

2. Syntax

Schemas can be associated with an XML [XML 1.0] document by using a processing instruction whose target is oasis-schema.

The oasis-schema processing instruction is parsed in the same way as the xml-stylesheet processing instruction defined in the [StyleSheetPI] Recommendation.

Note

Values of pseudo-attributes cannot contain character references or entity references, with the exception of references to predefined entities.

The following grammar is given using the same notation as the grammar in the XML Recommendation [XML 1.0]. PseudoAtt in the grammar is defined in the [StyleSheetPI] Recommendation, while S is defined in the XML Recommendation.

[1] SchemaPI ::= '<?oasis-schema' (S PseudoAtt)* S? '?>'
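The production above can be approximated with regular expressions. The following is a minimal sketch, not a conforming parser: the function name parse_schema_pi is hypothetical, the Name production is reduced to ASCII, and only predefined entity references are expanded (other references are rejected, which is one reading of the note above). The draft does not say how a repeated pseudo-attribute is handled; this sketch keeps the last value.

```python
import re

# XML white space (production S) and a simplified, ASCII-only stand-in for
# the XML Name production. The real Name production admits more characters.
_S = r"[ \t\r\n]"
_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
# A quoted value that contains neither '<' nor its own delimiter.
_VALUE = r"""(?:"[^"<]*"|'[^'<]*')"""

_PSEUDO_ATT = re.compile(rf"{_S}+({_NAME}){_S}*={_S}*({_VALUE})")
# [1] SchemaPI ::= '<?oasis-schema' (S PseudoAtt)* S? '?>'
_SCHEMA_PI = re.compile(
    rf"<\?oasis-schema((?:{_S}+{_NAME}{_S}*={_S}*{_VALUE})*){_S}*\?>"
)

_ENTITY = re.compile(r"&(lt|gt|amp|quot|apos);")
_PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}


def _unescape(raw):
    """Expand predefined entity references; reject any other use of '&'."""
    if "&" in _ENTITY.sub("", raw):
        raise ValueError(f"unsupported reference in {raw!r}")
    return _ENTITY.sub(lambda m: _PREDEFINED[m.group(1)], raw)


def parse_schema_pi(pi_text):
    """Return the pseudo-attributes of one oasis-schema PI as a dict."""
    m = _SCHEMA_PI.fullmatch(pi_text)
    if m is None:
        raise ValueError(f"not an oasis-schema processing instruction: {pi_text!r}")
    attrs = {}
    for name, quoted in _PSEUDO_ATT.findall(m.group(1)):
        # Strip the surrounding quotes before expanding references.
        attrs[name] = _unescape(quoted[1:-1])
    return attrs
```

The function takes the complete text of one processing instruction, from '<?' to '?>', and raises ValueError when the text does not match the production.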

The oasis-schema processing instruction is allowed only in the prolog of an XML document. The syntax of XML constrains where processing instructions are allowed in the prolog. The oasis-schema processing instruction is allowed in the document entity, but is not allowed in the external DTD subset or in parameter entities.
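As an illustration of the prolog restriction, the sketch below collects only those oasis-schema processing instructions that precede the document element. It assumes Python's standard xml.dom.minidom module as the parser; the function name prolog_schema_pis is hypothetical. Processing instructions inside or after the document element are skipped.

```python
from xml.dom import Node, minidom


def prolog_schema_pis(xml_text):
    """Return the data of each oasis-schema PI found in the prolog."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.childNodes:
        # The prolog ends where the document element begins.
        if node is doc.documentElement:
            break
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "oasis-schema"):
            # node.data is the PI content after the target name.
            found.append(node.data)
    return found
```

Each returned string is the content of one processing instruction after its target, which still has to be split into pseudo-attributes.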

The following pseudo-attributes are defined:

  href CDATA #REQUIRED
  type CDATA #IMPLIED
  mode CDATA #IMPLIED
  option CDATA #IMPLIED

Each oasis-schema processing instruction specifies a schema by the pseudo-attribute href. This pseudo-attribute is mandatory. Permissible values of this pseudo-attribute are IRIs [RFC 3987].

An oasis-schema processing instruction may further specify a media type, a mode, and an option by the pseudo-attributes type, mode, and option, respectively. These pseudo-attributes are optional. Permissible values of the pseudo-attribute type are media types. Permissible values of the pseudo-attribute mode are strings of the simple type NCName of [W3C XML Schema Datatypes]. Permissible values of the pseudo-attribute option are arbitrary strings.
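The constraints on the four pseudo-attributes can be checked mechanically once a processing instruction has been split into name/value pairs. The following is a rough sketch under stated assumptions: the function name check_pseudo_attributes is hypothetical, NCName is approximated by an ASCII-only pattern, the media type check looks only at the type/subtype shape, and href is checked for presence only, not for IRI syntax.

```python
import re

# ASCII-only approximation of NCName (no colon). The real NCName type
# admits many non-ASCII characters; this is an assumption of the sketch.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")
# Loose shape check for "type/subtype"; not a full media type grammar.
_MEDIA_TYPE = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*/[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
)


def check_pseudo_attributes(attrs):
    """Return a list of problems found in a dict of pseudo-attributes."""
    problems = []
    if "href" not in attrs:
        problems.append("href is required")
    media = attrs.get("type")
    if media is not None:
        # Ignore any parameters after ';' and test the type/subtype part.
        essence = media.split(";", 1)[0].strip()
        if not _MEDIA_TYPE.fullmatch(essence):
            problems.append(f"type is not a media type: {media!r}")
    mode = attrs.get("mode")
    if mode is not None and not _NCNAME_ASCII.fullmatch(mode):
        problems.append(f"mode is not an NCName: {mode!r}")
    # option may be any string, so there is nothing to check for it.
    return problems
```

An empty list means no problem was detected by these approximate checks; it does not prove conformance.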

3. Semantics

The oasis-schema processing instructions are merely hints. The validator may ignore the oasis-schema processing instructions and use different schemas for validation.

Even when the validator uses the oasis-schema processing instructions, it MAY select some of the modes specified by the oasis-schema processing instructions, possibly controlled by user input. The validator MAY choose some of the schemas for the selected modes, possibly controlled by the pseudo-attribute type, the media type of each schema, and the namespaces appearing in each schema (when the schema is represented by an XML document). Finally, the validator MAY perform validation against the selected schemas. If a selected schema is referenced by an oasis-schema processing instruction having the pseudo-attribute option, its value is an optional input for validation against this schema. If the schema language is Schematron, the validator SHALL use this value as a phase name. If the schema language is RELAX NG or NVDL, the validator SHALL ignore this value.
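One possible shape for this selection procedure is sketched below. All names (SchemaHint, select_hints, validation_inputs, and the language labels "schematron", "relaxng", "nvdl") are hypothetical. Two choices are assumptions of the sketch and are not stated in this document: a hint without a mode is treated as applying to every mode, and a hint without a type is never filtered out by media type. How the schema language is determined is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchemaHint:
    """One oasis-schema processing instruction, already parsed."""
    href: str
    type: Optional[str] = None
    mode: Optional[str] = None
    option: Optional[str] = None


def select_hints(hints, mode=None, accepted_types=None):
    """Filter hints by a chosen mode and a set of accepted media types."""
    selected = []
    for hint in hints:
        # Assumption: a hint with no mode applies to every mode.
        if mode is not None and hint.mode is not None and hint.mode != mode:
            continue
        # Assumption: a hint with no type is kept regardless of the filter.
        if (accepted_types is not None and hint.type is not None
                and hint.type not in accepted_types):
            continue
        selected.append(hint)
    return selected


def validation_inputs(hint, language):
    """Map the option pseudo-attribute to inputs for one validation run."""
    if hint.option is None:
        return {}
    if language == "schematron":
        # Schematron: the option value is used as the phase name.
        return {"phase": hint.option}
    if language in ("relaxng", "nvdl"):
        # RELAX NG and NVDL: the option value is ignored.
        return {}
    # Other languages: left to the schema language; passed through as is.
    return {"option": hint.option}
```

A validator remains free to skip this procedure entirely, since the processing instructions are only hints.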

Note

For schema languages to use oasis-schema processing instructions, they SHOULD define the semantics of the pseudo-attribute option.

A. Example

  <?xml version="1.0"?>
  <?oasis-schema href="foo.rng" type="application/xml" mode="final"?>
  <?oasis-schema href="baz.rnc" type="application/rnc" mode="draft"?>
  <?oasis-schema href="baz.rng" type="application/xml" mode="draft"?>
  <?oasis-schema href="baz.sch" type="application/xml" option="firstPhase" mode="draft"?>
  <bar/>

The user might want to choose the mode "draft" when the above XML document is still a draft. Then, the validator MAY validate this document against baz.rnc (in the RELAX NG compact syntax) and baz.sch (in Schematron) using the phase "firstPhase", but it MAY ignore baz.rng. The user might want to choose the mode "final" when the above XML document is completed. Then, the validator MAY validate this document against foo.rng.

B. Rationale for the use of processing instructions

This document adopts processing instructions rather than attributes. Attributes are considered inappropriate, since the introduction of schema-associating attributes to documents requires changes to schemas for allowing the attributes. Processing instructions have been used by W3C for associating stylesheets with documents and have proven to be simple and useful.

C. Acknowledgements

Rick Jelliffe and George Cristian Bina contributed to this document.

References

Normative

[XML 1.0] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), 2000.

[XML Namespaces] Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. W3C (World Wide Web Consortium), 1999.

[RFC 3987] M. Duerst and M. Suignard. RFC 3987: Internationalized Resource Identifiers (IRIs). IETF (Internet Engineering Task Force). 2005.

[W3C XML Schema Structure] Henry S. Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, editors. XML Schema Part 1: Structures. W3C (World Wide Web Consortium), 2001.

[W3C XML Schema Datatypes] Paul V. Biron and Ashok Malhotra, editors. XML Schema Part 2: Datatypes. W3C (World Wide Web Consortium), 2001.

[RELAX NG] James Clark and Makoto MURATA, editors. RELAX NG Specification. OASIS, 2001.

[ISO/IEC RELAX NG] ISO/IEC 19757-2, Information technology -- Document Schema Definition Language (DSDL) -- Part 2: Regular-grammar-based validation -- RELAX NG. 2003.

[StyleSheetPI] James Clark. Associating Style Sheets with XML documents Version 1.0. W3C (World Wide Web Consortium), 1999.

Non-normative

[Schematron] ISO/IEC FCD 19757-3, Information technology -- Document Schema Definition Language (DSDL) -- Part 3: Rule-based validation -- Schematron. 2005.

[NVDL] ISO/IEC FDIS 19757-4, Information technology -- Document Schema Definition Language (DSDL) -- Part 4: Namespace-based validation dispatching language -- NVDL. 2005.