Comparison of Different Schema Languages
Different schema languages have different strengths and weaknesses. It is sometimes a good
idea to create schemas in different languages for the same XML domain. Whether this is
worthwhile depends on the details of the XML domain and the tools used when working with
the associated instance documents.
Document Type Definition (DTD)
DTD is the original schema language for XML. It has the following advantages:
-
The basic structure is easily understood and authored.
-
There are advanced features that make DTDs surprisingly powerful and expressive.
-
The DTD language is built into XML itself, so it is a necessary technology. One
example of this is the need to use a DTD when defining entities, such as named
character escape sequences.
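As a minimal sketch of the entity point above, the following Python snippet parses a document whose internal DTD subset declares a general entity. The element name `note` and the entity name `company` are made up for illustration, and the standard-library parser is used only because it expands internally declared entities; it does not validate the document against the DTD.

```python
import xml.etree.ElementTree as ET

# A document whose internal DTD subset declares one element and one
# general entity. The names "note" and "company" are illustrative only.
DOCUMENT = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY company "Example Corp">
]>
<note>Prepared by &company;.</note>"""


def expand_entities(xml_text: str) -> str:
    """Parse the document and return the root element's text.

    The parser replaces the &company; reference with the value
    declared in the DTD before the text reaches the caller.
    """
    root = ET.fromstring(xml_text)
    return root.text


if __name__ == "__main__":
    print(expand_entities(DOCUMENT))
```

Without the `<!ENTITY>` declaration the same parse would fail on the undefined reference, which is why the DTD syntax remains necessary even in projects that use another schema language for validation.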
DTD has the following disadvantages:
-
The syntax is not XML, so processing a DTD requires a separate parser.
-
There is only very limited support for namespaces.
-
There is no support for datatypes beyond strings, enumerations, and a few attribute
token types such as ID and NMTOKEN.
-
The structure is document-centric, so there is no inherent support for
protocols.
XML Schema Definition Language (XSDL)
Created and maintained by the W3C, XSDL is perhaps the most
widely used schema language today. It has the following advantages:
-
The syntax is XML.
-
There is extremely good support for datatypes.
-
It is namespace aware.
-
It treats elements and attributes as similarly as possible. This permits rich
structural constraints on groups of attributes.
-
A huge number of tools are available for working with XSDL.
-
It provides a pseudo-OOP approach to defining structure, in that structures can
be defined and reused via a referencing syntax.
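Two of the advantages above, the XML syntax and reuse by reference, can be sketched together. In the snippet below, a small schema defines one named complex type and refers to it from two element declarations. Because the schema is itself XML, a general-purpose XML parser can inspect it. The type and element names (`AddressType`, `billingAddress`, `shippingAddress`) are hypothetical, and the code only reads the schema; it does not perform validation.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C XML Schema vocabulary.
XS = "http://www.w3.org/2001/XMLSchema"

# A schema with one named type that two element declarations reference.
# All names in it are illustrative only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="postcode" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="billingAddress" type="AddressType"/>
  <xs:element name="shippingAddress" type="AddressType"/>
</xs:schema>"""


def elements_using_type(schema_text: str, type_name: str) -> list:
    """Return the names of top-level elements declared with type_name."""
    root = ET.fromstring(schema_text)
    return [
        element.get("name")
        for element in root.findall(f"{{{XS}}}element")
        if element.get("type") == type_name
    ]


if __name__ == "__main__":
    print(elements_using_type(SCHEMA, "AddressType"))
```

The same structure is defined once and used twice, which is the referencing style the last advantage describes.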
XSDL has the following disadvantages:
-
It is structurally complex, making the raw XML difficult to write and difficult to
read.
-
The namespace support is limited. Normally an XSDL schema defines structures
within a single namespace. When multiple namespaces are involved, the schema author
needs to link separate schema documents together (for example, with xs:import).
-
The structure is document-centric, so there is no inherent support for
protocols.
-
The agnostic treatment of elements and attributes is also a weakness. It can tempt the
XML designer into overusing attributes. Strong arguments can be made that document
data should be held entirely in element values, with attributes limited to
metadata.
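The element-versus-attribute argument in the last point can be made concrete with a small sketch. The two documents below carry the same hypothetical book record, once in an attribute-centric style and once in an element-centric style where only the language code stays in an attribute as metadata. The record layout and the helper names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute-centric design: every value, data or metadata, is an attribute.
ATTRIBUTE_CENTRIC = '<book isbn="0000000000" title="Example Title" lang="en"/>'

# Element-centric design: data lives in element values, and only the
# language code (metadata about the record) remains an attribute.
ELEMENT_CENTRIC = """<book lang="en">
  <isbn>0000000000</isbn>
  <title>Example Title</title>
</book>"""


def data_fields(xml_text: str) -> dict:
    """Return the values held in child elements of the root."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def metadata(xml_text: str) -> dict:
    """Return the values held in attributes of the root."""
    return dict(ET.fromstring(xml_text).attrib)


if __name__ == "__main__":
    for label, text in (("attribute-centric", ATTRIBUTE_CENTRIC),
                        ("element-centric", ELEMENT_CENTRIC)):
        print(label, data_fields(text), metadata(text))
```

In the element-centric form, a consumer can tell data from metadata by where the value is stored. In the attribute-centric form, the two are mixed in a single attribute set.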
Skeleton Schema Language
The skeleton schema has the following advantages:
-
The syntax is XML.
-
Schemas are easily written and understood, which is the principal design goal
of the skeleton schema language. This holds especially when an XML design
is based on an element-centric approach. It is fortunate that the element-centric
approach is always the correct approach.
-
There is very good support for namespaces. Since a schema looks like the documents
that it describes, the usage of namespaces in the schema echoes the namespace usage
in the instance documents.
-
There is support for protocols. This is the second most important advantage
of the skeleton schema language.
-
It can support every datatype that XSDL supports, because the same syntax is
reused. Beyond this, there is a set of inherent datatypes that provide very
useful extensions to the core XSDL types.
-
The support for attributes is limited to the support that XML itself provides. This
helps to discourage the XML designer from overusing attributes.
The skeleton schema has the following disadvantages:
-
There is currently NO tool support for skeleton schemas. This will change
over time, but it is extremely unlikely that there will ever be the rich support
currently available for XSDL schemas.
-
The limited support for attributes means that the syntax for defining attributes
is awkward. This is a weakness because many XML domains have been designed with an
attribute-centric approach. It would be nice to communicate these designs to human
readers using skeleton schemas, but the result appears ugly and can give an initially
poor impression of the skeleton schema approach.