Configuration, extensibility and namespaces
I've spent the whole week digging into a pretty large configuration file and its
schema: the one for
Shadowfax (Sfx), which I already introduced in a
previous post. I see a few points that could be improved,
mainly to do with namespaces and extensibility.
Let's recap what namespaces are for. Here's what the
W3C Namespaces in XML specification says in its motivation section:
We envision applications of Extensible Markup Language (XML) where a single
XML document may contain elements and attributes (here referred to as a "markup
vocabulary") that are defined for and used by multiple software modules. One
motivation for this is modularity; if such a markup vocabulary exists which is
well-understood and for which there is useful software available, it is better
to re-use this markup rather than re-invent it.
Such documents, containing multiple markup vocabularies, pose problems of
recognition and collision. Software modules need to be able to recognize the
tags and attributes which they are designed to process, even in the face of
"collisions" occurring when markup intended for some other software package
uses the same element type or attribute name.
So, namespaces should be used when you expect a document to be extended by
aggregating elements from multiple disparate schemas. This motivation should
drive the design of the schema, so that it allows easy extensibility while
keeping the format XML-friendly. The following concrete
points could be improved:
-
Element prefixing: this is just a fragment of the file as it is now:
Clearly, a prefix is not necessary here. The Sfx namespace is the
only one used in the whole document, so it could be declared as the default
namespace on the root element, and XML's namespace scoping rules
would apply it to every unprefixed descendant element. Therefore, the fragment
above is exactly equivalent, from the point of view of XML, to the following one:
Namespaces should also be used consistently across the schemas for
<referenceArchitecture>, <businessActionsDefinition>,
<eventConfiguration>, etc.
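The equivalence claimed in this point can be checked with any namespace-aware parser. The sketch below is only an illustration: the namespace URI and the element names are made up and are not taken from the real Sfx configuration. It parses a prefixed document and a default-namespace document and compares the namespace-qualified names that the parser reports.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, for illustration only.
NS = "urn:example:sfx-config"

# Every element carries an explicit prefix.
PREFIXED = f"""
<sfx:config xmlns:sfx="{NS}">
  <sfx:pipeline>
    <sfx:handler/>
  </sfx:pipeline>
</sfx:config>
"""

# The same namespace declared once as the default on the root element.
DEFAULTED = f"""
<config xmlns="{NS}">
  <pipeline>
    <handler/>
  </pipeline>
</config>
"""


def expanded_names(xml_text):
    """Return every element's namespace-qualified name in document order."""
    root = ET.fromstring(xml_text.strip())
    return [element.tag for element in root.iter()]


names_prefixed = expanded_names(PREFIXED)
names_defaulted = expanded_names(DEFAULTED)

# Both documents yield identical {namespace}local names.
print(names_prefixed == names_defaulted)
```

The prefix is only a local shorthand; what a namespace-aware consumer sees is the pair of namespace URI and local name, which is the same in both documents.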
-
Attributes with namespace: attributes shouldn't be assigned namespaces. It's
common practice (and the XML Schema default, attributeFormDefault="unqualified")
to leave attributes without a namespace.
This also makes for more readable files. The Sfx schema changes this default
by setting
attributeFormDefault="qualified".
What this means is that all attributes in the instance document (the actual
configuration file) must be prefixed, because they are now part of the
targetNamespace and a default namespace declaration never applies to attributes:
...
This is pretty cumbersome to read and author, and doesn't really add value
to the extensibility or usability of the schema and the config file. It may be
a valid (and even necessary) approach for a highly composed document
such as a SOAP message, where every WS-* spec defines its own attributes and
elements, and almost everything is prefixed. But I wonder if it is actually
necessary in a config file... Leaving
attributeFormDefault at its default value (or omitting it) in the schema gives
you the following valid instance:
...
I believe this is far better and more familiar. Extensibility isn't hurt, as
xs:anyAttribute
can still be used, but now prefixes are only required on
extension attributes, not on the built-in ones, which are by far the most
commonly used. This brings us to the next point.
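To make the difference between qualified and unqualified attributes concrete, here is a small sketch. The namespace URI, element name and attribute names are hypothetical. It shows how a namespace-aware parser reports attribute names in each style: a prefixed attribute belongs to the namespace, while an unprefixed attribute belongs to no namespace at all, even when a default namespace is in scope.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and names, for illustration only.
NS = "urn:example:sfx-config"

# Style forced by attributeFormDefault="qualified": attributes are prefixed.
QUALIFIED = f'<sfx:handler xmlns:sfx="{NS}" sfx:name="audit" sfx:enabled="true"/>'

# Style allowed by the default ("unqualified"): plain attribute names.
UNQUALIFIED = f'<handler xmlns="{NS}" name="audit" enabled="true"/>'


def attribute_names(xml_text):
    """Return the sorted attribute names as the parser reports them."""
    return sorted(ET.fromstring(xml_text).attrib)


qualified_names = attribute_names(QUALIFIED)
unqualified_names = attribute_names(UNQUALIFIED)

print(qualified_names)
print(unqualified_names)
```

In the second document the element is still in the namespace, but its attributes are reported with bare names, which is why the unqualified form stays readable without losing anything for the element itself.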
-
Schema and configuration extensibility: Sfx is meant to be flexible and
to support a wide range of applications. With this idea in mind, almost everything
is configurable... to an extent. One of the key pieces of this
architecture (and of any other SOA-like one) is a platform of common
services on which your own services (let's call them business actions, or BAs,
as Sfx does) run. I envision that some BAs will need additional
configuration in order to do their work. I've worked on such an
architecture, and BA developers started building custom configuration
mechanisms for their libraries because the infrastructure didn't provide one,
which led to serious maintenance and deployment problems.
So, the schema for BA configuration should allow for open content in order to
accommodate extensibility elements/attributes.
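As a rough idea of what open content buys a BA developer, the sketch below reads a business action element and separates the built-in children from extension children that live in a foreign namespace. Everything in it is an assumption made for illustration: the two namespace URIs, the element names (businessAction, timeout, currency, rounding) and the split_children helper are not part of Sfx.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one for the core schema, one owned by a BA author.
CORE_NS = "urn:example:sfx-config"
EXT_NS = "urn:example:billing-action"

CONFIG = f"""
<businessAction xmlns="{CORE_NS}" xmlns:bill="{EXT_NS}" name="CreateInvoice">
  <timeout>30</timeout>
  <bill:currency>EUR</bill:currency>
  <bill:rounding mode="half-even"/>
</businessAction>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def split_children(action):
    """Separate core-schema children from extension children."""
    core, extensions = [], []
    for child in action:
        namespace, _ = split_tag(child.tag)
        (core if namespace == CORE_NS else extensions).append(child)
    return core, extensions


action = ET.fromstring(CONFIG.strip())
core, extensions = split_children(action)

core_names = [split_tag(child.tag)[1] for child in core]
extension_names = [split_tag(child.tag)[1] for child in extensions]
print(core_names, extension_names)
```

With a schema that permits such wildcard content, the infrastructure can validate and process what it knows and hand the remaining elements to the BA that owns them, instead of each BA inventing its own configuration file.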
-
Configuration versioning: given the current target namespace for the
configuration schema (
http://www.microsoft.com/practices/referencearchitecture/services/03-08-2004/ReferenceArchitectureSection.xsd)
it's only natural to infer that versioning will be handled through namespace
changes, according to the release date. There's a
lot of discussion in the community about schema versioning, but most
agree that versioning through namespace changes is not recommended, since a
new namespace gives every element a different name as far as namespace-aware
tools are concerned.
This document explains the available options in a short and concise manner.
To make the future migration path as easy as possible for developers (and for
an optional upgrade tool) when the configuration is upgraded, my suggestion
is to use the optional XSD version attribute in the schema,
together with a new schemaVersion attribute in the configuration file. The
schema would look like the following:
...etc...
While the configuration would include the appropriate version attribute:
Now when v2 comes out, a tool can detect the version in the configuration file
and perform any relevant upgrade (for example through an XSLT transformation
that adapts elements to the new format).
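A minimal sketch of such an upgrade tool follows. It assumes the schemaVersion attribute sits on the root element, treats a missing attribute as version "1.0", and uses a plain Python function in place of the XSLT step, because the standard library has no XSLT processor. The version numbers, the element names and the rename itself are invented for the example, and namespaces are left out to keep it short.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = "2.0"  # hypothetical latest schema version


def rename_handler_to_action(root):
    """Hypothetical 1.0 -> 2.0 change: <handler> elements become <action>."""
    for element in list(root.iter("handler")):
        element.tag = "action"


# Maps a version to (next version, function that performs the upgrade step).
UPGRADES = {"1.0": ("2.0", rename_handler_to_action)}


def upgrade(root):
    """Apply upgrade steps in order until the document is at CURRENT_VERSION."""
    version = root.get("schemaVersion", "1.0")  # assumption: missing means 1.0
    while version != CURRENT_VERSION:
        if version not in UPGRADES:
            raise ValueError(f"no upgrade path from schemaVersion {version!r}")
        version, step = UPGRADES[version]
        step(root)
    root.set("schemaVersion", version)
    return root


OLD_CONFIG = '<configuration schemaVersion="1.0"><handler name="audit"/></configuration>'

root = upgrade(ET.fromstring(OLD_CONFIG))
upgraded_version = root.get("schemaVersion")
upgraded_tags = [child.tag for child in root]
print(upgraded_version, upgraded_tags)
```

Because the namespace never changes between versions, the tool only needs to read one attribute to decide which steps to run, and documents that are already current pass through untouched.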
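Finally, special care should be taken to specify the
type attribute on
all attribute declarations; an attribute declared without a type defaults to
xs:anySimpleType, so its value is effectively not validated.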
Of course, configuration is just the tip of the iceberg of such a comprehensive
product. Shadowfax is a very interesting architecture to build applications on
top of. MS is very open to feedback from the community, so I expect it to become
more and more polished and sleek over time. These are my 2 cents regarding
its configuration file.
If I misunderstood some points in the schema design, I'd be glad to hear from
the Sfx dev guys!