
Xml Schema Lightener

making schemas simpler


FAQ


What is it?

The Xml Schema Lightener is a tool that will help you prune unneeded elements, attributes and data types from your Xml Schema. It creates a simplified Xml Schema that conforms to the original; the result is sometimes referred to as a "LiteBOD", "Schema subset", "constrained Schema", or "pruned Schema". In a literal sense, the Xml Schema Lightener is an XSLT stylesheet that is applied to your Schema.


Why would I need it?

Consider these scenarios.

In any of these scenarios, you want the simplified Xml Schema to be consistent with the original. Specifically, you want any Xml instance that is valid against the subset to also validate against the original. But how can this be done? Previously, there were only two principal solutions. First, you could edit the original Schema by hand. This works but is tedious and risks hand-editing errors. Second, you could implement Schematron technology. This also works, but may not always be possible (see "How does this relate to other solutions?").

The Xml Schema Lightener will help you create a subset / LiteBOD / pruned Schema from any original Schema. The image below illustrates how the tool fits into an Xml management context.



How does it work?

An XSLT stylesheet is applied to the original Xml Schema. The stylesheet takes as input an Xml instance that is valid against that Schema and that contains the subset desired in the result. The XSLT then prunes the Schema, removing elements, attributes, and data types that are not needed to validate the given Xml instance.
Include in the instance every node that you want to keep, and the Lightener will create a Schema that validates only the nodes in that instance.
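
The pruning idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Lightener itself (which is an XSLT stylesheet): the function names `lighten`, `local` and `referenced_types` and the sample `SCHEMA` and `INSTANCE` are hypothetical, elements are matched by local name only, and attributes, groups and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD)  # keep the "xsd" prefix when serializing


def local(name):
    """Strip a '{namespace}' or 'prefix:' qualifier from a name."""
    return name.rsplit("}", 1)[-1].rsplit(":", 1)[-1]


def referenced_types(schema):
    """Collect the local names of all types still referenced in the schema."""
    names = set()
    for node in schema.iter():
        for attr in ("type", "base", "itemType", "memberTypes"):
            for value in node.get(attr, "").split():
                names.add(local(value))
    return names


def lighten(schema_xml, instance_xml):
    """Return a pruned copy of the schema (simplified sketch, see assumptions)."""
    schema = ET.fromstring(schema_xml)
    instance = ET.fromstring(instance_xml)
    used = {local(node.tag) for node in instance.iter()}

    # Pass 1: drop element declarations (global or local) whose name
    # never appears in the sample instance.
    for parent in list(schema.iter()):
        for child in list(parent):
            if child.tag == f"{{{XSD}}}element":
                name = child.get("name") or child.get("ref") or ""
                if local(name) not in used:
                    parent.remove(child)

    # Pass 2: drop global types that nothing references any more. Repeat,
    # because removing one type can leave another one unreferenced.
    removed = True
    while removed:
        removed = False
        keep = referenced_types(schema)
        for kind in ("complexType", "simpleType"):
            for decl in schema.findall(f"{{{XSD}}}{kind}"):
                if decl.get("name") not in keep:
                    schema.remove(decl)
                    removed = True
    return schema


# Hypothetical sample data, for illustration only.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Order" type="OrderType"/>
  <xsd:element name="Invoice" type="InvoiceType"/>
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Id" type="xsd:string"/>
      <xsd:element name="Note" type="NoteType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="InvoiceType">
    <xsd:sequence>
      <xsd:element name="Total" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:simpleType name="NoteType">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
</xsd:schema>
"""
INSTANCE = "<Order><Id>42</Id></Order>"

lite = lighten(SCHEMA, INSTANCE)
print(ET.tostring(lite, encoding="unicode"))
```

In this sample only the Order element, its Id child and OrderType should survive, because the instance never mentions Invoice, Total or Note.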

The image below illustrates how it works.



How much of a benefit do I get?

The size of the reduction depends on the Xml instance, which contains the Xml nodes to be included in the result. The smaller the Xml instance, the smaller the resulting Schema, because it needs fewer elements, attributes and data types. Testing was done on the release libraries of OAGIS version 9.1 and HR-XML version 2.5.

To illustrate how much benefit could be gained, the latest public release of the HR-XML library of Schemas was used for analysis. A total of 128 unique Xml instances were taken from that release. The Schemas that govern them were run through the Schema Lightener, and each resulting Schema was compared with its original to measure the reduction in components.

The results:

  • Number of original Schemas: 128
  • Number of result "lite" Schemas created: 128
  • Percent of result Schemas passing validity: 100%
  • Percent of Xml instances that validate against result "lite" Schemas: 100%
  • Average reduction in number of global elements: 75.3%
  • Average reduction in number of global simpleTypes: 67.3%
  • Average reduction in number of global complexTypes: 62.6%
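
For readers who want to measure their own Schemas in the same way, here is a minimal Python sketch of how such reduction figures could be computed. It assumes that "average reduction" means the mean of the per-Schema percentage reductions, which the results above do not state explicitly. The function names `count_globals` and `average_reduction` and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
KINDS = ("element", "simpleType", "complexType")


def count_globals(schema_xml):
    """Count the global (top-level) components of each kind in a schema."""
    root = ET.fromstring(schema_xml)
    return {kind: len(root.findall(f"{{{XSD}}}{kind}")) for kind in KINDS}


def average_reduction(pairs):
    """Mean percentage reduction per kind over (original, lite) schema pairs.

    Assumption: schemas that have no component of a given kind are left
    out of the average for that kind.
    """
    reductions = {kind: [] for kind in KINDS}
    for original_xml, lite_xml in pairs:
        before = count_globals(original_xml)
        after = count_globals(lite_xml)
        for kind in KINDS:
            if before[kind]:
                pct = 100.0 * (before[kind] - after[kind]) / before[kind]
                reductions[kind].append(pct)
    return {k: sum(v) / len(v) for k, v in reductions.items() if v}


# Hypothetical sample pair: 4 global elements reduced to 1,
# 2 global complexTypes reduced to 1, no global simpleTypes.
ORIGINAL = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="A" type="T1"/>
  <xsd:element name="B" type="T2"/>
  <xsd:element name="C" type="T2"/>
  <xsd:element name="D" type="T2"/>
  <xsd:complexType name="T1"/>
  <xsd:complexType name="T2"/>
</xsd:schema>
"""
LITE = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="A" type="T1"/>
  <xsd:complexType name="T1"/>
</xsd:schema>
"""

print(average_reduction([(ORIGINAL, LITE)]))
```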

Click to see spreadsheet data detail


How does this relate to other solutions?

The Lightener is meant to complement other methods of creating constrained Schemas. These include software tools such as Hypermodel and Gefeg, as well as the Schematron specification.

Schematron is a technology that enables additional constraints to be added on top of an Xml Schema. It is a more robust solution to the scenarios described in this FAQ, and xmlHelpline supports the Schematron effort.

The Lightener is intended for contexts where Schematron is not possible. For whatever reason, there may be technology constraints that prevent full implementation of Schematron.

The Lightener is also intended for the use case where a resulting Xml Schema is a requirement. Validating Schematron constraints requires additional technology at validation time. The Lightener, while it uses XSLT to create the subset Schema, does not require any new technology at validation time.


What are limitations and known issues?

The Xml Schema Lightener ...

  • removes unused elements that do not appear in the Xml instance given. However, it removes only globally scoped data types, so the maximum benefit occurs with globally scoped components.
  • removes unused types contained in an xsd:union. However, if a type is a union of another unioned type (nested unions), it will not be removed.
  • does not remove unused xsd:group and xsd:attributeGroup components. This may be incorporated into a future version.
  • removes unused nested derived types (i.e. xsd:extension and xsd:restriction).
  • does not work on dependent schemas in the same namespace that are included (xsd:include). So it is best used on a "stand alone" version of your schema rather than nested xsd files.
  • works with Schemas that contain imports (xsd:import) of components in other namespaces. Applying the Lightener to each schema file will create a valid result.
  • has not yet been tested on these Xml Schema features: redefine, abstract types.
  • will remove the vast majority of unneeded data types. However, an absolute minimum of types requires a more robust solution. It creates lighter schemas, which is why it is called a schema "Lightener".
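
Before running the tool, it may help to check whether a Schema uses any of the features listed above. The following Python sketch is a hypothetical pre-flight check, not part of the Lightener: the function name `preflight`, the finding codes and the messages are assumptions made for illustration, and only the features named in the list are detected.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"


def q(tag):
    """Qualify a tag name with the Xml Schema namespace."""
    return f"{{{XSD}}}{tag}"


def preflight(schema_xml):
    """Return (code, message) pairs for features named in the limitations."""
    root = ET.fromstring(schema_xml)
    findings = []

    if root.find(q("include")) is not None:
        findings.append(("include", "xsd:include found; use a stand-alone schema"))
    if root.find(q("redefine")) is not None:
        findings.append(("redefine", "xsd:redefine found; not yet tested"))
    if any(node.get("abstract") in ("true", "1") for node in root.iter()):
        findings.append(("abstract", "abstract component found; not yet tested"))
    for kind in ("group", "attributeGroup"):
        count = len(root.findall(q(kind)))
        if count:
            findings.append((kind, f"{count} global xsd:{kind}; will not be pruned"))

    # Nested unions: a union whose members include another union type,
    # either by name (memberTypes) or as an inline anonymous simpleType.
    simple_types = root.findall(q("simpleType"))
    union_names = {
        st.get("name") for st in simple_types if st.find(q("union")) is not None
    }
    for st in simple_types:
        union = st.find(q("union"))
        if union is None:
            continue
        members = {m.rsplit(":", 1)[-1] for m in union.get("memberTypes", "").split()}
        inline = union.find(f"{q('simpleType')}/{q('union')}") is not None
        if inline or members & union_names:
            findings.append(("nested-union", f"{st.get('name')} is a nested union"))
    return findings


# Hypothetical sample schema, for illustration only.
SAMPLE = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="common.xsd"/>
  <xsd:group name="G"><xsd:sequence/></xsd:group>
  <xsd:simpleType name="Inner">
    <xsd:union memberTypes="xsd:int xsd:string"/>
  </xsd:simpleType>
  <xsd:simpleType name="Outer">
    <xsd:union memberTypes="Inner xsd:date"/>
  </xsd:simpleType>
</xsd:schema>
"""

for code, message in preflight(SAMPLE):
    print(code, "-", message)
```
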

How can I get access to it?

Email Paul Kiel and request access.


Why did you create this?

Over the years, I have had many requests for this functionality from clients. I have also had requests for this from standards Consortia members who want to simplify Schemas that a Consortium releases as a standard. I finally got a moment to devote to its completion. I am also currently working (spare time permitting) on a more robust version of this tool that will make the absolute maximum reduction in elements, attributes and types. But this version gets you a pretty good return.