We can use xml-validation-service to validate a number of things about an XML document by using a list of MessageValidator implementations. Multiple validation steps can be done in a single xml-validation-service instance; so you could
- Check that a message matches a given schema
- Check that certain xpaths match given criteria that isn’t defined in the schema.
The standard types of MessageValidator are:
Type | Description |
---|---|
xml-basic-validator | Validates that the input document is in fact XML |
xml-schema-validator | Validates that the input document is matches a given schema |
xml-rule-validator | Validates that the input document matches various criteria |
Validating against a schema
Validating an XML schema is quite simple; you just need to provide a schema
location which could be on the local filesystem, hosted remotely and accessible via standard URL handling mechanisms (e.g. http). Optionally you can specify a schema-metadata-key
; which if present in the message, overrides any configured schema that will be used to validate the message.
<xml-validation-service>
<validators>
<xml-schema-validator>
<schema>http://myhost.com/schema.xsd</schema>
</xml-schema-validator>
</validators>
</xml-validation-service>
XML Rule Validation
xml-rule-validator allows us to validate arbitrary elements within the XML against some configured criteria. This is useful if there is a canonical data model, but we know that some parties cannot handle certain values in the XML. Each configured xml-rule-validator is made up of a list of ValidationStage instances (which is responsible for iterating over the document) and a list of ContentValidation instances associated with each ValidationStage.
Each ValidationStage contains a iteration-xpath
and element-xpath
which will be used to iterate over the XML document. For each Node
returned by the iteration-xpath
, we execute element-xpath
on that node and apply all the configured ContentValidation instances to the text returned. If validation fails then an exception will be thrown and error handling is triggered.
The basic types of validator are xml-content-not-null, xml-content-is-null, xml-content-in-list, xml-content-not-in-list and xml-content-regexp:
Rule Validator | Description |
---|---|
xml-content-not-null | Asserts that the trimmed content value is not blank |
xml-content-is-null | Asserts that the trimmed content value is blank |
xml-content-in-list | Asserts that the content is in the list of configured items |
xml-content-not-in-list | Asserts that the content is not in the list of configured items |
xml-content-regexp | Asserts that the content matches the configured regular expression |
Example
Given the example made up document below; we want to apply the following validation rules:
- Cronos and Rhea had children
- They all have names
- Their names must only have alphabetic characters in them.
- The names of Cronos and Rhea’s children must be one of Zeus, Demeter, Hades, Hera
- Zeus had quite a few children
- They all have names
- Zeus’s children aren’t called Helios, Selene or Eos
- Poor old Hades didn’t have any children.
<document>
<parent id="Cronus">
<child mother="Rhea"><name>Zeus</name></child>
<child mother="Rhea"><name>Demeter</name></child>
<child mother="Rhea"><name>Hades</name></child>
<child mother="Rhea"><name>Hera</name></child>
</parent>
<parent id="Coeus">
<child mother="Phoebe"><name>Asteria</name></child>
<child mother="Phoebe"><name>Leto</name></child>
</parent>
<parent id="Hyperion">
<child mother="Theia"><name>Helios</name></child>
<child mother="Theia"><name>Selene</name></child>
<child mother="Theia"><name>Eos</name></child>
</parent>
<parent id="Zeus">
<child mother="Leto"><name>Apollo</name></child>
<child mother="Leto"><name>Artemis</name></child>
<child mother="Hera"><name>Hebe</name></child>
<child mother="Hera"><name>Ares</name></child>
<child mother="Hera"><name>Hephaestus</name></child>
<child mother="Dione"><name>Aphrodite</name></child>
<child mother="Demeter"><name>Persephone</name></child>
</parent>
<parent id="Hades">
</parent>
</document>
Our configuration would be :
<xml-validation-service>
<validators>
<xml-rule-validator>
<validation-stage>
<iteration-xpath>/document/parent[@name='Cronos']</iteration-xpath>
<element-xpath>child[@mother='Rhea']/name</element-xpath>
<rules>
<xml-content-not-null/>
<xml-content-regexp>
<pattern>[a-zA-Z]+</pattern>
</xml-content-regex>
<xml-content-in-list>
<list-entry>Zeus</list-entry>
<list-entry>Demeter</list-entry>
<list-entry>Hades</list-entry>
<list-entry>Hera</list-entry>
</xml-content-in-list>
</rules>
</validation-stage>
<validation-stage>
<iteration-xpath>/document/parent[@id='Zeus']</iteration-xpath>
<element-xpath>child/name</element-xpath>
<rules>
<xml-content-not-null/>
<xml-content-not-in-list>
<list-entry>Helios</list-entry>
<list-entry>Selene</list-entry>
<list-entry>Eos</list-entry>
</xml-content-not-in-list>
</rules>
</validation-stage>
<validation-stage>
<iteration-xpath>/document/parent[@id='Hades']</iteration-xpath>
<element-xpath>child/name</element-xpath>
<fail-on-iterate-failure>true</fail-on-iterate-failure>
<rules>
<xml-content-is-null/>
</rules>
</validation-stage>
</xml-rule-validator>
</validators>
</xml-validation-service>
iteration-xpath
returns no nodes will not cause a failure unless fail-on-iterate-failure=true
.If our document was modified to be
<document>
<parent id="Poseidon">
</parent>
</document>
Then validation would fail, but only because /document/parent[@id='Hades']
does not exist in the document; the other ValidationStages would still pass.