
XML Schema

Neeraj Singh
October 2009

© 2008 MindTree Consulting


Agenda

XML Validation
Introduction to XML Schema
Examples / Demo

Slide 2
XML Validation



An Introduction to XML Validation

One of the important innovations of XML is the ability to place
preconditions on the data the programs read, and to do this in a
simple declarative way.

XML allows you to say
- that every Order element must contain exactly one Customer element,
- that each Customer element must have an id attribute that contains an
  XML name token,
- that every ShipTo element must contain one or more Streets, one City,
  one State, and one Zip, and so forth.

Checking an XML document against this list of conditions is called
validation. Validation is an optional step but an important one.
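
As a rough illustration, the three conditions listed above could be written in W3C XML Schema roughly as follows. This is only a sketch: the slides do not say where ShipTo sits or what Customer contains, so nesting ShipTo inside Order, leaving Customer empty, and typing the address parts as strings are all assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <!-- minOccurs and maxOccurs default to 1: exactly one Customer -->
        <xs:element name="Customer">
          <xs:complexType>
            <!-- id must be present and must be an XML name token -->
            <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
          </xs:complexType>
        </xs:element>
        <!-- Assumption: ShipTo is a child of Order -->
        <xs:element name="ShipTo">
          <xs:complexType>
            <xs:sequence>
              <!-- one or more Street elements -->
              <xs:element name="Street" type="xs:string" maxOccurs="unbounded"/>
              <xs:element name="City" type="xs:string"/>
              <xs:element name="State" type="xs:string"/>
              <xs:element name="Zip" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```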

Slide 4
Validation

There are many reasons and opportunities to validate an XML document:
- When we receive one, before importing data into a legacy system
- When we have produced or hand-edited one
- To test the output of an application, etc.

Validation as “firewall”:
- to serve as actual firewalls when we receive documents from the
  external world (as is commonly the case with Web Services and other
  XML communications),
- to provide check points when we design processes as pipelines of
  transformations.

Validation can take place at several levels:
- Structural validation
- Data validation

Slide 5
Schema Languages

There is more than one language in which you can express such
validation conditions. Generically, these are called schema
languages, and the documents that list the constraints are called
schemas.
Different schema languages have different strengths and
weaknesses.
The document type definition (DTD) is the only schema language
built into most XML parsers and endorsed as a standard part of XML.
The W3C XML Schema Language (schemas for short, though it’s
hardly the only schema language) addresses several limitations of
DTDs.
Many other schema languages have been invented that can easily
be integrated with your systems.

Slide 6
XML Schema



XML Schema Introduction

W3C XML Schema (Schema) is an XML-based technology that is
considered a replacement for DTDs. Just like DTDs, schemas are
used for defining the constraints of an XML document. But unlike
DTDs, they provide strong data typing and support for namespaces,
and since they are based on XML, they are also extensible.

Advantages of XML Schema over DTD:
- Schemas are written in XML instance document syntax, using tags,
  elements, and attributes.
- Schemas are fully namespace aware.
- Schemas can assign data types like integer and date to elements, and
  validate documents not only based on the element structure but also
  on the contents of the elements.

Slide 8
Schema definition

A schema is defined in a separate file and generally stored with the
.xsd extension.
Every schema definition has a schema root element that belongs to
the http://www.w3.org/2001/XMLSchema namespace. The schema
element can also contain optional attributes.
For example, the following declaration indicates that the elements
used in the schema come from the http://www.w3.org/2001/XMLSchema
namespace:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Other definitions will come here. -->
</xs:schema>

Slide 9
Schema Linking when document root element is from null namespace

Let's start with our first document. It must contain only a "root"
element, and this element can contain only text. The element is
from the null namespace. A valid document:
<root xmlns="">aaa</root>
If you want to validate this document with XML Schema, you have
to associate a Schema document with it. If the root element is
from the null namespace, you use the "noNamespaceSchemaLocation"
attribute:
<root xsi:noNamespaceSchemaLocation="correct_0.xsd" xmlns=""
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">test</root>
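
The slides do not show the schema file itself. A minimal sketch of what a file such as correct_0.xsd might contain for this document is given below; the content is an assumption, not the actual attached file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical content for correct_0.xsd; the real file is not shown in the slides -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- No targetNamespace: "root" is declared in the null namespace and holds text only -->
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```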

Slide 10
Schema Linking when document root element from some particular
namespace

Now, let's take the same document as in the previous example, but the
"root" element must be from some concrete namespace, let's say
"http://foo". A valid document:
<root xmlns="http://foo">aaa</root>
If the root element is from a particular namespace, you associate
the Schema using the "schemaLocation" attribute. The first part of
this attribute's value is the target namespace, the second is the
URL of the Schema file:
<f:root xsi:schemaLocation="http://foo correct_0.xsd"
        xmlns:f="http://foo"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">test</f:root>
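
For the namespaced case, the schema has to name the same namespace in its targetNamespace attribute. The following is a hedged sketch of such a schema; the file content is an assumption and is not taken from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema for the "http://foo" example; not taken from the slides -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://foo">
  <!-- Global declarations belong to the target namespace -->
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```

The first token of xsi:schemaLocation in the instance document ("http://foo") must match this targetNamespace.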

Slide 11
Examples / Demo

01_FirstXMLSchema.xsd
Writing your first XML Schema and a valid XML file based on it. This
also demonstrates how to link an XML file with an XML schema.

02_FirstNameSpace.xsd
This example demonstrates the use of namespaces: if you have an XML
document that belongs to a certain namespace, how to connect it to an
XML Schema.

Slide 12
Schema elements

A schema file contains definitions for elements and attributes, as
well as data types for elements and attributes. It is also used to
define the structure or the content model of an XML document.
Elements in a schema file can be classified as either simple or complex.
Schema elements: Simple type
A simple type element is an element that cannot contain any attributes
or child elements; it can only contain the data type specified in its
declaration. The syntax for defining a simple element is:
<xs:element name="ELEMENT_NAME" type="DATA_TYPE" default/fixed="VALUE" />
where DATA_TYPE is one of the built-in schema data types (or a
user-defined simple type) and default/fixed stands for either the
default attribute or the fixed attribute.

Slide 13
Schema elements: Simple type Contd…

You can also specify default or fixed values for an element. You do
this with either the default or fixed attribute and specify a value
for the attribute. Note: Specifying a fixed or default attribute is
optional.
An example of a simple type element is:
<xs:element name="Author" type="xs:string" default="Whizlabs"/>
All attributes are simple types, so they are defined in the same
way that simple elements are defined. For example:
<xs:attribute name="title" type="xs:string" />
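
To show the difference between default and fixed in context, here is a small sketch. The Book wrapper and the Currency element are hypothetical names added for illustration; only Author and title come from the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "Book" and "Currency" are hypothetical names used only for this sketch -->
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <!-- default: an empty <Author/> is treated as if it contained "Whizlabs" -->
        <xs:element name="Author" type="xs:string" default="Whizlabs"/>
        <!-- fixed: if <Currency> has content, it must be exactly "USD" -->
        <xs:element name="Currency" type="xs:string" fixed="USD"/>
      </xs:sequence>
      <!-- attribute declarations come after the content model -->
      <xs:attribute name="title" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```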

Slide 14
Schema data types

All data types in schema inherit from anyType. This includes both
simple and complex data types. You can further classify simple types
into built-in primitive types and built-in derived types.

Built-in datatype hierarchy
[Figure: the complete built-in datatype hierarchy diagram from the XML
Schema Datatypes Recommendation. Legend: ur types, built-in primitive
types, built-in derived types, complex types; derived by restriction,
derived by list, derived by extension or restriction.]

Slide 15
Schema elements: Complex types

Complex types are elements that either:
- Contain other elements
- Contain attributes
- Are empty (empty elements)
- Contain text

To define a complex type in a schema, use a complexType element.
You can specify the order of occurrence and the number of times an
element can occur (cardinality) by using the order and occurrence
indicators, respectively.

For example:
<xs:element name="Book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Author" type="xs:string" maxOccurs="4"/>
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="Price" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

In this example, the order indicator is xs:sequence, and the occurrence
indicator is the maxOccurs attribute on the Author element.
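
A document that should validate against the Book declaration above might look like the following sketch. The values are invented for illustration; the element order and the one-to-four Author range follow the schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sample values are invented for illustration -->
<Book>
  <Name>XML Basics</Name>
  <!-- Author may appear one to four times, directly after Name -->
  <Author>First Author</Author>
  <Author>Second Author</Author>
  <ID>B-001</ID>
  <Price>29.99</Price>
</Book>
```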

Slide 16
Schema elements: Complex types (Mixed content)

W3C XML Schema supports mixed content through the mixed attribute in
the xs:complexType elements. Consider:
<xs:element name="book">
  <xs:complexType mixed="true">
    <xs:all>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string"/>
    </xs:all>
    <xs:attribute name="isbn" type="xs:string"/>
  </xs:complexType>
</xs:element>

It will validate an XML element such as:
<book isbn="0836217462">
  Funny book by
  <author>Charles M. Schulz</author>.
  Its title (<title>Being a Dog Is a Full-Time Job</title>) says it all!
</book>

Slide 17
Examples / Demo

07_ComplexType01.xsd
Your first complex type. An element can contain a mixture of elements.
Now, we want the element "root" to contain elements "aaa", "bbb", and
"ccc" in any order. We will use the "all" element.
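
The attached file is not reproduced in the slides; a sketch of how such a schema could look is shown below. Typing the three children as xs:string is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only; the child element types are assumed to be xs:string -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <!-- xs:all: aaa, bbb and ccc must each appear once, in any order -->
      <xs:all>
        <xs:element name="aaa" type="xs:string"/>
        <xs:element name="bbb" type="xs:string"/>
        <xs:element name="ccc" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```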

11_EmptyElementUsingAnyType.xsd
Empty element. We want the root element to be named "AAA", to be
from the null namespace, and to be empty. The empty element is defined
as a "complexType" with a "complexContent" which is a restriction of
"anyType", but without any elements.

Slide 18
Occurrence indicators

Occurrence indicators specify the number of times an element can
occur in an XML document. You specify them with the minOccurs
and maxOccurs attributes of the element in the element definition.
As the names suggest, minOccurs specifies the minimum number of
times an element can occur in an XML document while maxOccurs
specifies the maximum number of times the element can occur.
It is possible to specify that an element might occur any number of
times in an XML document by setting the maxOccurs value to unbounded.
The default value for both minOccurs and maxOccurs is 1, which means
that by default an element must appear exactly one time. Attributes do
not take minOccurs or maxOccurs; whether an attribute is required is
controlled by its use attribute, which defaults to optional.
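
The common combinations of minOccurs and maxOccurs can be sketched as follows. The Library element and its children are hypothetical names chosen only to illustrate the cardinalities.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- All element names here are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Library">
    <xs:complexType>
      <xs:sequence>
        <!-- defaults apply: exactly one Name -->
        <xs:element name="Name" type="xs:string"/>
        <!-- optional: zero or one Motto -->
        <xs:element name="Motto" type="xs:string" minOccurs="0"/>
        <!-- one or more Book elements (minOccurs defaults to 1) -->
        <xs:element name="Book" type="xs:string" maxOccurs="unbounded"/>
        <!-- between two and five Librarian elements -->
        <xs:element name="Librarian" type="xs:string" minOccurs="2" maxOccurs="5"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```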

Slide 19
Order indicators

Order indicators define the order or sequence in which elements
can occur in an XML document. The three order indicators are:
- All (xs:all): the defined elements can appear in any order, and each
  can occur at most once. In XML Schema 1.0, maxOccurs for an element
  inside xs:all cannot exceed 1, and minOccurs can only be 0 or 1.
- Sequence (xs:sequence): the elements must appear in the order
  specified.
- Choice (xs:choice): exactly one of the elements specified must appear
  in the XML document.

Slide 20
Example: Occurrence and order indicators

<xs:element name="Book">
  <xs:complexType>
    <xs:all>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="Authors" type="authorType"/>
      <xs:element name="Price" type="priceType"/>
    </xs:all>
  </xs:complexType>
</xs:element>
<xs:complexType name="authorType">
  <xs:sequence>
    <xs:element name="Author" type="xs:string" maxOccurs="4"/>
  </xs:sequence>
</xs:complexType>
<xs:complexType name="priceType">
  <xs:choice>
    <xs:element name="dollars" type="xs:double" />
    <xs:element name="pounds" type="xs:double" />
  </xs:choice>
</xs:complexType>

The xs:all indicator specifies that the Book element, if present, must
contain only one instance of each of the following four elements: Name,
ID, Authors, Price.

The xs:sequence indicator in the authorType declaration specifies that
elements of this particular type (Authors element) contain at least one
Author element and can contain up to four Author elements.

The xs:choice indicator in the priceType declaration specifies that
elements of this particular type (Price element) can contain either a
dollars element or a pounds element, but not both.
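
An instance that should satisfy all three indicators in the Book schema above is sketched below. The values are invented; the children of Book are deliberately given in a different order than in the schema, which xs:all permits.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sample values are invented for illustration -->
<Book>
  <!-- xs:all: the four children may come in any order, each exactly once -->
  <ID>B-001</ID>
  <Name>XML Basics</Name>
  <Price>
    <!-- xs:choice: either dollars or pounds, not both -->
    <dollars>29.99</dollars>
  </Price>
  <Authors>
    <!-- xs:sequence with maxOccurs="4": one to four Author elements -->
    <Author>First Author</Author>
    <Author>Second Author</Author>
  </Authors>
</Book>
```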

Slide 21
Restriction

A main advantage of schema is that you have the ability to control
the value of XML attributes and elements.
A restriction, which can be applied to any simple data type in a
schema, allows you to define your own data type according to your
requirements by constraining the facets available for that simple type.
To achieve this, use the restriction element defined in the schema
namespace.

W3C XML Schema defines 12 facets for simple data types:
enumeration, maxExclusive, minExclusive, maxInclusive, minInclusive,
maxLength, minLength, pattern, length, whiteSpace, fractionDigits,
totalDigits

Slide 22
Example - To restrict the length of the text node

The following example restricts the length of the text node to at most
255 characters:
<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:restriction base="tokenWithLangAndNote">
        <xs:maxLength value="255"/>
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" type="xs:token"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
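
The base type tokenWithLangAndNote is not defined anywhere in the slides. One plausible definition, given here only as an assumption, is a complex type with simple content that extends xs:token with the two attributes:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Assumed definition of the base type; it is not shown in the slides -->
  <xs:complexType name="tokenWithLangAndNote">
    <xs:simpleContent>
      <!-- text content is an xs:token; lang and note are added as attributes -->
      <xs:extension base="xs:token">
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
```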

Slide 23
Example – Remove an attribute from the element

To remove the note attribute from the element title, we declare note to
be prohibited in the list of attributes in the restriction:
<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:restriction base="tokenWithLangAndNote">
        <xs:maxLength value="255"/>
        <xs:attribute name="lang" type="xs:language"/>
        <xs:attribute name="note" use="prohibited"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Slide 24
Facets

enumeration - Value of the data type is constrained to a specific
set of values.
<xs:simpleType name="Subjects">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Biology"/>
    <xs:enumeration value="History"/>
    <xs:enumeration value="Geology"/>
  </xs:restriction>
</xs:simpleType>

maxExclusive - Numeric value of the data type is less than the
value specified.
minExclusive - Numeric value of the data type is greater than the
value specified.
<xs:simpleType name="id">
  <xs:restriction base="xs:integer">
    <xs:maxExclusive value="101"/>
    <xs:minExclusive value="1"/>
  </xs:restriction>
</xs:simpleType>

Slide 25
Facets Contd…

maxInclusive - Numeric value of the data type is less than or
equal to the value specified.
minInclusive - Numeric value of the data type is greater than or
equal to the value specified.
<xs:simpleType name="id">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

maxLength - Specifies the maximum number of characters or list items
allowed in the value.
minLength - Specifies the minimum number of characters or list items
allowed in the value.
pattern - Value of the data type is constrained to a specific sequence
of characters that are expressed using regular expressions.
<xs:simpleType name="nameFormat">
  <xs:restriction base="xs:string">
    <xs:minLength value="3"/>
    <xs:maxLength value="10"/>
    <xs:pattern value="[a-z][A-Z]*"/>
  </xs:restriction>
</xs:simpleType>

Slide 26
Facets Contd…

length - Specifies the exact number of characters or list items
allowed in the value.
<xs:simpleType name="secretCode">
  <xs:restriction base="xs:string">
    <xs:length value="5"/>
  </xs:restriction>
</xs:simpleType>

whiteSpace - Specifies the method for handling white space. Allowed
values for the value attribute are preserve, replace, and collapse.
<xs:simpleType name="FirstName">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="preserve"/>
  </xs:restriction>
</xs:simpleType>

fractionDigits - Constrains the maximum number of decimal places
allowed in the value.
totalDigits - The maximum number of digits allowed in the value.
Both facets apply only to xs:decimal and the types derived from it.
<xs:simpleType name="reducedPrice">
  <xs:restriction base="xs:decimal">
    <xs:totalDigits value="4"/>
    <xs:fractionDigits value="2"/>
  </xs:restriction>
</xs:simpleType>

Slide 27
Multiple Restriction using ‘Union’

A union is applied to the two embedded simple types to allow values
from both: the new data type accepts either a string of exactly 10
digits or one of the two enumerated values (TBD and NA).

<xs:simpleType name="isbnType">
  <xs:union>
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{10}"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:simpleType>
      <xs:restriction base="xs:NMTOKEN">
        <xs:enumeration value="TBD"/>
        <xs:enumeration value="NA"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

Slide 28
Examples / Demo

03_RestrictSimpleType01.xsd
This example restricts a simple type. Here we require the value of
the element "root" to be an integer less than 25.
04_RestrictUsingUnion01.xsd
We want the value of the element "root" to be in the range 0-100 or
300-400 (including the border values). We make a union of two intervals.
06_RestrictUnionEnum02.xsd
An element can contain a string from an enumerated set. Now, we want
the element "root" to have the value "N/A" or "#REF!".
14_RestrictionOfSequence.xsd
The Schema declares the type "AAA", which can contain up to two
sequences of "x" and "y" elements. Then we declare the type "BBB",
which is a restriction of the type "AAA" and contains only one x-y
sequence.
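
As an illustration of the union-of-intervals idea from 04_RestrictUsingUnion01.xsd, a schema along these lines could be used. This is a sketch, not the attached file; using xs:integer as the base type is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only; the base type xs:integer is an assumption -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:simpleType>
      <!-- a value is valid if it matches either member type -->
      <xs:union>
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="100"/>
          </xs:restriction>
        </xs:simpleType>
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="300"/>
            <xs:maxInclusive value="400"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:union>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```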

Slide 29
Extension

The extension element defines complex types that might derive from other
complex or simple types.
If the base type is a simple type, then the complex type can only add attributes.
If the base type is a complex type, then it is possible to add attributes and
elements.

To derive from a complex type, you have to use the complexContent
element in conjunction with the base attribute of the extension element.
Extensions are particularly useful when you need to reuse complex element
definitions in other complex element definitions.
For example, it is possible to define a Name element that contains two child
elements (First and Last) and then reuse it in other complex element definitions.
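
For the first case, where the base type is simple and the extension can only add attributes, a minimal sketch might look like this. The names PriceWithCurrency, Price and currency are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- PriceWithCurrency, Price and currency are hypothetical names -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="PriceWithCurrency">
    <!-- simpleContent: the base is a simple type, so only attributes can be added -->
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <!-- allows an instance such as: <Price currency="USD">29.99</Price> -->
  <xs:element name="Price" type="PriceWithCurrency"/>
</xs:schema>
```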

Slide 30
An example of extensions

<!-- Base element definition -->
<xs:complexType name="Name">
  <xs:sequence>
    <xs:element name="First"/>
    <xs:element name="Last"/>
  </xs:sequence>
</xs:complexType>

<!-- Customer element that reuses it -->
<xs:complexType name="Customer">
  <xs:complexContent>
    <xs:extension base="Name">
      <xs:sequence>
        <xs:element name="phone" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<!-- Student element that reuses it -->
<xs:complexType name="Student">
  <xs:complexContent>
    <xs:extension base="Name">
      <xs:sequence>
        <xs:element name="school" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Slide 31
Examples / Demo

12_ExtensionOfSequence.xsd
Extension of a sequence. When we extend a complexType that contains a
sequence A by adding a sequence B, the sequence B is appended to
sequence A.
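
The same appending rule applies to the Student type from the extension example. Assuming a hypothetical declaration such as `<xs:element name="student" type="Student"/>` (not present in the slides), a valid instance would list the inherited elements first and the added ones after them:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes a hypothetical element "student" of type Student; values are invented -->
<student>
  <!-- sequence inherited from the base type Name -->
  <First>Asha</First>
  <Last>Rao</Last>
  <!-- sequence added by the extension, appended after the base sequence -->
  <school>Example School</school>
  <year>2009</year>
</student>
```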

Slide 32
Groups

W3C XML Schema also allows the definition of groups of elements and
attributes.
These groups are not datatypes but containers holding a set of
elements or attributes that can be used to describe complex types.

<!-- definition of an element group -->
<xs:group name="mainBookElements">
  <xs:sequence>
    <xs:element name="title" type="nameType"/>
    <xs:element name="author" type="nameType"/>
  </xs:sequence>
</xs:group>

<!-- definition of an attribute group -->
<xs:attributeGroup name="bookAttributes">
  <xs:attribute name="isbn" type="isbnType" use="required"/>
  <xs:attribute name="available" type="xs:string"/>
</xs:attributeGroup>

<xs:complexType name="bookType">
  <xs:sequence>
    <xs:group ref="mainBookElements"/>
    <xs:element name="character" type="characterType"
                minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attributeGroup ref="bookAttributes"/>
</xs:complexType>

Slide 33
Examples / Demo

08_AttributeGroup01.xsd
Defining a group of attributes. Let's say we want to define a group of
common attributes, which will be reused. The root element is named
"root"; it must contain the "aaa" and "bbb" elements, and these
elements must have the attributes "x" and "y".

12_SequenceChoiceGroup.xsd
An element which contains two "patterns" (sequences), in any order. We
want the root element to be named "AAA", to be from the null namespace,
and to contain two patterns in any order. The first pattern is a
sequence of "BBB" and "CCC" elements, the second one is a sequence of
"XXX" and "YYY" elements. The element "choice" allows one of the cases:
either the sequence "myFirstSequence"-"mySecondSequence" or
"mySecondSequence"-"myFirstSequence".

Slide 34
List Datatypes

List datatypes are special cases in which a structure is defined within
the content of a single attribute or element.
IDREFS, ENTITIES, and NMTOKENS are predefined list datatypes.
As we have seen with these three datatypes, all the list datatypes that
can be defined must be whitespace-separated. No other separator is
accepted.

The definition of a list datatype by reference to an existing type is
done through an itemType attribute:
<xs:simpleType name="integerList">
  <xs:list itemType="xs:integer"/>
</xs:simpleType>

The definition of a list datatype can also be done by embedding an
xs:simpleType element:
<xs:simpleType name="myIntegerList">
  <xs:list>
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:list>
</xs:simpleType>

This datatype can be used to define attributes or elements that accept
a whitespace-separated list of integers smaller than or equal to 100,
such as "1 -25000 100".
Slide 35
Examples / Demo

09_ListDataType01.xsd
An attribute contains a list of values. Now, we want the "root" element
to have an attribute "xyz", which contains a list of three integers. We
define a general list (element "list") of integers and then restrict it
(element "restriction") to have an exact length (element "length") of
three items.

10_ListDataType02.xsd
An element contains a list of values. Now, we want the "root" element
to contain a list of three integers. We define a general list (element
"list") of integers and then restrict it (element "restriction") to
have an exact length (element "length") of three items.
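
The list-then-restrict approach described for 10_ListDataType02.xsd can be sketched as follows. The type names integerList and threeIntegers are hypothetical, and the attached file may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only; the type names are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- general whitespace-separated list of integers -->
  <xs:simpleType name="integerList">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>
  <!-- for a list type, the length facet counts list items, not characters -->
  <xs:simpleType name="threeIntegers">
    <xs:restriction base="integerList">
      <xs:length value="3"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- accepts, for example: <root>1 2 3</root> -->
  <xs:element name="root" type="threeIntegers"/>
</xs:schema>
```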

Slide 36
Examples / Demo

More Examples

15_CustomSimpleType.xsd
Definition of a custom simpleType: temperature must be greater than
-273.15. The element "T" must contain a number greater than -273.15.
We define our custom type for temperature, named "Temperature", and
require the element "T" to be of that type.
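
A sketch of such a custom type is shown below; it is not the attached file, and using xs:decimal as the base type is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only; the base type xs:decimal is an assumption -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Temperature">
    <xs:restriction base="xs:decimal">
      <!-- strictly greater than -273.15; the bound itself is rejected -->
      <xs:minExclusive value="-273.15"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="T" type="Temperature"/>
</xs:schema>
```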

16_PatternElement.xsd
A string must contain an e-mail address. The element "A" must contain
an e-mail address. We define our custom type, which at least
approximately checks the validity of the address. We use the
"pattern" element to restrict the string using regular expressions.

Slide 38
Summary

W3C XML Schema has become the de facto standard for defining
the structure of an XML document and for checking the validity of
XML documents. Using schema, it is possible to define:
Elements (simple and complex)
Attributes
Facets for XML elements
The structure of a document (order indicators)
The allowable number of elements (occurrence indicators) in an XML
document

Slide 39
References

ibm.com/developerWorks
IBM XML certification success, Part 1:
W3schools.com
www.Xml.com
XML Schema by OReilly
http://www.zvon.org/xxl/XMLSchemaTutorial
Examples used in the presentation are attached here

XML-Schema-Project.zip

Slide 40
Questions

Slide 41
Thank you

XML Technology, Semester 4


SICSR Executive MBA(IT) @ MindTree, Bangalore, India

By Neeraj Singh (toneeraj(AT)gmail(DOT)com)
Slide 42
