
Handout: XML

Version: XML/Handout/0307/1.0
Date: 11-03-08

Cognizant
500 Glen Pointe Center West
Teaneck, NJ 07666
Ph: 201-801-0233
www.cognizant.com

XML - Handout

TABLE OF CONTENTS
Introduction
  About this Module
  Target Audience
  Module Objectives
  Pre-requisite
Session 02: DTD
  Learning Objectives
  Introduction
  Syntax
  Elements and Attributes
  Entity References
  Well-formed and valid XML documents
  Well-formed documents: XML syntax
  Valid documents: XML semantics
  Try It Out
  Summary
  Test Your Understanding
  Exercises
Session 04: Schema
  Learning Objectives
  Introduction
  Simple Types
  Defining a Simple Element
  What is a Complex Element?
  Examples of Complex Elements
  How to Define a Complex Element
  Data Types
  Name Conflicts
  Summary
  Test Your Understanding
  Exercises
Session 06: SAX
  Learning Objectives
  Introduction
  Handling Events
  Summary
  Test Your Understanding
  Exercises
Session 08: DOM
  Learning Objectives
  DOM API
  DOM Tree Navigation
  Getting DOM Tree
  Summary
  Test Your Understanding
  Exercises
Session 10: JAXP
  Learning Objectives
  Introduction
  Summary
  Test Your Understanding
Session 12: XPath
  Learning Objectives
  XPath
  What is XPath?
  XPath Nodes
  Relationship of Nodes
  Selecting Nodes
  Predicates
  Selecting Several Paths
  XPath Operators
  Summary
  Test Your Understanding
  Exercises
Session 14: XQuery
  Learning Objectives
  XQuery
  What is XQuery?
  Relationship of Nodes
  XQuery Comparisons
  Selecting and Filtering Elements
  Summary
  Test Your Understanding
  Exercises
Session 16: XSLT
  Learning Objectives
  XSL
  What is XSLT?
  The <xsl:template> element
  The <xsl:value-of> element
  The <xsl:for-each> element
  The <xsl:choose> element
  Summary
  Test Your Understanding
  Exercises
Glossary
References
  Websites
  Books
STUDENT NOTES

Copyright 2007, Cognizant Technology Solutions, All Rights Reserved
C3: Protected

Introduction
About this Module
This module provides a handout on the following topics:

An introduction to XML

Basic concepts of XML

Target Audience
This module is designed for entry-level trainees.

Module Objectives
After completing this module, you will be able to:

Identify the purpose and importance of XML

Identify the XML elements and attributes

List the types of schema

Define SAX parser

Define DOM parser

Explain JAXP, XPath, XQuery, and XSLT

Pre-requisite
Trainees taking this course should be familiar with HTML and JavaScript.


Session 02: DTD

Learning Objectives
After completing this session, you will be able to:

Describe XML.

Identify elements and attributes

Explain well-formed and valid XML documents

Introduction
What is XML?

XML stands for Extensible Markup Language

XML is a markup language much like HTML

XML was designed to describe data

XML tags are not predefined; you must define your own tags

XML can use a DTD (Document Type Definition) to describe the data

XML with a DTD is designed to be self-describing

All XML elements must have a closing tag

XML Tags are case sensitive

Syntax

XML elements are defined using XML tags. XML tags are case sensitive. With XML, the tag
<Letter> is different from the tag <letter>. Opening and closing tags must be written with the
same case:
<Message>This is incorrect</message>
<message>This is correct</message>

XML elements must be properly nested.

In XML, all elements must be properly nested within each other:


<b><i>This text is bold and italic</i></b>

In the preceding example, "properly nested" simply means that because the <i> element is opened
inside the <b> element, it must also be closed inside the <b> element.
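
The case and nesting rules above can be checked with any XML parser. The following is a minimal sketch in Python using the standard library module xml.etree.ElementTree; the helper name is_well_formed is an assumption made for this illustration and is not part of the handout.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start-tag and end-tag differ in case, so the parser rejects the document.
case_mismatch = is_well_formed("<Message>This is incorrect</message>")
# Start-tag and end-tag use the same case.
case_match = is_well_formed("<message>This is correct</message>")
# <i> is opened and closed inside <b>.
nested = is_well_formed("<b><i>This text is bold and italic</i></b>")
# <b> is closed while <i> is still open, so the elements overlap.
overlapping = is_well_formed("<b><i>This text is bold and italic</b></i>")

print(case_mismatch, case_match, nested, overlapping)
```

Only the second and third documents are accepted; the other two raise a parse error.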

XML documents must have a root element.

XML documents must contain one element that is the parent of all other elements. This element is
called the root element.
<root>
<child>
<subchild>.....</subchild>
</child>
</root>

Unlike HTML, XML preserves white space in element content.
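
As a small illustration of the root element and white space rules, the sketch below parses a document with a single root and then tries a document with two top-level elements. It uses Python's standard xml.etree.ElementTree module; the sample text "a   b" and the element names first and second are made up for this example.

```python
import xml.etree.ElementTree as ET

# One root element that is the parent of all other elements.
root = ET.fromstring("<root><child><subchild>a   b</subchild></child></root>")
root_tag = root.tag
# The three spaces between "a" and "b" are kept as-is by the parser.
subchild_text = root.find("child/subchild").text

# Two top-level elements: there is no single root, so parsing fails.
try:
    ET.fromstring("<first/><second/>")
    two_roots_accepted = True
except ET.ParseError:
    two_roots_accepted = False

print(root_tag, repr(subchild_text), two_roots_accepted)
```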

Elements and Attributes


Each element in an XML file can contain child elements and attributes. A typical element looks like the following:
<Email
to="admin@mydomain.com"
from="user@mySite.com"
subject="Introducing XML">
</Email>

In this example, Email is an element with three attributes: to, from, and subject.
The following rules must be followed when naming XML elements:

Names can contain letters, numbers, and other characters

Names must start with a letter or "_" (underscore); they must not start with a number or a
punctuation character such as "-" or "."

Names must not start with the letters xml (or XML or Xml)

Names cannot contain spaces
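
The naming rules can be tried out against a parser. The sketch below, in Python with the standard xml.etree.ElementTree module, builds an empty element from each candidate name and keeps the ones the parser accepts. The candidate names and the helper is_well_formed are assumptions for illustration. Note that XML 1.0 allows a name to start with a letter or an underscore, so a leading underscore is accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical element names used only for this illustration.
candidates = ["author_name", "_private", "2nd_author", "author name", "-author"]

# Wrap each name in an empty element and keep the ones the parser accepts.
accepted = [name for name in candidates if is_well_formed(f"<{name}/>")]
print(accepted)
```

The names starting with a digit or a hyphen, and the name containing a space, are rejected.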

Apart from the rules above, no words are reserved, so any name can be used, but names should be
descriptive. Names with an underscore separator work well.
Examples: <author_name>, <published_date>.
Avoid "-" and "." in names. Some software may try to subtract name from author (author-name) or
treat "name" as a property of the object "author" (author.name).
Element names can be as long as you like, but keep them short and simple, like <author_name>
rather than <name_of_the_author>.
XML documents often have a parallel database, where field names correspond to element names.
A good rule is to use the naming rules of your databases so that the two are easy to correlate.
Non-English letters such as é are perfectly legal in XML element names, but watch out for
problems if your software does not support them.

The ":" should not be used in element names because it is reserved for namespaces.
Empty Tags: When an element has no content and no child elements, you can close it in the start
tag by placing a "/" before the closing ">". For example:
<Text></Text>

is the same as:

<Text />
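
A quick way to see that the two forms are equivalent is to parse both and compare the results. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<Text></Text>")
short_form = ET.fromstring("<Text />")

# Neither element has text content or child elements.
both_empty = (long_form.text is None and short_form.text is None
              and len(long_form) == 0 and len(short_form) == 0)

# Serializing both parsed elements gives identical output.
same_serialization = ET.tostring(long_form) == ET.tostring(short_form)

print(both_empty, same_serialization)
```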

Entity References
Some characters have a special meaning in XML.
If you place a character like "<" inside an XML element, then it will generate an error because the
parser interprets it as the start of a new element.
This will generate an XML error:
<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an entity reference:
<message>if salary &lt; 1000 then</message>

There are five predefined entity references in XML:

Entity reference    Character    Description
&lt;                <            less than
&gt;                >            greater than
&amp;               &            ampersand
&apos;              '            apostrophe
&quot;              "            quotation mark

Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than character is
legal, but it is a good habit to replace it.
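
The salary example above can be reproduced in a few lines. The sketch below uses Python's standard library: xml.sax.saxutils.escape replaces "<", ">", and "&" with entity references, and xml.etree.ElementTree turns the references back into characters when parsing. The variable names are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 then"

# escape() replaces "&", "<", and ">" with their entity references.
escaped = escape(raw)

# The escaped text parses, and the parser restores the original character.
message = ET.fromstring(f"<message>{escaped}</message>")

# The unescaped text is rejected because "<" looks like the start of a tag.
try:
    ET.fromstring(f"<message>{raw}</message>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

# All five predefined entity references are resolved by the parser.
all_five = ET.fromstring("<t>&lt;&gt;&amp;&apos;&quot;</t>").text

print(escaped, message.text, raw_accepted, all_five)
```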

Well-formed and valid XML documents


There are two levels of correctness of an XML document:

Well-formed: A well-formed document conforms to all the syntax rules of XML. For
example, if a start-tag appears without a corresponding end-tag, then the document is not
well-formed. A document that is not well-formed is not considered to be an XML document.
A conforming parser is not allowed to process it.

Valid: A valid document additionally conforms to some semantic rules. These rules
are either defined by the user or included as an XML schema or DTD. For example, if a
document contains an undefined element, then it is not valid. A validating parser is
not allowed to process it.


Well-formed documents: XML syntax


As long as only well-formedness is required, XML is a generic framework for storing any amount of
text or any data whose structure can be represented as a tree. The only indispensable syntactical
requirement is that the document has exactly one root element (alternatively called the document
element). This means that the text must be enclosed between a root start-tag and a corresponding
end-tag. The following is a "well-formed" XML document:
<book>This is a book.... </book>
The root element can be preceded by an optional XML declaration. This declaration states which
version of XML is in use (normally 1.0). It may also contain information about character encoding
and external dependencies.
<?xml version="1.0" encoding="UTF-8"?>
The specification requires that processors of XML support the pan-Unicode character encodings
UTF-8 and UTF-16 (UTF-32 is not mandatory). The use of more limited encodings, such as those
based on ISO/IEC 8859, is acknowledged and is widely used and supported.
Comments can be placed anywhere in the tree, including in the text if the content of the element is
text or #PCDATA.
XML comments start with <!-- and end with -->. Two dashes (--) may not appear anywhere in
the text of the comment.
<!-- This is a comment. -->
In any meaningful application, additional markup is used to structure the contents of the XML
document. The text enclosed by the root tags may contain an arbitrary number of XML elements.
The basic syntax for one element is as follows:
<name attribute="value">content</name>
The two instances of name are referred to as the start-tag and end-tag, respectively.
Here, content is some text, which may in turn contain XML elements. So, a generic XML
document contains a tree-based data structure. Here is an example of a structured XML
document:
<recipe name="bread" prep_time="5 mins" cook_time="3 hours">
  <title>Basic bread</title>
  <ingredient amount="3" unit="cups">Flour</ingredient>
  <ingredient amount="0.25" unit="ounce">Yeast</ingredient>
  <ingredient amount="1.5" unit="cups" state="warm">Water</ingredient>
  <ingredient amount="1" unit="teaspoon">Salt</ingredient>
  <instructions>
    <step>Mix all ingredients together.</step>
    <step>Knead thoroughly.</step>
    <step>Cover with a cloth, and leave for one hour in warm room.</step>
    <step>Knead again.</step>
    <step>Place in a bread baking tin.</step>
    <step>Cover with a cloth, and leave for one hour in warm room.</step>
    <step>Bake in the oven at 350F for 30 minutes.</step>
  </instructions>
</recipe>
Attribute values must always be quoted, using single or double quotes, and each attribute name
must appear only once in any element.
XML requires that elements be properly nested; that is, elements may never overlap.
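
To illustrate the tree structure and the attribute rules, the sketch below parses a shortened version of the recipe with Python's standard xml.etree.ElementTree module, walks the tree, and then tries two documents that break the attribute rules. The shortened recipe and the helper is_well_formed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A shortened version of the recipe document.
recipe = ET.fromstring("""
<recipe name="bread" prep_time="5 mins" cook_time="3 hours">
  <title>Basic bread</title>
  <ingredient amount="3" unit="cups">Flour</ingredient>
  <ingredient amount="1" unit="teaspoon">Salt</ingredient>
  <instructions>
    <step>Mix all ingredients together.</step>
    <step>Knead thoroughly.</step>
  </instructions>
</recipe>
""")

# Attributes are read with get(); child elements are reached by name.
ingredients = [(item.text, item.get("amount"), item.get("unit"))
               for item in recipe.findall("ingredient")]
steps = [step.text for step in recipe.find("instructions")]

# An attribute name repeated in one element is rejected.
duplicate_ok = is_well_formed('<ingredient amount="3" amount="4"/>')
# An unquoted attribute value is rejected.
unquoted_ok = is_well_formed("<ingredient amount=3/>")

print(ingredients, steps, duplicate_ok, unquoted_ok)
```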

Valid documents: XML semantics


By leaving the names, allowable hierarchy, and meanings of the elements and attributes open and
definable by a customizable schema or DTD, XML provides a syntactic foundation for the creation
of purpose-specific, XML-based markup languages. The general syntax of such languages is rigid.
Documents must adhere to the general rules of XML, ensuring that all XML-aware software can at
least read and interpret the relative arrangement of information within them. The schema merely
supplements the syntax rules with a set of constraints. Schemas typically restrict element and
attribute names and their allowable containment hierarchies, such as only allowing an element
named 'birthday' to contain one element named 'month' and one element named 'day', each of
which has to contain only character data. The constraints in a schema may also include data type
assignments that affect how information is processed. For example, the character data of the
month element may be defined as being a month according to the conventions of a particular
schema language, perhaps meaning that it must not only be formatted in a certain way, but also
must not be processed as if it were some other type of data.
An XML document that complies with a particular schema or DTD, in addition to being well-formed,
is said to be valid.
An XML schema is a description of a type of XML document, typically expressed in terms of
constraints on the structure and content of documents of that type, above and beyond the
basic constraints imposed by XML itself. A number of standard and proprietary XML schema
languages have emerged for the purpose of formally expressing such schemas, and some of
these languages are themselves based on XML.
Before the advent of generalized data description languages such as SGML (Standard
Generalized Markup Language) and XML, software designers had to define special file formats or
small languages to share data between programs. This required writing detailed specifications
and special-purpose parsers and writers.

The regular structure and strict parsing rules of XML allow software designers to leave parsing to
standard tools. Because XML provides a general, data model-oriented framework for the
development of application-specific languages, software designers need only concentrate on the
development of rules for their data, at relatively high levels of abstraction.
Well-tested tools exist to validate an XML document "against" a schema. Such a tool
automatically verifies whether the document conforms to the constraints expressed in the schema.
Some of these validation tools are included in XML parsers, and some are packaged separately.
Other usages of schemas also exist. XML editors, for instance, can use schemas to support the
editing process (by suggesting valid element and attribute names, and so on).
DTD: The oldest schema format for XML is the Document Type Definition (DTD), inherited from
SGML. While DTD support is ubiquitous due to its inclusion in the XML 1.0 standard, it is seen as
limited for the following reasons:

It has no support for newer features of XML, most importantly namespaces.

It lacks expressiveness. Certain formal aspects of an XML document cannot be captured in a DTD.

It uses a custom, non-XML syntax, inherited from SGML, to describe the schema.

DTD is still used in many applications because it is considered the easiest to read and write.

Try It Out
Problem Statement:
Write a DTD for an anthology consisting of poems, their titles, and the stanzas and lines of which
they are composed.
The XML file is as follows:
XML Code:
<anthology>
  <poem><title>The SICK ROSE</title>
    <stanza>
      <line>O Rose thou art sick.</line>
      <line>The invisible worm,</line>
      <line>That flies in the night</line>
      <line>In the howling storm:</line>
    </stanza>
    <stanza>
      <line>Has found out thy bed</line>
      <line>Of crimson joy:</line>
      <line>And his dark secret love</line>
      <line>Does thy life destroy.</line>
    </stanza>
  </poem>
  <!-- more poems go here -->
</anthology>
DTD Code:
<!ELEMENT anthology (poem+)>
<!ELEMENT poem      (title?, stanza+)>
<!ELEMENT title     (#PCDATA)>
<!ELEMENT stanza    (line+)>
<!ELEMENT line      (#PCDATA)>

How It Works:

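The root element of the given XML document is anthology.

So, a DOCTYPE declaration that uses this DTD names anthology as the root element.

The anthology element can contain multiple poem elements.

So, the DTD specifies (poem+) for the anthology element, where the + indicates that the
anthology element must contain one or more poem elements.

A poem has an optional title followed by one or more stanza elements.

So, the DTD specifies (title?, stanza+) for the poem element, where the ? marks title as
optional and the + indicates that the poem element must contain one or more stanza elements.

The stanza element can contain multiple line elements.

So, the DTD specifies (line+) for the stanza element.

The title and line elements contain only text, so the DTD specifies (#PCDATA) for them.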
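
To make the content models concrete, the sketch below re-implements the anthology rules by hand in Python. It is not a DTD validator: Python's standard xml.etree.ElementTree module only checks well-formedness, so the function is_valid_anthology (a name assumed for this illustration) simply mirrors the ELEMENT declarations one by one. A real project would normally use a validating parser instead.

```python
import xml.etree.ElementTree as ET


def is_valid_anthology(text):
    """Hand-written check that mirrors the anthology DTD content models."""
    root = ET.fromstring(text)
    if root.tag != "anthology":
        return False
    poems = list(root)
    # anthology (poem+): one or more poem children and nothing else.
    if not poems or any(poem.tag != "poem" for poem in poems):
        return False
    for poem in poems:
        tags = [child.tag for child in poem]
        # poem (title?, stanza+): an optional title must come first.
        if tags and tags[0] == "title":
            tags = tags[1:]
        if not tags or any(tag != "stanza" for tag in tags):
            return False
        for stanza in poem.findall("stanza"):
            lines = list(stanza)
            # stanza (line+): one or more line children and nothing else.
            if not lines or any(line.tag != "line" for line in lines):
                return False
            # line (#PCDATA): a line holds text only, no child elements.
            if any(len(line) != 0 for line in lines):
                return False
        title = poem.find("title")
        # title (#PCDATA): a title holds text only, no child elements.
        if title is not None and len(title) != 0:
            return False
    return True


with_title = ("<anthology><poem><title>The SICK ROSE</title>"
              "<stanza><line>O Rose thou art sick.</line></stanza>"
              "</poem></anthology>")
without_title = ("<anthology><poem>"
                 "<stanza><line>O Rose thou art sick.</line></stanza>"
                 "</poem></anthology>")
title_last = ("<anthology><poem>"
              "<stanza><line>O Rose thou art sick.</line></stanza>"
              "<title>The SICK ROSE</title></poem></anthology>")
no_poems = "<anthology/>"

results = [is_valid_anthology(doc)
           for doc in (with_title, without_title, title_last, no_poems)]
print(results)
```

The first two documents satisfy the content models. The third puts the title after the stanza and the fourth has no poem, so both fail.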

Summary
This session demonstrates how to write a DTD for a given XML document. It also demonstrates
how elements and attributes are used in an XML document.

Test Your Understanding


1. What is XML?
2. What is a DTD?
3. What are elements and attributes?

Exercises
Write a DTD for an XML file with library as the root element and books, author, date of publishing,
and edition as elements.


Session 04: Schema


Learning Objectives
After completing this session, you will be able to:

Define schema

Identify the types of schema

Introduction
XML Schema is an XML-based alternative to DTD. It describes the structure of an XML
document. The XML Schema language is also referred to as XML Schema Definition (XSD).
The purpose of an XML schema is to define the legal building blocks of an XML document, just like
a DTD. An XML schema:

Defines elements that can appear in a document

Defines attributes that can appear in a document

Defines which elements are child elements

Defines the order of child elements

Defines the number of child elements

Defines whether an element is empty or can include text

Defines data types for elements and attributes

Defines default and fixed values for elements and attributes

Sample XML File:


<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Schema for the above XML file:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3schools.com"
    xmlns="http://www.w3schools.com"
    elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
The note element is a complex type because it contains other elements. The other elements (to,
from, heading, and body) are simple types because they do not contain other elements. You will
learn more about simple and complex types in the following sections.
The following fragment:
xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and data
types that come from that namespace should be prefixed with xs:
This fragment:
targetNamespace="http://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading, and body) come from
the "http://www.w3schools.com" namespace.
This fragment:
xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".
This fragment:
elementFormDefault="qualified"
indicates that any elements used by the XML instance document that were declared in this
schema must be namespace qualified.
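
Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below loads the note schema with Python's standard xml.etree.ElementTree module and reads back the values discussed above. It only inspects the schema and does not validate any document against it; the variable names are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

schema = ET.fromstring("""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3schools.com"
    xmlns="http://www.w3schools.com"
    elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>""")

# The parser expands the xs: prefix into the full namespace name.
root_tag = schema.tag
target_namespace = schema.get("targetNamespace")
element_form = schema.get("elementFormDefault")

# Every xs:element declaration in the schema, in document order.
declared = [decl.get("name") for decl in schema.iter(f"{{{XS}}}element")]

print(root_tag, target_namespace, element_form, declared)
```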

Simple Types:
A simple element is an XML element that can contain only text. It cannot contain any other
elements or attributes. The text can be of many different types. It can be one of the types included
in the XML Schema definition (boolean, string, date, and so on), or it can be a custom type
that you can define yourself. You can also add restrictions (facets) to a data type in order to limit its
content, or you can require the data to match a specific pattern.


Defining a Simple Element


The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
Where xxx is the name of the element and yyy is the data type of the element.
XML schema has a lot of built-in data types. The most common types are:

xs:string

xs:decimal

xs:integer

xs:boolean

xs:date

xs:time

Example:
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>

Here are the corresponding simple element definitions:


<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
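
The effect of assigning a data type can be imitated in a few lines of Python. The sketch below is only a loose analogy, not schema validation: it maps xs:string, xs:integer, and xs:date onto the Python converters str, int, and date.fromisoformat. The wrapper element person and the converters table are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical wrapper element so the three sample elements form one document.
person = ET.fromstring(
    "<person>"
    "<lastname>Refsnes</lastname>"
    "<age>36</age>"
    "<dateborn>1970-03-27</dateborn>"
    "</person>"
)

# Rough Python stand-ins for xs:string, xs:integer, and xs:date.
converters = {"lastname": str, "age": int, "dateborn": date.fromisoformat}

# Convert the text of each element with the converter for its declared type.
values = {child.tag: converters[child.tag](child.text) for child in person}

# Text that does not fit the declared type is rejected by the converter.
try:
    int("thirty-six")
    bad_age_accepted = True
except ValueError:
    bad_age_accepted = False

print(values, bad_age_accepted)
```
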
Complex Types:

What is a Complex Element?


A complex element is an XML element that contains other elements and/or attributes.
There are four kinds of complex elements:

Empty elements

Elements that contain only other elements

Elements that contain only text

Elements that contain both other elements and text

Note: Each of these elements may contain attributes as well!

Examples of Complex Elements


A complex XML element, "product", which is empty:
<product pid="1345"/>

A complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>

A complex XML element, "food", which contains only text:


<food type="dessert">Ice cream</food>

A complex XML element, "description", which contains both elements and text:
<description>
It happened on <date lang="norwegian">03.03.99</date> ....
</description>

How to Define a Complex Element


The complex XML element "employee" contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
The "employee" element can have a type attribute that refers to the name of the complex type to
use:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>

If you use the preceding method, then several elements can refer to the same complex type, like
this:
<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
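To see a shared named type in action, the sketch below validates two different elements against the "personinfo" type. It is an illustration under assumptions: the class name PersonInfoSchemaCheck and the helper isValid() are hypothetical, and the schema is trimmed to two of the three elements listed above.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper class, used only for illustration.
class PersonInfoSchemaCheck {

    // Two elements that both refer to the named complex type "personinfo".
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"employee\" type=\"personinfo\"/>"
      + "<xs:element name=\"student\" type=\"personinfo\"/>"
      + "<xs:complexType name=\"personinfo\">"
      + "<xs:sequence>"
      + "<xs:element name=\"firstname\" type=\"xs:string\"/>"
      + "<xs:element name=\"lastname\" type=\"xs:string\"/>"
      + "</xs:sequence>"
      + "</xs:complexType>"
      + "</xs:schema>";

    // Returns true if the XML text is valid against XSD, false otherwise.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;   // schema violation or read error
        }
    }

    public static void main(String[] args) {
        String body = "<firstname>John</firstname><lastname>Smith</lastname>";
        System.out.println(isValid("<employee>" + body + "</employee>"));
        System.out.println(isValid("<student>" + body + "</student>"));
    }
}
```

Because xs:sequence fixes the order of the child elements, a document that places lastname before firstname is expected to fail validation.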


Data Types

String Data Type:

The string data type is used for values that contain character strings. It can contain characters, line feeds, carriage returns, and tab characters.
<xs:element name="customer" type="xs:string"/>

Date Data Type:


The date data type is used to specify a date. The date is specified in the form "YYYY-MM-DD", where:

YYYY indicates the year

MM indicates the month

DD indicates the day

<xs:element name="start" type="xs:date"/>

Decimal Data Type:


The decimal data type is used to specify a numeric value. The following is an example of a
decimal declaration in a schema:
<xs:element name="prize" type="xs:decimal"/>

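Namespaces: XML namespaces provide a method to avoid element name conflicts.

Name Conflicts
In XML, element names are defined by the developer. This often results in a conflict when mixing XML documents from different XML applications, because two applications may use the same element name (such as table) with different meanings. A namespace is declared with an xmlns:prefix attribute in the start tag of an element, and the prefix then qualifies the names of that element and its descendants. In the following example, the h: and f: prefixes keep the two table elements apart: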
<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
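A namespace-aware parser can tell the two table elements apart by their namespace URIs. The sketch below is an illustration only: the class name NamespaceLookup and the helper countTables() are hypothetical, and the XML is a shortened copy of the example above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class NamespaceLookup {

    // Shortened copy of the example document above.
    static final String XML =
        "<root>"
      + "<h:table xmlns:h=\"http://www.w3.org/TR/html4/\">"
      + "<h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>"
      + "</h:table>"
      + "<f:table xmlns:f=\"http://www.w3schools.com/furniture\">"
      + "<f:name>African Coffee Table</f:name>"
      + "</f:table>"
      + "</root>";

    // Counts the table elements that belong to the given namespace URI.
    static int countTables(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // required for the *NS lookup methods
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList tables = doc.getElementsByTagNameNS(namespaceUri, "table");
            return tables.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countTables("http://www.w3.org/TR/html4/"));
        System.out.println(countTables("http://www.w3schools.com/furniture"));
    }
}
```

The lookup uses the namespace URI and the local name, so the prefixes h and f themselves could be renamed without changing the result.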

Summary
The examples discussed earlier talk about different types of schemas and the syntax of their
respective types.

Test Your Understanding


1. What is a schema?
2. What are the types of schemas?

Exercises
Write an XML schema for the following XML file.
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>


Session 06: SAX


Learning Objectives
After completing this session, you will be able to:

Describe SAX API

Explain Event Handlers

Introduction
SAX stands for Simple API for XML. When a SAX parser encounters an item in an XML document, such as the start or end of an element, it generates an event and passes it to a handler registered by the application, and the application can respond to the event appropriately. SAX does not read the entire XML document into memory. It reads only a small chunk of the document at a time, parses it, generates events, and then reads the next chunk. Therefore it does not require a large amount of memory, and a SAX parser is suitable for parsing huge XML documents. A DOM parser reads the entire XML document into memory before the application can work with it, and therefore a DOM parser is a poor fit for very large documents. A SAX parser can only read an XML document and retrieve its contents. Unlike a DOM parser, it cannot modify the document.
A SAX parser is chosen over a DOM parser when memory is a constraint or when the content of an XML document only needs to be read. SAX is available in several programming languages, such as Java, C++, Perl, and Python.
There are two major types of XML (or SGML) APIs:

Tree-based APIs: These map an XML document into an internal tree structure, and
then allow an application to navigate that tree. The Document Object Model (DOM)
working group at the World-Wide Web Consortium (W3C) maintains a
recommended tree-based API for XML and HTML documents, and there are many
such APIs from other sources.

Event-based APIs: An event-based API, on the other hand, reports parsing events
(such as the start and end of elements) directly to the application through callbacks,
and does not usually build an internal tree. The application implements handlers to
deal with the different events, much like handling events in a Graphical User
Interface. SAX is the best known example of such an API.

DocumentHandler Interface: The DocumentHandler interface defines events that occur in the standard course of parsing a document or message. (DocumentHandler is the SAX1 interface; in SAX2 it is replaced by ContentHandler.) Three of these events are of most interest:

Start Element: This event occurs when the parser moves into a new element. An
element in XML is defined to start when a <name> tag is encountered. The name of
the element is passed to the event handling callback function.

End Element: This event occurs when the parser leaves a given element space.
The end of an element in XML is signaled by a </name> tag. The name of the
element is once again passed to the event handling callback function.


Characters: This event occurs when character data (that is a value in name/value
pairs) is encountered. This event does not necessarily have to provide the entire
data value as a single event. The string can be broken up at the discretion of the
parser. It is up to the user to check for consecutive characters events and
concatenate the results to get the complete value. A pointer (or reference in Java) to
the character data along with a character count and offset is passed to the event
handler callback function.
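Because a value may arrive in several characters events, a handler usually collects the pieces in a buffer and reads the buffer when the element ends. The following is a minimal sketch of that idea using the SAX2 DefaultHandler class from the Java platform. The class name TextCollector and the helper collect() are hypothetical, and the sketch assumes the target elements contain text only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, used only for illustration.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final String target;
    final List<String> values = new ArrayList<>();

    TextCollector(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        buffer.setLength(0);   // a new element starts: drop text collected so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once for a single value, so append each piece.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(target)) {
            values.add(buffer.toString());   // the value is complete only here
        }
    }

    // Parses the XML text and returns the text of every element named elementName.
    static List<String> collect(String xml, String elementName) {
        try {
            TextCollector handler = new TextCollector(elementName);
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people><name>Ann</name><name>Bob</name></people>";
        System.out.println(collect(xml, "name"));
    }
}
```

Reading the buffer in endElement() rather than in characters() is the design choice that makes the handler independent of how the parser splits the text.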

Handling Events
The XML Handlers Module supports the following elements and attributes:

Element:               action
Attributes:            event (QName), targetid (IDREF), declare ("declare"), xml:id ([XMLID])
Minimal Content Model: ( action | script | dispatchEvent | addEventListener |
                       removeEventListener | stopPropagation | preventDefault )+

Element:               script
Attributes:            encoding (Charset), src (URI), type (ContentTypes), xml:id ([XMLID])
Minimal Content Model: PCDATA

Element:               dispatchEvent
Attributes:            raise (QName), destid (IDREF), bubbles ("bubbles"),
                       cancelable ("cancelable"), xml:id ([XMLID])
Minimal Content Model: EMPTY

Element:               addEventListener
Attributes:            event* (QName), handler* (IDREF), phase ("capture" | "default"*),
                       xml:id ([XMLID])
Minimal Content Model: EMPTY

Element:               removeEventListener
Attributes:            event* (QName), handler* (IDREF), phase ("capture" | "default"*),
                       xml:id ([XMLID])
Minimal Content Model: EMPTY

Action Element: The action element is used to group event handler elements (including other
action elements) that will act in sequence as handlers for an event.
Script Element: The script element contains or references scripts that may register one or
more event handlers for a document through a scripting language that is supported by the
implementation.

Dispatch Event Element: The dispatchEvent element triggers the event identified by the
raise attribute. If the destid attribute is specified, then it names a specific element to which to
dispatch the event. Otherwise the event is just dispatched to the "document" to be handled by any
registered listener.
AddEventListener Element: This element allows the registration of a listener on a specific event.
In SAX, the most important events are the start and end of the document, the start and end of elements, and character data.
To find out about the start and end of the document, the client application implements the startDocument() and endDocument() methods:
public void startDocument ()
{
    System.out.println("Start document");
}

public void endDocument ()
{
    System.out.println("End document");
}

The startDocument() and endDocument() event handlers take no arguments. When the SAX driver finds the beginning of the document, it invokes the startDocument() method once, and when it finds the end, it invokes the endDocument() method once.
The SAX driver signals the start and end of elements in much the same way, except that it also passes some parameters to the startElement() and endElement() methods:
public void startElement (String uri, String name,
                          String qName, Attributes atts)
{
    if ("".equals (uri))
        System.out.println("Start element: " + qName);
    else
        System.out.println("Start element: {" + uri + "}" + name);
}

public void endElement (String uri, String name, String qName)
{
    if ("".equals (uri))
        System.out.println("End element: " + qName);
    else
        System.out.println("End element: {" + uri + "}" + name);
}
These methods print a message every time an element starts or ends, with any Namespace URI
(Uniform Resource Identifier) in braces before the element's local name. The qName contains the
raw XML 1.0 name, which you must use for all elements that do not have a namespace URI.
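The naming rule described above can be tried end to end with a namespace-aware parser. The sketch below records the events in a list instead of printing them. The class name ElementNameLogger, the helper trace(), and the namespace URI urn:example:furniture are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, used only for illustration.
class ElementNameLogger extends DefaultHandler {
    final List<String> log = new ArrayList<>();

    // Uses qName when there is no namespace URI, otherwise {uri}localName.
    private static String label(String uri, String localName, String qName) {
        return "".equals(uri) ? qName : "{" + uri + "}" + localName;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        log.add("start " + label(uri, localName, qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        log.add("end " + label(uri, localName, qName));
    }

    // Parses the XML text and returns the recorded element events in order.
    static List<String> trace(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // JAXP factories are not namespace aware by default.
            factory.setNamespaceAware(true);
            ElementNameLogger handler = new ElementNameLogger();
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
            return handler.log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><f:name xmlns:f=\"urn:example:furniture\">x</f:name></root>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```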

Summary
This session gives you an idea about SAX and event handlers.

Test Your Understanding
1. What is SAX?
2. What are the different event handlers?

Exercises
The XML file to process is as follows:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>

Count how many authors there are in the bookstore.


Session 08: DOM


Learning Objectives
After completing this session, you will be able to:

Describe DOM API

Explain DOM Tree Navigation

Transform DOM Tree into XML

DOM API
The XML DOM (Document Object Model) defines a standard way for accessing and manipulating
XML documents. It views an XML document as a tree data structure, similar to the DOM used from JavaScript.

DOM Tree Navigation


The XML DOM views an XML document as a tree structure. The tree structure is called a node-tree. All nodes can be accessed through the tree. Their contents can be modified or deleted, and
new elements can be created. The node tree shows the set of nodes, and the connections
between them. The tree starts at the root node and branches out to the text nodes at the lowest
level of tree.
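Navigation through the node tree can be sketched with the org.w3c.dom interfaces from the Java platform. The example below moves down to a text node and up to a parent element. The class name DomNavigation and the helper categoryOf() are hypothetical, and the XML is a shortened bookstore document in the style used by the exercises.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class DomNavigation {

    static final String XML =
        "<bookstore>"
      + "<book category=\"COOKING\"><title lang=\"en\">Everyday Italian</title></book>"
      + "<book category=\"CHILDREN\"><title lang=\"en\">Harry Potter</title></book>"
      + "</bookstore>";

    // Returns the category of the book with the given title, or null if none matches.
    static String categoryOf(String wantedTitle) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList titles = doc.getElementsByTagName("title");
            for (int i = 0; i < titles.getLength(); i++) {
                Node title = titles.item(i);
                // Down: the text of a title is held by its first child, a text node.
                String text = title.getFirstChild().getNodeValue();
                if (wantedTitle.equals(text)) {
                    // Up: the parent of the title element is the book element.
                    Element book = (Element) title.getParentNode();
                    return book.getAttribute("category");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(categoryOf("Harry Potter"));
    }
}
```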


Getting DOM Tree:


The examples in this section use Python. The xml.dom.ext.reader package (part of the PyXML library) contains a number of classes that build a DOM tree from various input sources. One of the modules in the xml.dom.ext.reader package is named Sax2, and contains a Reader class that builds a DOM tree from a series of SAX2 events. Reader instances provide a fromStream() method that constructs a DOM tree from an input stream. The input can be a file-like object or a string. In the second case, it is assumed to be a URL (Uniform Resource Locator) and is opened with the urllib module.
import sys
from xml.dom.ext.reader import Sax2
# create Reader object
reader = Sax2.Reader()
# parse the document
doc = reader.fromStream(sys.stdin)
The fromStream() method returns the root of a DOM tree constructed from the input XML
document.
Transforming XML into DOM Tree:
DOM Tree:
Element xbel None
    Text #text ' \012 '
    ProcessingInstruction processing 'instruction'
    Text #text '\012 '
    Element desc None
        Text #text 'No description'
    Text #text '\012 '
    Element folder None
        Text #text '\012 '
        Element title None
            Text #text 'XML bookmarks'
        Text #text '\012 '
        Element bookmark None
            Text #text '\012 '
            Element title None
                Text #text 'SIG for XML Processing in Python'
            Text #text '\012 '
        Text #text '\012 '
    Text #text '\012'

XML File:
<?xml version="1.0" encoding="iso-8859-1"?>
<xbel>
<?processing instruction?>
<desc>No description</desc>
<folder>
<title>XML bookmarks</title>
<bookmark href="http://www.python.org/sigs/xml-sig/" >
<title>SIG for XML Processing in Python</title>
</bookmark>
</folder>
</xbel>
A DOM tree can be converted back to XML by using the Print(doc, stream) or
PrettyPrint(doc, stream) functions in the xml.dom.ext module. If stream is not provided,
then the resulting XML will be printed to standard output. Print() will simply render the DOM
tree without any changes, while PrettyPrint() will add or remove whitespace in order to nicely
indent the resulting XML.
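The Print() and PrettyPrint() functions above are Python functions. In Java, a comparable result can be obtained with an identity Transformer from the javax.xml.transform package, which copies a DOM tree to an output stream as XML. The sketch below is an illustration under assumptions: the class name DomToXml and the helper renameFirstTitle() are hypothetical, and the tree is modified before serialization only to show that the change reaches the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class DomToXml {

    // Parses xml, replaces the text of the first title element, and returns the
    // modified tree serialized back to XML text.
    static String renameFirstTitle(String xml, String newTitle) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            doc.getElementsByTagName("title").item(0).setTextContent(newTitle);

            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<folder><title>XML bookmarks</title></folder>";
        System.out.println(renameFirstTitle(xml, "Python links"));
    }
}
```

Setting OutputKeys.INDENT to "yes" on the transformer asks for indented output, which plays a role similar to PrettyPrint(); the exact layout depends on the transformer implementation.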

Summary
This session provides an idea about DOM Parser.

Test Your Understanding


1. What is a DOM parser?
2. How do you transform a DOM tree into an XML file?

Exercises
Using a DOM parser, modify the author of the book in the CHILDREN category of the given XML file.
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>

<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Session 10: JAXP


Learning Objectives
After completing this session, you will be able to:

Describe JAXP API

Define DocumentBuilder

Explain SAX Parser

Introduction:
The Java API for XML Processing, or JAXP (pronounced jaks-p), is one of the Java XML
programming APIs. It provides the capability of validating and parsing XML documents. The two
basic parsing interfaces are:

The Document Object Model parsing interface or DOM interface

The Simple API for XML parsing interface or SAX interface

JAXP is an API, but it is more accurately called an abstraction layer. It does not provide a new
means of parsing XML and also it does not add anything new to SAX interface or DOM interface.
JAXP makes it easier to use DOM and SAX to deal with some difficult tasks. JAXP is a standard
component in the Java platform.
DOM Interface: The DOM interface is perhaps the easiest to describe. It parses an entire XML
document and constructs a complete in-memory representation of the document using the classes
modeling the concepts found in the Document Object Model (DOM) Level 2 Core Specification.
The DOM parser is called a DocumentBuilder, as it builds an in-memory document
representation. The javax.xml.parsers.DocumentBuilder is created by the
javax.xml.parsers.DocumentBuilderFactory. The DocumentBuilder creates an
org.w3c.dom.Document instance, which is a tree structure containing nodes in the XML document.
Each tree node in the structure implements the org.w3c.dom.Node interface. There are many
different types of tree nodes, representing the type of data found in an XML document. The most
important node types are:

Element nodes, which may have attributes

Text nodes representing the text found between the start and end tags of a
document element

Using the DocumentBuilderFactory


import java.io.File;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// JAXP
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;

// DOM
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class TestDOMParsing {

    public static void main(String[] args) {
        try {
            if (args.length != 1) {
                System.err.println("Usage: java TestDOMParsing " +
                                   "[filename]");
                System.exit(1);
            }
            // Get Document Builder Factory
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            // Turn on validation, and turn off namespaces
            factory.setValidating(true);
            factory.setNamespaceAware(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new File(args[0]));
            // Print the document from the DOM tree and
            // feed it an initial indentation of nothing
            printNode(doc, "");
        } catch (ParserConfigurationException e) {
            System.out.println("The underlying parser does not " +
                               "support the requested features.");
        } catch (FactoryConfigurationError e) {
            System.out.println("Error occurred obtaining Document " +
                               "Builder Factory.");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void printNode(Node node, String indent) {
        // print the DOM tree
    }
}

First, a DocumentBuilderFactory is obtained. Then the factory is configured to handle validation and namespaces. Next, a DocumentBuilder instance, the analog to SAXParser, is retrieved from the factory. Parsing can then occur, and the resultant DOM Document object is handed off to a method that prints the DOM tree.
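The handout leaves the body of printNode() open. One possible shape for it is sketched below. This is an assumption, not the original implementation: the class name DomPrinter, the helper describe(), and the output format are hypothetical, the result is written to a StringBuilder instead of standard output, and validation is left off so that no DTD is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class DomPrinter {

    // Appends one line per element or non-blank text node, indented by depth.
    static void printNode(Node node, String indent, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                printChildren(node, indent, out);   // the document has no own line
                break;
            }
            case Node.ELEMENT_NODE: {
                out.append(indent).append("Element ").append(node.getNodeName());
                NamedNodeMap atts = node.getAttributes();
                for (int i = 0; i < atts.getLength(); i++) {
                    Node att = atts.item(i);
                    out.append(' ').append(att.getNodeName())
                       .append('=').append(att.getNodeValue());
                }
                out.append('\n');
                printChildren(node, indent + "  ", out);
                break;
            }
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.append(indent).append("Text ").append(text).append('\n');
                }
                break;
            }
            default:
                break;   // other node types are ignored in this sketch
        }
    }

    private static void printChildren(Node node, String indent, StringBuilder out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            printNode(children.item(i), indent, out);
        }
    }

    // Parses the XML text and returns the printed tree.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            printNode(doc, "", out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe("<a x=\"1\"><b>hi</b></a>"));
    }
}
```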
SAX Interface: The SAX parser is called the SAXParser and is created by the
javax.xml.parsers.SAXParserFactory. Unlike the DOM parser, the SAX parser does not
create an in-memory representation of the XML document and so is faster and uses less memory.
Instead, the SAX parser informs clients of the XML document structure by invoking callbacks, that
is, by invoking methods on an org.xml.sax.helpers.DefaultHandler instance provided to
the parser. The DefaultHandler class implements the ContentHandler, the ErrorHandler,
the DTDHandler, and the EntityResolver interfaces. Most clients will be interested in methods
defined in the ContentHandler interface, which are called when the SAX parser encounters the
corresponding elements in the XML document. The most important methods in this interface are:

The startDocument() and endDocument() methods are called at the start and end
of an XML document.

The startElement() and endElement() methods are called at the start and end of a
document element.

The characters() method is called with the text data contained between
the start and end tags of an XML document element.

Clients provide a subclass of the DefaultHandler that overrides these methods and processes
the data. This may involve storing the data into a database or writing it out to a stream.

Summary
This session gives you an idea about JAXP, the DocumentBuilder, and the SAX parser.

Test Your Understanding


1. How do you create a SAX parser factory?
2. How do you create a DocumentBuilderFactory?


Session 12: XPath


Learning Objectives
After completing this session, you will be able to:

Write XPath nodes and syntax

Define XPath axes and operators

XPath
XPath is a language for finding information in an XML document. XPath is used to navigate
through elements and attributes in an XML document.

What is XPath?

XPath is the syntax for defining parts of an XML document.

XPath uses path expressions to navigate in XML documents.

XPath contains a library of standard functions.

XPath is a major element in XSLT (XSL Transformations)

XPath is a W3C Standard.

XPath Nodes
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes. XML documents are treated as trees of nodes.
The root of the tree is called the document node.
Atomic values: Atomic values are nodes with no children or parent.
Items: Items are atomic values or nodes.

Relationship of Nodes

Parent: Each element and attribute has one parent

Children: Element nodes may have zero, one or more children

Siblings: Nodes that have the same parent

Ancestors: A node's parent, the parent's parent, and so on

Descendants: A node's children, the children's children, and so on


Selecting Nodes
XPath uses path expressions to select nodes in an XML document. The node is selected by
following a path or steps. The most useful path expressions are as follows:
Expression    Description
nodename      Selects all child elements of the current node that have the given name
/             Selects from the root node
//            Selects nodes in the document from the current node that match the selection,
              no matter where they are
.             Selects the current node
..            Selects the parent of the current node
@             Selects attributes

Predicates
Predicates are used to find a specific node or a node that contains a specific value. Predicates are
always embedded in square brackets.
Examples
The following table lists some path expressions with predicates and the result of each expression:

Path Expression            Result
/bookstore/book[1]         Selects the first book element that is the child of the bookstore element
/bookstore/book[last()]    Selects the last book element that is the child of the bookstore element
//title[@lang]             Selects all the title elements that have an attribute named lang
//title[@lang='eng']       Selects all the title elements that have an attribute named lang
                           with a value of 'eng'
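Path expressions with predicates can be evaluated from Java with the javax.xml.xpath package. The following is a minimal sketch under assumptions: the class name XPathPredicates and the helper select() are hypothetical, and the XML is a shortened bookstore document in which every title has lang="en", as in the exercise files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XPathPredicates {

    static final String XML =
        "<bookstore>"
      + "<book><title lang=\"en\">Everyday Italian</title><price>30.00</price></book>"
      + "<book><title lang=\"en\">Harry Potter</title><price>29.99</price></book>"
      + "<book><title lang=\"en\">Learning XML</title><price>39.95</price></book>"
      + "</bookstore>";

    // Evaluates the expression and returns the text content of each selected node.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/bookstore/book[1]/title"));
        System.out.println(select("/bookstore/book[last()]/title"));
        System.out.println(select("//title[@lang='en']"));
    }
}
```

Note that a predicate compares attribute values exactly: with this data, //title[@lang='eng'] selects nothing, because the attribute value is 'en'.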

Selecting Several Paths


By using the | operator in an XPath expression you can select several paths.
Examples
The following table lists some path expressions and the result of each expression:

Path Expression                    Result
//book/title | //book/price        Selects all the title and price elements of all book elements
//title | //price                  Selects all the title and price elements in the document
/bookstore/book/title | //price    Selects all the title elements of the book element in the
                                   bookstore element and all the price elements in the document

XPath Axes:
An axis defines a node-set relative to the current node.

AxisName              Result
ancestor              Selects all ancestors (parent, grandparent, and so on) of the current node
ancestor-or-self      Selects all ancestors (parent, grandparent, and so on) of the current node
                      and the current node itself
attribute             Selects all attributes of the current node
child                 Selects all children of the current node
descendant            Selects all descendants (children, grandchildren, and so on) of the current node
descendant-or-self    Selects all descendants (children, grandchildren, and so on) of the current node
                      and the current node itself
following             Selects everything in the document after the closing tag of the current node
following-sibling     Selects all siblings after the current node
namespace             Selects all namespace nodes of the current node
parent                Selects the parent of the current node
preceding             Selects everything in the document that is before the start tag of the current node
preceding-sibling     Selects all siblings before the current node
self                  Selects the current node
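An axis is written in a path step as axisname::nodetest. The sketch below evaluates a few such steps from Java. It is an illustration only: the class name XPathAxes and the helper select() are hypothetical, and the XML is a shortened bookstore document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XPathAxes {

    static final String XML =
        "<bookstore>"
      + "<book category=\"COOKING\"><title>Everyday Italian</title></book>"
      + "<book category=\"CHILDREN\"><title>Harry Potter</title></book>"
      + "<book category=\"WEB\"><title>Learning XML</title></book>"
      + "</bookstore>";

    // Evaluates the expression and returns the text content of each selected node.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Titles of the books that follow the first book.
        System.out.println(select("/bookstore/book[1]/following-sibling::book/title"));
        // Category attribute of the book that is the parent of a given title.
        System.out.println(select("//title[.='Harry Potter']/parent::book/attribute::category"));
        // Titles that appear before a given title in the document.
        System.out.println(select("//title[.='Harry Potter']/preceding::title"));
    }
}
```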

XPath Operators:
An XPath expression returns either a node-set, a string, a Boolean, or a number. The operators that can be used in XPath expressions are:

Operator  Description                           Example                        Return value
|         Computes the union of two node-sets   //book | //cd                  Returns a node-set with all book and cd elements
+         Addition                              6 + 4                          10
-         Subtraction                           6 - 4                          2
*         Multiplication                        6 * 4                          24
div       Division                              8 div 4                        2
=         Equal                                 price=9.80                     true if price is 9.80, false if price is 9.90
!=        Not equal                             price!=9.80                    true if price is 9.90, false if price is 9.80
<         Less than                             price<9.80                     true if price is 9.00, false if price is 9.80
<=        Less than or equal to                 price<=9.80                    true if price is 9.00, false if price is 9.90
>         Greater than                          price>9.80                     true if price is 9.90, false if price is 9.80
>=        Greater than or equal to              price>=9.80                    true if price is 9.90, false if price is 9.70
or        or                                    price=9.80 or price=9.70       true if price is 9.80, false if price is 9.50
and       and                                   price>9.00 and price<9.90      true if price is 9.80, false if price is 8.50
mod       Modulus (division remainder)          5 mod 2                        1
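The return type of an expression can be requested explicitly when it is evaluated from Java. The sketch below evaluates some of the operators as numbers and Booleans. The class name XPathOperators and the helpers number() and bool() are hypothetical, and the one-book document with a price of 9.80 is an assumption chosen to match the table.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XPathOperators {

    static final String XML = "<book><price>9.80</price></book>";

    private static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
    }

    // Evaluates the expression and returns its result as a number.
    static double number(String expression) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates the expression and returns its result as a Boolean.
    static boolean bool(String expression) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(number("6 + 4"));
        System.out.println(number("8 div 4"));
        System.out.println(number("5 mod 2"));
        System.out.println(bool("/book/price = 9.80"));
        System.out.println(bool("/book/price > 9.00 and /book/price < 9.90"));
    }
}
```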

Summary
This session provides an idea about XPath.

Test Your Understanding


1. What is XPath?
2. What are the different types of nodes in XPath?

Exercises
From the given XML file retrieve all the child elements using path expressions.
XML File:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>

<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Session 14: XQuery


Learning Objectives
After completing this session, you will be able to:

Write XQuery terms and syntax

Perform XQuery selecting and filtering

XQuery
XQuery was designed to query XML data. XQuery is also known as XML Query.

What is XQuery?

XQuery is the language for querying XML data.

XQuery for XML is like SQL for databases.

XQuery is built on XPath expressions.

XQuery is supported by the major database engines (from IBM, Oracle, Microsoft,
and so on).

XQuery is a language for finding and extracting elements and attributes from XML
documents.

XQuery Terms: In XQuery, there are seven kinds of nodes, which are element, attribute, text,
namespace, processing-instruction, comment, and document (root) nodes. XML documents are
treated as trees of nodes. The root of the tree is called the document node.
Atomic values: Atomic values are nodes with no children or parent.
Items: Items are atomic values or nodes.

Relationship of Nodes
Parent: Each element and attribute has one parent
Children: Element nodes may have zero, one or more children
Siblings: Nodes that have the same parent
Ancestors: A node's parent, the parent's parent, and so on
Descendants: A node's children, the children's children, and so on
XQuery syntax:

XQuery is case-sensitive

XQuery elements, attributes, and variables must be valid XML names

An XQuery string value can be in single or double quotes

An XQuery variable is defined with a $ followed by a name, for example $bookstore

XQuery comments are delimited by (: and :), for example (: XQuery Comment :)

XQuery Comparisons
In XQuery there are two ways of comparing values, which are as follows:

General comparisons: =, !=, <, <=, >, >=

Value comparisons: eq, ne, lt, le, gt, ge

Selecting and Filtering Elements


The following FLWOR expression selects and filters elements. Its clauses are described after the example.
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

for: (optional) binds a variable to each item returned by the in expression

let: (optional) assigns a value to a variable

where: (optional) specifies a condition

order by: (optional) specifies the sort-order of the result

return: specifies what to return in the result

The for clause: The for clause binds a variable to each item returned by the in expression. The
for clause results in iteration. There can be multiple for clauses in the same FLWOR
expression.
To loop a specific number of times in a for clause, you may use the to keyword:
for $x in (1 to 5)
return <test>{$x}</test>

The at keyword can be used to count the iteration:


for $x at $i in doc("books.xml")/bookstore/book/title
return <book>{$i}. {data($x)}</book>

The let clause: The let clause allows variable assignments and it avoids repeating the same
expression many times. The let clause does not result in iteration.
let $x := (1 to 5)
return <test>{$x}</test>

The where clause: The where clause is used to specify one or more criteria for the result.
where $x/price>30 and $x/price<100

The order by clause: The order by clause is used to specify the sort order of the result. Here
you want to order the result by category and title.
for $x in doc("books.xml")/bookstore/book
order by $x/@category, $x/title
return $x/title

The return clause: The return clause specifies what is to be returned.


for $x in doc("books.xml")/bookstore/book
return $x/title
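The Java platform has no built-in XQuery processor, so the steps of a FLWOR expression can only be imitated there. The sketch below is a rough Java analogue, not XQuery: it mirrors the first expression in this session (books with a price above 30, ordered by title) with an XPath filter and a Java sort. The class name FlworAnalogue and the helper expensiveTitles() are hypothetical, and the embedded XML is a shortened books.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class FlworAnalogue {

    static final String XML =
        "<bookstore>"
      + "<book><title>Everyday Italian</title><price>30.00</price></book>"
      + "<book><title>Harry Potter</title><price>29.99</price></book>"
      + "<book><title>XQuery Kick Start</title><price>49.99</price></book>"
      + "<book><title>Learning XML</title><price>39.95</price></book>"
      + "</bookstore>";

    // Returns the titles of books priced above minPrice, sorted by title.
    static List<String> expensiveTitles(double minPrice) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // for + where: select the books whose price exceeds the limit
            NodeList books = (NodeList) xpath.evaluate(
                "/bookstore/book[price > " + minPrice + "]",
                doc, XPathConstants.NODESET);

            // return: keep only the title of each selected book
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                titles.add(xpath.evaluate("title", books.item(i)));
            }

            // order by: plain string ordering, which may differ from XQuery collations
            Collections.sort(titles);
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(expensiveTitles(30));
    }
}
```

A dedicated XQuery processor would run the FLWOR expression directly; the analogue is only meant to show which clause does which job.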

Summary
You have learned that XQuery was designed to query anything that can appear as XML,
including databases. You have also learned how to query XML data with FLWOR
expressions, and how to construct XHTML (eXtensible Hypertext Markup Language) output from
the collected data.

Test Your Understanding


1. What is XQuery?
2. What are the different types of nodes in XQuery?

Exercises
Using XQuery retrieve the child elements from the following XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>

<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Session 16: XSLT


Learning Objectives
After completing this session, you will be able to:

Define XSL

Identify XSLT elements

XSL
XSL stands for eXtensible Stylesheet Language.

What is XSLT?

XSLT stands for XSL Transformations

XSLT is the most important part of XSL

XSLT transforms an XML document into another document, such as XML, XHTML, HTML, or plain text

XSLT uses XPath to navigate in XML documents

XSLT 1.0 has been supported by the major web browsers and by standalone XSLT processors

The root element that declares the document to be an XSL stylesheet is <xsl:stylesheet> or
<xsl:transform>. The two names are synonyms, and either can be used.

The correct way to declare an XSL stylesheet according to the W3C XSLT Recommendation is:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

The <xsl:template> element


The <xsl:template> element defines a template: a set of rules that is applied when a node
matching its match attribute is found.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td>.</td>
<td>.</td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Because an XSL stylesheet is itself an XML document, it normally begins with an XML declaration:
<?xml version="1.0" encoding="ISO-8859-1"?>.
The next element, <xsl:stylesheet>, defines that this document is an XSLT stylesheet
document (along with the version number and XSLT namespace attributes).
The <xsl:template> element defines a template. The match="/" attribute associates the
template with the root of the XML source document.
The content inside the <xsl:template> element defines some HTML to write to the output.
The last two lines define the end of the template and the end of the style sheet.

The <xsl:value-of> element


The <xsl:value-of> element extracts the value of a selected node and adds it to the output
stream of the transformation:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td><xsl:value-of select="catalog/cd/title"/></td>
<td><xsl:value-of select="catalog/cd/artist"/></td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

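The value of the select attribute is an XPath expression. An XPath expression works like
navigating a file system, where a forward slash (/) selects subdirectories. In XSLT 1.0,
<xsl:value-of> outputs only the first node that the expression selects, so the table above
contains a single data row. The <xsl:for-each> element, described next, produces one row for
every cd element.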
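
Path-style navigation can be tried outside an XSLT processor. The following Python sketch uses the standard xml.etree.ElementTree module, which supports a limited subset of XPath; it is an analogue, not XSLT. The catalog content is a hypothetical placeholder, since the handout does not list the catalog file.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the titles and artists are placeholders.
CATALOG_XML = """<catalog>
  <cd><title>Title B</title><artist>Artist Z</artist></cd>
  <cd><title>Title A</title><artist>Artist Y</artist></cd>
</catalog>"""

# fromstring returns the root element (catalog), so the path starts at cd.
# In the stylesheet the path starts at the document root: catalog/cd/title.
catalog = ET.fromstring(CATALOG_XML)

# First match only, similar to what <xsl:value-of> outputs in XSLT 1.0.
first_title = catalog.findtext("cd/title")

# Every match, which is what a loop over the node-set would visit.
all_titles = [title.text for title in catalog.findall("cd/title")]

print(first_title)
print(all_titles)
```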

The <xsl:for-each> element


The <xsl:for-each> element loops over every node in a specified node-set and applies its
content to each one:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

To sort the output, add an <xsl:sort> element as the first child of the <xsl:for-each>
element in the XSL file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<xsl:sort select="artist"/>
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
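
The loop-and-sort behaviour of the stylesheet above can be sketched in Python with the standard xml.etree.ElementTree module. This is an analogue, not XSLT; the catalog content is a hypothetical placeholder, and plain string comparison stands in for the default text ordering of <xsl:sort>.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the titles and artists are placeholders.
CATALOG_XML = """<catalog>
  <cd><title>Title B</title><artist>Artist Z</artist></cd>
  <cd><title>Title A</title><artist>Artist Y</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# <xsl:for-each select="catalog/cd">: visit every cd element.
cds = catalog.findall("cd")

# <xsl:sort select="artist"/>: visit them in ascending order of artist.
cds_by_artist = sorted(cds, key=lambda cd: cd.findtext("artist"))

# One table row per cd, as in the body of the for-each.
rows = [
    "<tr><td>{}</td><td>{}</td></tr>".format(cd.findtext("title"), cd.findtext("artist"))
    for cd in cds_by_artist
]
print("\n".join(rows))
```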

The <xsl:choose> element


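The <xsl:choose> element is used in conjunction with <xsl:when> and <xsl:otherwise> to
express multiple conditional tests. The <xsl:when> elements are tested in order, and only the
first one whose test is true produces output. The <xsl:otherwise> element is optional and
applies when no test is true.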
Syntax:
<xsl:choose>
<xsl:when test="expression">
... some output ...
</xsl:when>
<xsl:otherwise>
... some output ....
</xsl:otherwise>
</xsl:choose>
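
The first-match behaviour of <xsl:choose> resembles an if / else chain. The following Python sketch is an analogue only; the price thresholds and the labels are hypothetical and do not come from the handout.

```python
def price_label(price):
    """Return the label for a price, testing the branches in order."""
    if price > 10:       # first <xsl:when test="price &gt; 10">
        return "expensive"
    if price > 5:        # second <xsl:when test="price &gt; 5">
        return "moderate"
    return "cheap"       # <xsl:otherwise>


# A price of 12.5 also satisfies the second test, but only the first
# matching branch is used.
labels = [price_label(p) for p in (12.5, 7.0, 3.0)]
print(labels)
```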

Summary
You have learned how to use XSLT to transform XML documents into other formats, such as
XHTML. You have learned how to add elements and attributes to, or remove them from, the output
file. You have also learned how to rearrange and sort elements, perform tests, and make
decisions about which elements to hide and display.

Test Your Understanding


1. What is XSLT?
2. What are the different elements of XSLT?


Exercises
Transform the given XML document using an XSL stylesheet.
XML File:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Glossary
API: Application Programming Interface
DOM: Document Object Model
DTD: Document Type Definition
JAXP: Java API for XML Processing
SAX: Simple API for XML
URI: Uniform Resource Identifier
XML: eXtensible Markup Language
XSD: XML Schema Definition
XSL: eXtensible Stylesheet Language
XSLT: eXtensible Stylesheet Language Transformations



