Topic 3
Module SE-C-03
XML
Supported by:
Joint MSc curriculum in software engineering European Union TEMPUS Project CD_JEP-18035-2003
Version: April 28, 2006
A bit of History
1989 - "Information Management: A Proposal" is written and circulated by Tim Berners-Lee of CERN (European Laboratory for Particle Physics). He proposed a hypertext system including HTML and HTTP
1990 - Berners-Lee's proposal is reformulated and the name World Wide Web (WEB, WWW) is coined.
1993 - Marc Andreessen releases the alpha version of Mosaic.
Pre-XML: HTML
Problems with HTML
primarily presentation
hard to derive meaning from the markup
fixed tag set
static
Pre-XML: SGML
SGML - Standard Generalized Markup Language
Working standards draft: 1980
Goal: allow text editing, formatting, and information retrieval systems to share documents
What Is XML?
XML stands for Extensible Markup Language (often written as eXtensible Markup Language to justify the acronym).
Goal: combine the power of SGML (extensibility) with the simplicity of HTML.
1998: XML 1.0 standard published.
XML is a set of rules for defining semantic tags that break a document into parts and identify the different parts of the document.
It is a meta-markup language that defines a syntax used to define other domain-specific, semantic, structured markup languages.
Its value as a data interchange language quickly became evident.
HTML - example
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD>
<TITLE>Beginning ASP 3.0</TITLE>
</HEAD>
<BODY>
<B>Beginning ASP 3.0</B>
<H3>ISBN 1-861003-38-2</H3>
<H4>Authors</H4>
<H4>Brian Francis, Chris Ullman, Dave Sussman, John Kauffman</H4>
<P>US $49.99</P>
<P>ASP is an advanced technique for dynamically creating Web site content.</P>
</BODY>
</HTML>
SE-C-05 System Integration 6
XML - example
<?xml version="1.0"?>
<books>
  <book>
    <title>Beginning ASP 3.0</title>
    <ISBN>ISBN 1-861003-38-2</ISBN>
    <authors>
      <author_name>Brian Francis</author_name>
      <author_name>Chris Ullman</author_name>
      <author_name>Dave Sussman</author_name>
      <author_name>John Kauffman</author_name>
    </authors>
    <description>Server side scripting technologies</description>
    <price>US $49.99</price>
  </book>
</books>
XML, however, is a meta-markup language. It's a language in which you make up the tags you need as you go along.
These tags must be organized according to certain general principles, but they're quite flexible in their meaning. You don't have to force your data to fit into paragraphs, list items, strong emphasis, or other very general categories.
Advantages of XML
Instead of generic tags like <dt> and <li>, this listing uses meaningful tags like <books>, <book>, <authors>, and <ISBN>. This has a number of advantages, including that it's easier for a human to read the source code to determine what the author intended. XML markup also makes it easier for automated robots to locate all of the books in the document. In HTML, robots can't tell more than that an element is a dt. They cannot determine whether that dt represents a song title, a definition, or just some designer's favorite means of indenting text. In fact, a single document may well contain dt elements with all three meanings.
Self-Describing Data
At a higher level, XML is self-describing. Suppose you're an information archaeologist in the 23rd century and you encounter this chunk of XML code on an old floppy disk that has survived the ravages of time:
<PERSON ID="p1100" SEX="M">
  <NAME>
    <GIVEN>Judson</GIVEN>
    <SURNAME>McDaniel</SURNAME>
  </NAME>
  <BIRTH>
    <DATE>21 Feb 1834</DATE>
  </BIRTH>
  <DEATH>
    <DATE>9 Dec 1905</DATE>
  </DEATH>
</PERSON>
Plain Text
Since XML is not a binary format, you can create and edit files with anything from a standard text editor to a visual development environment. That makes it easy to debug your programs, and makes it useful for storing small amounts of data. An XML front end to a database makes it possible to efficiently store large amounts of XML data as well. So XML provides scalability for anything from small configuration files to a company-wide data repository.
Data Identification
XML tells you what kind of data you have, not how to display it. Because the markup tags identify the information and break up the data into parts, an email program can process it, a search program can look for messages sent to particular people, and an address book can extract the address information from the rest of the message. Because the different parts of the information have been identified, they can be used in different ways by different applications.
Hierarchical
XML documents benefit from their hierarchical structure. Hierarchical document structures are, in general, faster to access because you can drill down to the part you need, like stepping through a table of contents. They are also easier to rearrange, because each piece is delimited. In a document, for example, you could move a heading to a new location and drag everything under it along with the heading, instead of having to page down to make a selection, cut, and then paste the selection into a new location.
Usage of XML
[Figure: uses of XML. Labels: XMLHTTP; XML text file; transformation into pure HTML; ADO 2.1 file; transformation into HTML with data islands; transformation into an arbitrary format; ADO 2.1 ODBC call]
Usage of XML
[Figure: a server delivers XML that is rendered as HTML view #1 and HTML view #2. Sources shown for the server: MF, DB, and XML received from other application]
Fundamental concepts
XML Documents
The precise meaning of "XML document" is defined by the XML 1.0 specification published by the World Wide Web Consortium (W3C). This specification provides a detailed BNF grammar defining exactly what is and is not an XML document. Anything that satisfies the document production in that BNF grammar and adheres to the fifteen well-formedness constraints is an XML document. Anything that does not is not an XML document. Well-formedness is the minimum requirement for an XML document: a parser must reject a document that is not well-formed.
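As a rough illustration of "parsers cannot read it", the sketch below asks the JAXP parser bundled with the JDK whether a string is well-formed. The class name WellFormedCheck and the sample strings are made up for this example; treating every parse failure as "not well-formed" is a simplifying assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original slides.
final class WellFormedCheck {

    /** Returns true if the text parses as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and ignores the rest,
            // so nothing is printed to the console on failure.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Simplification: any failure counts as "not well-formed".
            return false;
        }
    }

    public static void main(String[] args) {
        // Properly nested elements.
        System.out.println(isWellFormed("<person><name>Alan</name></person>"));
        // The name element is never closed.
        System.out.println(isWellFormed("<person><name>Alan</person>"));
    }
}
```

The second call fails because the end tag does not match the open element, which is one of the well-formedness constraints.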
Attribute values must be quoted: <a href="newpage.html">
XML is case sensitive: <TAG> and <Tag> are treated differently. (Convention: use lower case.)
More Rules
A document begins with:
an XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
and a DocType declaration:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
A tree diagram
[Figure: tree diagram of a Person element with one Name child and three Profession children: computer scientist, mathematician, cryptographer]
Narrative documents
XML can also be used for more free-form, narrative documents such as business reports, magazine articles, student essays, short stories, web pages, and so forth, as shown by the following example:
<biography>
<name>
<first_name>Alan</first_name>
<last_name>Turing</last_name>
</name>
was one of the first people to truly deserve the name <emphasize>computer scientist</emphasize>. Although his contributions to the field are too numerous to list, his best-known are the eponymous <emphasize>Turing Test</emphasize> and <emphasize>Turing Machine</emphasize>.
<definition>The <term>Turing Test</term> is to this day the standard test for determining whether a computer is truly intelligent. This test has yet to be passed.</definition>
<definition>The <term>Turing Machine</term> is an abstract finite state automaton with infinite memory that can be proven equivalent to any other finite state automaton with arbitrarily large memory. Thus what is true for a Turing machine is true for all equivalent machines no matter how implemented.</definition>
<name>
<last_name>Turing</last_name>
</name>
was also an accomplished <profession>mathematician</profession> and <profession>cryptographer</profession>. His assistance was crucial in helping the Allies decode the German Enigma machine. He committed suicide on
<date><month>June</month> <day>7</day>, <year>1954</year></date>
after being convicted of homosexuality and forced to take female hormone injections.
</biography>
XML Applications
XML applications limit the very flexible rules of XML to a finite set of elements of certain types.
For example, DocBook is an XML application designed for technical manuscripts. Elements it defines include book, chapter, para, sect1, sect2, programlisting, and several hundred others. When writing a DocBook document, you have to use these elements, and you have to use them in certain ways. For instance, a sect2 element can be a child of a sect1 but not a child of a sect3 or a chapter.
Scalable Vector Graphics (SVG) is an XML application for line art. Elements it defines include line, circle, ellipse, polygon, polyline, and so forth. All SVG documents are XML documents, but not all XML documents are SVG documents.
An XML application can have a schema that defines what is and is not a legal document for that application. Schemas can be written in a variety of languages including Document Type Definitions (DTDs), the W3C XML Schema Language, RELAX NG, Schematron, and numerous others.
Attributes
Attributes are name-value pairs associated with elements:
<Subtotal currency='USD'>393.85</Subtotal>
Attributes are unordered. There is no difference between these two elements:
<Tax rate="7.0" currency="USD">27.57</Tax>
<Tax currency="USD" rate="7.0">27.57</Tax>
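A small sketch of what "unordered" means in practice: both Tax elements are parsed with the JDK's DOM parser and their attributes are copied into sorted maps, which then compare equal. The class name AttributeOrder and the helper attributesOf are invented for this illustration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original slides.
final class AttributeOrder {

    /** Parses a one-element document and returns its attributes by name. */
    static Map<String, String> attributesOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NamedNodeMap attrs = doc.getDocumentElement().getAttributes();
            Map<String, String> result = new TreeMap<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> first =
            attributesOf("<Tax rate=\"7.0\" currency=\"USD\">27.57</Tax>");
        Map<String, String> second =
            attributesOf("<Tax currency=\"USD\" rate=\"7.0\">27.57</Tax>");
        // Same names and values, regardless of the order in the source text.
        System.out.println(first.equals(second));
    }
}
```

Code that depends on the order in which attributes were written is therefore relying on something XML does not promise.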
Attributes
<person>
<name first="Alan" last="Turing"/> <profession value="computer scientist"/> <profession value="mathematician"/> <profession value="cryptographer"/>
</person>
XML-data model
[Figure: node kinds in the XML data model: Document, Element, Attributes]
An XML document is transformed into a tree with elements as nodes and the values of the elements as leaves.
XML Example
<car type="auto" year="2001">
  <producer>Opel</producer>
  <model>Astra</model>
  <price/>
</car>
[Figure: tree for the car example. The car node has attributes type (auto) and year (2001) and child elements producer (Opel), model (Astra), and price (empty)]
<first_name>Alan</first_name>
<last_name>Turing</last_name>
</person>
was one of the first people to truly deserve the name <emphasize>computer scientist</emphasize>. Although his contributions to the field were too numerous to list, his best-known are the eponymous
<emphasize xlink:type="simple" xlink:href="http://cogsci.ucsd.edu/~asaygin/tt/ttest.html">Turing Test</emphasize>
and
<emphasize xlink:type="simple" xlink:href="http://mathworld.wolfram.com/TuringMachine.html">Turing Machine</emphasize>.
<last_name>Turing</last_name> was also an accomplished <profession>mathematician</profession> and <profession>cryptographer</profession>. His assistance was crucial in
XML Declaration
Most XML documents begin with an XML declaration
Comments
<!-- Please make sure this order goes out ASAP! -->
Processing Instructions
Processing instructions are used to tell particular software how it should handle an XML document after the document has been parsed. Generally, processing instructions are used for metainformation that may apply to documents from many different domains and XML vocabularies. For instance, the most common processing instruction, xml-stylesheet, tells a browser or other formatter where it can find the stylesheet it should apply to the document.
Processing Instructions
<?php
mysql_connect("database.unc.edu", "clerk", "password");
$result = mysql("HR", "SELECT LastName, FirstName FROM Employees ORDER BY LastName, FirstName");
$i = 0;
while ($i < mysql_numrows($result)) {
  $fields = mysql_fetch_row($result);
  echo "<person>$fields[1] $fields[0] </person>\r\n";
  $i++;
}
mysql_close();
?>
Element Declarations
Basic form of an element declaration:
<!ELEMENT element_name content_specification>
An element that contains parsed character data but no child elements of any type:
<!ELEMENT phone_number (#PCDATA)>
An element with one child element:
<!ELEMENT fax (phone_number)>
An element with a sequence of child elements:
<!ELEMENT name (first_name, last_name)>
Choices
Sometimes one instance of an element may contain one kind of child, and another instance may contain a different child. This can be indicated with a choice.
Examples:
<!ELEMENT methodResponse (params | fault)>
<!ELEMENT digit (zero | one | two | three | four | five | six | seven | eight | nine)>
<!ELEMENT circle (center, (radius | diameter))>
<!ELEMENT name (last_name | (first_name, ((middle_name+, last_name) | (last_name?))))>
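The circle declaration above can be tried out with a validating parser. The sketch below wraps it in an internal DTD subset and validates small documents with the JDK's JAXP parser. The class name DtdChoiceCheck, the #PCDATA declarations for center, radius, and diameter, and the sample values are assumptions added only to make the example complete.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of the original slides.
final class DtdChoiceCheck {

    // Internal DTD subset; only the circle declaration comes from the slide.
    static final String DTD =
        "<!DOCTYPE circle ["
        + "<!ELEMENT circle (center, (radius | diameter))>"
        + "<!ELEMENT center (#PCDATA)>"
        + "<!ELEMENT radius (#PCDATA)>"
        + "<!ELEMENT diameter (#PCDATA)>"
        + "]>";

    /** Returns true if the circle element is valid against the DTD above. */
    static boolean isValid(String circleXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored here */ }
                // Validity errors are recoverable by default; make them fatal.
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + circleXml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<circle><center>0,0</center><radius>5</radius></circle>"));
        System.out.println(isValid(
            "<circle><center>0,0</center><radius>5</radius><diameter>10</diameter></circle>"));
    }
}
```

The choice group admits exactly one of radius or diameter after center, so the second document is rejected.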
Mixed Content
Examples:
<!ELEMENT definition (#PCDATA | term)*>
<!ELEMENT paragraph (#PCDATA | name | profession | footnote | emphasize | date)*>
Empty Element
<!ELEMENT image EMPTY>
Valid examples:
<image source="bus.jpg" width="152" height="345" alt="Alan Turing standing in front of a bus" />
<image source="bus.jpg" width="152" height="345" alt="Alan Turing standing in front of a bus"></image>
ANY
<!ELEMENT page ANY>
This declaration says that a page element can contain any content including mixed content, child elements, and even other page elements. The children that actually appear in the page elements' content in the document must still be declared in element declarations of their own. ANY does not allow you to use undeclared elements.
Attribute Declarations
As well as declaring its elements, a valid document must declare all the elements' attributes. This is done with ATTLIST declarations. A single ATTLIST can declare multiple attributes for a single element type.
Examples:
<!ATTLIST image source CDATA #REQUIRED>
<!ATTLIST image source CDATA #REQUIRED
                width  CDATA #REQUIRED
                height CDATA #REQUIRED
                alt    CDATA #IMPLIED>
Attribute Types
CDATA, NMTOKEN, NMTOKENS, Enumeration, ENTITY, ENTITIES, ID, IDREF, IDREFS, NOTATION
XML schema
XML schema
An XML schema is an XML document containing a formal description of what comprises a valid XML document. A W3C XML Schema Language schema is an XML schema written in the particular syntax recommended by the W3C.
However, DTDs do not provide fine control over the format and data types of element and attribute values.
Schema Basics
The following example shows a very simple well-formed XML document.
Example: addressdoc.xml
<?xml version="1.0"?>
<fullName>Scott Means</fullName>
Assuming that the fullName element can only contain a simple string value, the schema for this document would look like:
Example: address-schema.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="fullName" type="xs:string"/>
</xs:schema>
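To see the schema at work, the sketch below validates documents against the address-schema.xsd content using the javax.xml.validation API that ships with the JDK. Holding the schema in a string constant and the class name SchemaCheck are choices made for this illustration; a real application would more likely load the .xsd file from disk.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

// Hypothetical helper, not part of the original slides.
final class SchemaCheck {

    // Same content as address-schema.xsd, inlined to keep the sketch self-contained.
    static final String XSD =
        "<?xml version=\"1.0\"?>"
        + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"fullName\" type=\"xs:string\"/>"
        + "</xs:schema>";

    /** Returns true if the document is valid according to the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as exceptions.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A simple string value is accepted.
        System.out.println(isValid("<fullName>Scott Means</fullName>"));
        // Child elements are not allowed inside an xs:string element.
        System.out.println(isValid("<fullName><first>Scott</first></fullName>"));
    }
}
```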
The Document Object Model (DOM) is the most common tree-based API. JDOM and DOM4J are Java-only alternatives.
XML APIs
XML processors make the structure and contents of XML documents available to applications through APIs
Event-based APIs
notify application through parsing events e.g., the SAX call-back interfaces
DOM Level 1: W3C Recommendation, Oct. 1998
DOM Level 2: W3C Recommendation, Nov. 2000
DOM Level 3: W3C Working Draft (January 2002)
The Document Object Model (DOM) is a language- and platform-independent object framework for manipulating structured documents.
The DOM structures a document as a hierarchy of Node objects.
The Node interface is the base interface for every member of a DOM document tree. It exposes attributes common to every type of document object and provides a few simple methods to retrieve type-specific information.
This interface also exposes all methods used to query, insert, and remove objects from the document hierarchy.
The Node interface makes it easier to build general-purpose tree-manipulation routines that are not dependent on specific document element types.
<invoice>
  <invoicepage form="00" type="estimatedbill">
    <addressee>
      <addressdata>
        <name>Tijana Petrovic</name>
        <address>
          <streetaddress>Beogradska 14</streetaddress>
          <postoffice>18000 NIS</postoffice>
        </address>
      </addressdata>
    </addressee>
    ...
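[Figure: DOM tree of the invoice fragment, showing the Document node, Element nodes such as streetaddress and postoffice, their Text nodes (Tijana Petrovic, Beogradska 14, 18000 NIS), and the Attributes of an element]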
DOM Level 2
Level 1: basic representation and manipulation of document structure and content (no access to the contents of a DTD)
Level 2 adds:
  support for namespaces
  accessing elements by ID attribute values
  optional features:
    interfaces to document views and style sheets
    an event model (for, say, user actions on elements)
    methods for traversing the document tree and manipulating regions of a document (e.g., selected by the user of an editor)
Loading and writing of documents not specified (-> Level 3)
[Figure: DOM interface hierarchy. Labels recovered: DocumentType, EntityReference, Notation, Entity, ProcessingInstruction]
getNodeType, getNodeValue, getOwnerDocument
getParentNode, hasChildNodes, getChildNodes
getFirstChild, getLastChild, getPreviousSibling, getNextSibling
hasAttributes, getAttributes
appendChild(newChild), insertBefore(newChild, refChild)
replaceChild(newChild, oldChild), removeChild(oldChild)
http://java.sun.com/webservices/jaxp/dist/1.1/docs/api/org/w3c/dom/Node.html
Node.getNodeValue():
content of a text node, value of an attribute, ...; null for an Element (!!) (in XSLT/XPath: the full textual content)
Node.getNodeType():
numeric constants (1, 2, 3, ..., 12) for ELEMENT_NODE, ATTRIBUTE_NODE, TEXT_NODE, ..., NOTATION_NODE
Accessing a specific node, or iterating over all nodes of a NodeList. E.g., Java code to process all children:
NodeList children = node.getChildNodes();
for (int i = 0; i < children.getLength(); i++)
    process(children.item(i));
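The same child loop can be made recursive to walk a whole tree. The sketch below counts element nodes with the JDK's DOM implementation and also shows that getNodeValue() is null for an element. The class name DomWalk, its helper methods, and the address sample are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original slides.
final class DomWalk {

    /** Parses XML text into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    /** Counts the element nodes in the subtree rooted at node. */
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countElements(children.item(i));
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<address><streetaddress>Beogradska 14</streetaddress>"
            + "<postoffice>18000 NIS</postoffice></address>");
        System.out.println(countElements(doc));
        // The value of an element node is null; its text lives in a child Text node.
        Node street = doc.getDocumentElement().getFirstChild();
        System.out.println(street.getNodeValue());
        System.out.println(street.getFirstChild().getNodeValue());
    }
}
```

Because the routine only uses the Node interface, it works the same for documents, elements, and text nodes.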
http://java.sun.com/webservices/jaxp/dist/1.1/docs/api/org/w3c/dom/package-summary.html
DOM: Implementations
Java-based parsers e.g. IBM XML4J, Apache Xerces, Apache Crimson
MS IE5 browser: COM programming interfaces for C/C++ and MS Visual Basic, ActiveX object programming interfaces for script languages
XML::DOM (Perl implementation of DOM Level 1)
Others? Non-parser-implementations? (Participation of vendors of different kinds of systems in DOM WG has been active.)
A Java-DOM Example
A stand-alone toy application BuildXml
either creates a new db document with two person elements, or adds them to an existing db document
Technical basis
DOM support in Sun JAXP
native XML document initialisation and storage methods of the JAXP 1.1 default parser (Apache Crimson)
try { // to get a new DocumentBuilder:
    DocumentBuilder builder = factory.newDocumentBuilder();
    if (!docFile.exists()) { // create new doc
        document = builder.newDocument();
        // add a comment:
        Comment comment = document.createComment("A simple personnel list");
        document.appendChild(comment);
        // Create the root element:
        root = document.createElement("db");
        document.appendChild(root);
    } else { // or, if docFile already exists, access the existing doc
        try { // to parse docFile
            document = builder.parse(docFile);
            root = document.getDocumentElement();
        } catch (SAXException se) {
            System.err.println("Error: " + se.getMessage());
            System.exit(1);
        }
        /* A similar catch for a possible IOException */
    }
http://www.brics.dk/~amoeller/XML/programming/saxapi.html
What is SAX?
Simple API for XML
Originally developed through the xml-dev mailing list after Peter Murray-Rust got tired of working with numerous non-interchangeable XML parsers
Primarily a Java API, but there are implementations in most languages
Unfortunately they differ quite a lot, so you will need to get a feeling for your particular implementation
The full specification is not so 'simple', but a useful application usually only requires a small subset of SAX
Currently at version 2.0; version 2.0 was needed to provide support for namespaces
<metadataList>
<metadata name="age" value="27"/>
<metadata name="colour" value="blue"/>
</metadataList>
<property title="bigness">
</array> -------------------> endElement
Example?
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import org.apache.xerces.parsers.SAXParser;

public class Flour extends DefaultHandler {
    float amount = 0;

    // Called by the parser for every start tag.
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        if (namespaceURI.equals("http://recipes.org") && localName.equals("ingredient")) {
            String n = atts.getValue("", "name");
            if ("flour".equals(n)) { // also safe when 'name' is missing
                String a = atts.getValue("", "amount"); // assume 'amount' exists
                amount = amount + Float.valueOf(a).floatValue();
            }
        }
    }

    public static void main(String[] args) {
        Flour f = new Flour();
        SAXParser p = new SAXParser();
        p.setContentHandler(f);
        try { p.parse(args[0]); }
        catch (Exception e) { e.printStackTrace(); }
        System.out.println(f.amount);
    }
}
Events in example
start document
processing instruction: dsd
starting element: collection
-character data, length 3
-starting element: description
--character data, length 47
-end element: description
-character data, length 3
-starting element: recipe
--character data, length 5
...
-end element: recipe
-character data, length 1
end element: collection
end document
SAX 2 Interfaces
Defines interfaces for standard routines and callbacks
ContentHandler: the most important interface
Attributes: the second most important interface
ContentHandler Interface
This is the bit that handles the most important events
The methods that handle the events are referred to as callback routines
The parser fires events according to what it finds in the XML file
Every time it encounters an event it calls the appropriate callback routine
ContentHandler Interface
The most important piece of SAX
startDocument()
endDocument()
startElement(uri, localName, qName, attrs)
endElement(uri, localName, qName)
characters(text, start, length)
ignorableWhitespace(text, start, length)
startPrefixMapping(prefix, uri)
endPrefixMapping(prefix)
processingInstruction(target, data)
setDocumentLocator(locator)
skippedEntity(name)
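A minimal sketch of these callbacks in action, using the SAX parser bundled with the JDK rather than Xerces directly. The handler records each event as a line of text. The class name SaxEventLog and the tiny collection/recipe sample are made up for this illustration, and a parser is free to deliver one run of text in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the original slides.
final class SaxEventLog extends DefaultHandler {

    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("start document"); }

    @Override public void endDocument() { events.add("end document"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("start element: " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end element: " + qName);
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // May be called more than once for a single run of text.
        events.add("character data, length " + length);
    }

    /** Parses the text and returns the events in the order they were fired. */
    static List<String> eventsFor(String xml) {
        try {
            SaxEventLog handler = new SaxEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        for (String event : eventsFor("<collection><recipe>Bread</recipe></collection>")) {
            System.out.println(event);
        }
    }
}
```

Extending DefaultHandler means only the callbacks of interest need to be overridden; the rest default to doing nothing.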
Attributes Interface
Specifies methods for accessing individual attributes
An Attributes object is passed to the startElement routine
The order of the attributes is unimportant and need not be the same as in the XML document; however, we can refer to attributes by their index for convenience
Uses overloaded functions allowing us to refer to an attribute:
by its qualified name
or by its URI and its local name
or by an index (for convenience)
Attributes Interface
getLength()
getQName(index)
getURI(index)
getLocalName(index)
getIndex(uri, localPart)
getIndex(qualifiedName)
getType(uri, localName)
getType(qualifiedName)
getType(index)
getValue(uri, localName)
getValue(qualifiedName)
getValue(index)
ErrorHandler Interface
ErrorHandler
Allows you to catch errors and deal with them appropriately
Again you have to write these functions; the ErrorHandler only specifies the interface
warning(exception): ambiguities/non-XML errors
error(exception): non-fatal errors (invalid documents)
fatalError(exception): fatal errors (not well-formed)
Cons
A document is intuitive; an event is less so
There is no default storage model
Because SAX only stores a small part of the document in memory at any given time, it is up to you to keep track of where you are in the document
If the document has a lot of structure, and later events need to know about earlier events, you can find yourself storing a lot of data in memory
Namespaces?
Since element names in XML are not predefined, a name conflict will occur when two different documents use the same element names. Namespaces are a simple and straightforward way to distinguish names used in XML documents, no matter where they come from.
<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
Using Namespaces
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
  <f:name>Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
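How a program tells the two table elements apart can be sketched with a namespace-aware DOM parser from the JDK: the local name is the same, but the namespace URI differs. The class name NamespaceDemo and the helper expandedName are invented for this example; the {uri}local notation is just a common way of writing an expanded name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original slides.
final class NamespaceDemo {

    /** Returns the root element's name as {namespace-uri}local-name. */
    static String expandedName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String html = "<h:table xmlns:h=\"http://www.w3.org/TR/html4/\"/>";
        String furniture = "<f:table xmlns:f=\"http://www.w3schools.com/furniture\"/>";
        // Same local name "table", different namespaces.
        System.out.println(expandedName(html));
        System.out.println(expandedName(furniture));
        // The prefix itself is irrelevant; only the URI it is bound to matters.
        String other = "<x:table xmlns:x=\"http://www.w3.org/TR/html4/\"/>";
        System.out.println(expandedName(html).equals(expandedName(other)));
    }
}
```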
Default Namespaces
<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
  <h:head><h:title>Book Review</h:title></h:head>
  <h:body>
    <xdc:bookreview>
      <xdc:title>XML: A Primer</xdc:title>
      <h:table>
        <h:tr align="center">
          <h:td>Author</h:td> <h:td>Price</h:td> <h:td>Pages</h:td> <h:td>Date</h:td>
        </h:tr>
        <h:tr align="left">
          <h:td><xdc:author>Simon St. Laurent</xdc:author></h:td>
          <h:td><xdc:price>31.98</xdc:price></h:td>
          <h:td><xdc:pages>352</xdc:pages></h:td>
          <h:td><xdc:date>1998/01</xdc:date></h:td>
        </h:tr>
      </h:table>
    </xdc:bookreview>
  </h:body>
</h:html>
Example
<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
  <h:head><h:title>Book Review</h:title></h:head>
  <h:body>
    <xdc:bookreview>
      <xdc:title h:style="font-family: sans-serif;">XML: A Primer</xdc:title>
      <h:table>
        <h:tr align="center">
          <h:td>Author</h:td> <h:td>Price</h:td> <h:td>Pages</h:td> <h:td>Date</h:td>
        </h:tr>
        <h:tr align="left">
          <h:td><xdc:author>Simon St. Laurent</xdc:author></h:td>
          <h:td><xdc:price>31.98</xdc:price></h:td>
          <h:td><xdc:pages>352</xdc:pages></h:td>
          <h:td><xdc:date>1998/01</xdc:date></h:td>
        </h:tr>
      </h:table>
    </xdc:bookreview>
  </h:body>
</h:html>
<html xmlns="http://www.w3.org/HTML/1998/html4"
      xmlns:xdc="http://www.xml.com/books">
  <head><title>Book Review</title></head>
  <body>
    <xdc:bookreview>
      <xdc:title>XML: A Primer</xdc:title>
      <table>
        <tr align="center">
          <td>Author</td> <td>Price</td> <td>Pages</td> <td>Date</td>
        </tr>
        <tr align="left">
          <td><xdc:author>Simon St. Laurent</xdc:author></td>
          <td><xdc:price>31.98</xdc:price></td>
          <td><xdc:pages>352</xdc:pages></td>
          <td><xdc:date>1998/01</xdc:date></td>
        </tr>
      </table>
    </xdc:bookreview>
  </body>
</html>
XSL: example
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <!-- The stylesheet element declares the XSL namespace (a namespace declaration, not a DOCTYPE) -->
</xsl:stylesheet>
XSL: example
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <!-- Find root of DOM tree and apply templates -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
XSL: example
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
XSL: example
<xsl:template match="clrcstructures">
  <xsl:apply-templates/>
</xsl:template>
XSL: example
...
<xsl:template match="department">
  <P><xsl:value-of select="deptabbrev"/> (<xsl:value-of select="deptname"/>)</P>
  <UL><xsl:apply-templates/></UL>
</xsl:template>
XSL: example
...
<!-- Match W3G and display differently -->
<xsl:template match="group">
  <P>
    <xsl:choose>
      <xsl:when test="structureID[. = 'ITDISEW3G']">
        <B><xsl:value-of select="grpname"/></B>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="grpname"/>
      </xsl:otherwise>
    </xsl:choose>
  </P>
</xsl:template>
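Templates like these can be run from Java through the javax.xml.transform API in the JDK. The sketch below is only an approximation of the slides' stylesheet: it uses the XSLT 1.0 namespace http://www.w3.org/1999/XSL/Transform, which JAXP processors expect, instead of the older WD-xsl working-draft namespace shown above, and it keeps just a simplified department template. The class name XsltSketch and the sample department data are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original slides.
final class XsltSketch {

    // Simplified stylesheet in the XSLT 1.0 namespace (an assumption; the
    // slides use the older WD-xsl namespace).
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"department\">"
        + "<P><xsl:value-of select=\"deptabbrev\"/>"
        + " (<xsl:value-of select=\"deptname\"/>)</P>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet above to the given XML text. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch.
        System.out.println(transform(
            "<department><deptabbrev>ITD</deptabbrev>"
            + "<deptname>IT Department</deptname></department>"));
    }
}
```

No template for the root is needed here: the built-in rules descend from the document node until the department template matches.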
Example
XML file: people.xml
XSLT file: people.xsl
Formatted XML: peoplexsl.xml
[Figure: schema diagram with Store and Book. Labels recovered: name, phone, sid, title, bid]
Oracle also has a utility, called XSQL pages, that allows you to embed SQL statements in a skeletal XML document. A request from a browser to this document is directed to a servlet, which executes the SQL statements and enters the results into the page before delivering it back to the browser. Formatting of the page can then be controlled on the client side using either CSS or client-side XSLT.
CONCLUSIONS
XML...
Can be pre-generated or created on-the-fly at the server
Provides an easily parsable, platform- and vendor-neutral format for transmitting data
Needs no network etc. support beyond the Web browser (or other transport)
Provides the means to validate and transform the data at the desktop
Data validation
Even in a perfect world there can be problems in:
Generation
Transmission
Editing/processing after reception
Metadata - internal
Basic provided by Document Type Definitions (DTDs)
Simplified from the SGML version
Provides basic structure and cardinality
What is a mediator ?
A complex software component that integrates and transforms data from one or several sources using a declarative specification
Two main contexts:
Data conversion: converts data between two different models
CONCLUSION
XML is now achieving momentum
The scientific data management community should be at the forefront of its use:
users will demand it
advantages of widely available tools
advantages in integration
advantages in information management
Sources
http://sax.sourceforge.net/ - Official SAX Web site
http://www.xml.com/pub/a/1999/01/namespaces.html - Tutorial on namespaces