An Introduction to XML and Web Technologies
XML Programming

Anders Møller & Michael I. Schwartzbach
© 2006 Addison-Wesley

Objectives

ƒ How XML may be manipulated from general-purpose programming languages
ƒ How streaming may be useful for handling large documents

General Purpose XML Programming

ƒ Needed for:
• domain-specific applications
• implementing new generic tools
ƒ Important constituents:
• parsing XML documents into XML trees
• navigating through XML trees
• manipulating XML trees
• serializing XML trees as XML documents

The JDOM Framework

ƒ An implementation of generic XML trees in Java
ƒ Nodes are represented as classes and interfaces
ƒ DOM is a language-independent alternative
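
A minimal sketch of the four constituents named on the General Purpose XML Programming slide (parsing, navigating, manipulating, serializing). It uses the DOM implementation bundled with the JDK rather than JDOM, so that it runs without extra libraries. The class name DomRoundTrip, the method doubleSugar, and the sample recipe string are assumptions made for illustration, not part of the original material.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomRoundTrip {

    // Doubles the amount of every ingredient named "sugar" and returns
    // the modified document as a string (hypothetical helper).
    static String doubleSugar(String xml) {
        try {
            // 1. Parse the XML document into an XML tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // 2. Navigate: visit all ingredient elements in the tree.
            NodeList ingredients = doc.getElementsByTagName("ingredient");
            for (int i = 0; i < ingredients.getLength(); i++) {
                Element e = (Element) ingredients.item(i);
                // 3. Manipulate: rewrite the amount attribute in place.
                if ("sugar".equals(e.getAttribute("name"))) {
                    double amount = Double.parseDouble(e.getAttribute("amount"));
                    e.setAttribute("amount", Double.toString(2 * amount));
                }
            }

            // 4. Serialize the tree back to an XML document.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML processing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<recipe><ingredient name=\"sugar\" amount=\"0.5\"/>"
                   + "<ingredient name=\"flour\" amount=\"2\"/></recipe>";
        System.out.println(doubleSugar(xml));
    }
}
```

The JDOM examples on the following slides perform the same four steps with SAXBuilder, getContent/getDescendants, setAttribute, and XMLOutputter.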

JDOM Classes and Interfaces

ƒ The abstract class Content has subclasses:
• Comment
• DocType
• Element
• EntityRef
• ProcessingInstruction
• Text
ƒ Other classes are Attribute and Document
ƒ The Parent interface describes Document and Element

A Simple Example

int xmlHeight(Element e) {
  java.util.List contents = e.getContent();
  java.util.Iterator i = contents.iterator();
  int max = 0;
  while (i.hasNext()) {
    Object c = i.next();
    int h;
    if (c instanceof Element)
      h = xmlHeight((Element)c);
    else
      h = 1;
    if (h > max)
      max = h;
  }
  return max+1;
}


Another Example

static void doubleSugar(Document d)
    throws DataConversionException {
  Namespace rcp =
    Namespace.getNamespace("http://www.brics.dk/ixwt/recipes");
  Filter f = new ElementFilter("ingredient",rcp);
  java.util.Iterator i = d.getDescendants(f);
  while (i.hasNext()) {
    Element e = (Element)i.next();
    if (e.getAttributeValue("name").equals("sugar")) {
      double amount = e.getAttribute("amount").getDoubleValue();
      e.setAttribute("amount",new Double(2*amount).toString());
    }
  }
}

A Final Example (1/3)

ƒ Modify all elements like
<ingredient name="butter" amount="0.25" unit="cup"/>
into a more elaborate version:
<ingredient name="butter">
  <ingredient name="cream" unit="cup" amount="0.5" />
  <preparation>
    Churn until the cream turns to butter.
  </preparation>
</ingredient>

A Final Example (2/3)

void makeButter(Element e) throws DataConversionException {
  Namespace rcp =
    Namespace.getNamespace("http://www.brics.dk/ixwt/recipes");
  java.util.ListIterator i = e.getChildren().listIterator();
  while (i.hasNext()) {
    Element c = (Element)i.next();
    if (c.getName().equals("ingredient") &&
        c.getAttributeValue("name").equals("butter")) {
      Element butter = new Element("ingredient",rcp);
      butter.setAttribute("name","butter");

A Final Example (3/3)

      Element cream = new Element("ingredient",rcp);
      cream.setAttribute("name","cream");
      cream.setAttribute("unit",c.getAttributeValue("unit"));
      double amount = c.getAttribute("amount").getDoubleValue();
      cream.setAttribute("amount",new Double(2*amount).toString());
      butter.addContent(cream);
      Element churn = new Element("preparation",rcp);
      churn.addContent("Churn until the cream turns to butter.");
      butter.addContent(churn);
      i.set((Element)butter);
    } else {
      makeButter(c);
    }
  }
}


Parsing and Serializing

public class ChangeDescription {
  public static void main(String[] args) {
    try {
      SAXBuilder b = new SAXBuilder();
      Document d = b.build(new File("recipes.xml"));
      Namespace rcp =
        Namespace.getNamespace("http://www.brics.dk/ixwt/recipes");
      d.getRootElement().getChild("description",rcp)
                        .setText("Cool recipes!");
      XMLOutputter outputter = new XMLOutputter();
      outputter.output(d,System.out);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

Validation (DTD)

public class ValidateDTD {
  public static void main(String[] args) {
    try {
      SAXBuilder b = new SAXBuilder();
      b.setValidation(true);
      String msg = "No errors!";
      try {
        Document d = b.build(new File(args[0]));
      } catch (JDOMParseException e ) {
        msg = e.getMessage();
      }
      System.out.println(msg);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

Validation (XML Schema)

public class ValidateXMLSchema {
  public static void main(String[] args) {
    try {
      SAXBuilder b = new SAXBuilder();
      b.setValidation(true);
      b.setProperty(
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
        "http://www.w3.org/2001/XMLSchema");
      String msg = "No errors!";
      try {
        Document d = b.build(new File(args[0]));
      } catch (JDOMParseException e ) {
        msg = e.getMessage();
      }
      System.out.println(msg);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

XPath Evaluation

void doubleSugar(Document d) throws JDOMException {
  XPath p = XPath.newInstance("//rcp:ingredient[@name='sugar']");
  p.addNamespace("rcp","http://www.brics.dk/ixwt/recipes");
  java.util.Iterator i = p.selectNodes(d).iterator();
  while (i.hasNext()) {
    Element e = (Element)i.next();
    double amount = e.getAttribute("amount").getDoubleValue();
    e.setAttribute("amount",new Double(2*amount).toString());
  }
}

XSLT Transformation

public class ApplyXSLT {
  public static void main(String[] args) {
    try {
      SAXBuilder b = new SAXBuilder();
      Document d = b.build(new File(args[0]));
      XSLTransformer t = new XSLTransformer(args[1]);
      Document h = t.transform(d);
      XMLOutputter outputter = new XMLOutputter();
      outputter.output(h,System.out);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

Business Cards

<cardlist xmlns="http://businesscard.org"
          xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <title>
    <xhtml:h1>My Collection of Business Cards</xhtml:h1>
    containing people from <xhtml:em>Widget Inc.</xhtml:em>
  </title>
  <card>
    <name>John Doe</name>
    <title>CEO, Widget Inc.</title>
    <email>john.doe@widget.com</email>
    <phone>(202) 555-1414</phone>
  </card>
  <card>
    <name>Joe Smith</name>
    <title>Assistant</title>
    <email>thrall@widget.com</email>
  </card>
</cardlist>
Business Card Editor

Class Representation

class Card {
  public String name,title,email,phone,logo;

  public Card(String name, String title, String email,
              String phone, String logo) {
    this.name=name;
    this.title=title;
    this.email=email;
    this.phone=phone;
    this.logo=logo;
  }
}


From JDOM to Classes

Vector doc2vector(Document d) {
  Vector v = new Vector();
  Iterator i = d.getRootElement().getChildren().iterator();
  while (i.hasNext()) {
    Element e = (Element)i.next();
    String phone = e.getChildText("phone",b);
    if (phone==null) phone="";
    Element logo = e.getChild("logo",b);
    String uri;
    if (logo==null) uri="";
    else uri=logo.getAttributeValue("uri");
    Card c = new Card(e.getChildText("name",b),
                      e.getChildText("title",b),
                      e.getChildText("email",b),
                      phone, uri);
    v.add(c);
  }
  return v;
}

From Classes to JDOM (1/2)

Document vector2doc() {
  Element cardlist = new Element("cardlist");
  for (int i=0; i<cardvector.size(); i++) {
    Card c = (Card)cardvector.elementAt(i);
    if (c!=null) {
      Element card = new Element("card",b);
      Element name = new Element("name",b);
      name.addContent(c.name); card.addContent(name);
      Element title = new Element("title",b);
      title.addContent(c.title); card.addContent(title);
      Element email = new Element("email",b);
      email.addContent(c.email); card.addContent(email);

From Classes to JDOM (2/2)

      if (!c.phone.equals("")) {
        Element phone = new Element("phone",b);
        phone.addContent(c.phone);
        card.addContent(phone);
      }
      if (!c.logo.equals("")) {
        Element logo = new Element("logo",b);
        logo.setAttribute("uri",c.logo);
        card.addContent(logo);
      }
      cardlist.addContent(card);
    }
  }
  return new Document(cardlist);
}

A Little Bit of Code

void addCards() {
  cardpanel.removeAll();
  for (int i=0; i<cardvector.size(); i++) {
    Card c = (Card)cardvector.elementAt(i);
    if (c!=null) {
      Button b = new Button(c.name);
      b.setActionCommand(String.valueOf(i));
      b.addActionListener(this);
      cardpanel.add(b);
    }
  }
  this.pack();
}


The Main Application

public BCedit(String cardfile) {
  super("BCedit");
  this.cardfile=cardfile;
  try {
    cardvector = doc2vector(
      new SAXBuilder().build(new File(cardfile)));
  } catch (Exception e) { e.printStackTrace(); }
  // initialize the user interface
  ...
}

XML Data Binding

ƒ The methods doc2vector and vector2doc are tedious to write
ƒ XML data binding provides tools to:
• map schemas to class declarations
• automatically generate unmarshalling code
• automatically generate marshalling code
• automatically generate validation code

Binding Compilers

ƒ Which schemas are supported?
ƒ Fixed or customizable binding?
ƒ Does roundtripping preserve information?
ƒ What is the support for validation?
ƒ Are the generated classes implemented by some generic framework?

The JAXB Framework

ƒ It supports most of XML Schema
ƒ The binding is customizable (annotations)
ƒ Roundtripping is almost complete
ƒ Validation is supported during unmarshalling or on demand
ƒ JAXB only specifies the interfaces to the generated classes


Business Card Schema (1/3)

<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:b="http://businesscard.org"
        targetNamespace="http://businesscard.org"
        elementFormDefault="qualified">

  <element name="cardlist" type="b:cardlist_type"/>
  <element name="card" type="b:card_type"/>
  <element name="name" type="string"/>
  <element name="email" type="string"/>
  <element name="phone" type="string"/>
  <element name="logo" type="b:logo_type"/>

  <attribute name="uri" type="anyURI"/>

Business Card Schema (2/3)

  <complexType name="cardlist_type">
    <sequence>
      <element name="title" type="b:cardlist_title_type"/>
      <element ref="b:card" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>

  <complexType name="cardlist_title_type" mixed="true">
    <sequence>
      <any namespace="http://www.w3.org/1999/xhtml"
           minOccurs="0" maxOccurs="unbounded"
           processContents="lax"/>
    </sequence>
  </complexType>

Business Card Schema (3/3)

  <complexType name="card_type">
    <sequence>
      <element ref="b:name"/>
      <element name="title" type="string"/>
      <element ref="b:email"/>
      <element ref="b:phone" minOccurs="0"/>
      <element ref="b:logo" minOccurs="0"/>
    </sequence>
  </complexType>

  <complexType name="logo_type">
    <attribute ref="b:uri" use="required"/>
  </complexType>
</schema>

The org.businesscard Package

ƒ The binding compiler generates:
• Cardlist, CardlistType
• CardlistImpl, CardlistTypeImpl
• ...
• Logo, LogoType
• LogoImpl, LogoTypeImpl
• ObjectFactory
ƒ The Title element is not a class, since it is declared as a local element.


The CardType Interface

public interface CardType {
  java.lang.String getEmail();
  void setEmail(java.lang.String value);
  org.businesscard.LogoType getLogo();
  void setLogo(org.businesscard.LogoType value);
  java.lang.String getTitle();
  void setTitle(java.lang.String value);
  java.lang.String getName();
  void setName(java.lang.String value);
  java.lang.String getPhone();
  void setPhone(java.lang.String value);
}

A Little Bit of Code

void addCards() {
  cardpanel.removeAll();
  Iterator i = cardlist.iterator();
  int j = 0;
  while (i.hasNext()) {
    Card c = (Card)i.next();
    Button b = new Button(c.getName());
    b.setActionCommand(String.valueOf(j++));
    b.addActionListener(this);
    cardpanel.add(b);
  }
  this.pack();
}

The Main Application

public BCedit(String cardfile) {
  super("BCedit");
  this.cardfile=cardfile;
  try {
    jc = JAXBContext.newInstance("org.businesscard");
    Unmarshaller u = jc.createUnmarshaller();
    cl = (Cardlist)u.unmarshal(
      new FileInputStream(cardfile)
    );
    cardlist = cl.getCard();
  } catch (Exception e) { e.printStackTrace(); }
  // initialize the user interface
  ...
}

Streaming XML

ƒ JDOM and JAXB keep the entire XML tree in memory
ƒ Huge documents can only be streamed:
• movies on the Internet
• Unix file commands using pipes
ƒ What is streaming for XML documents?
ƒ The SAX framework has the answer...


Parsing Events

ƒ View the XML document as a stream of events:
• the document starts
• a start tag is encountered
• an end tag is encountered
• a namespace declaration is seen
• some whitespace is seen
• character data is encountered
• the document ends
ƒ The SAX tool observes these events
ƒ It reacts by calling corresponding methods specified by the programmer

Tracing All Events (1/4)

public class Trace extends DefaultHandler {
  int indent = 0;

  void printIndent() {
    for (int i=0; i<indent; i++) System.out.print("-");
  }

  public void startDocument() {
    System.out.println("start document");
  }

  public void endDocument() {
    System.out.println("end document");
  }


Tracing All Events (2/4)

  public void startElement(String uri, String localName,
                           String qName, Attributes atts) {
    printIndent();
    System.out.println("start element: " + qName);
    indent++;
  }

  public void endElement(String uri, String localName,
                         String qName) {
    indent--;
    printIndent();
    System.out.println("end element: " + qName);
  }

Tracing All Events (3/4)

  public void ignorableWhitespace(char[] ch, int start, int length) {
    printIndent();
    System.out.println("whitespace, length " + length);
  }

  public void processingInstruction(String target, String data) {
    printIndent();
    System.out.println("processing instruction: " + target);
  }

  public void characters(char[] ch, int start, int length){
    printIndent();
    System.out.println("character data, length " + length);
  }


Tracing All Events (4/4)

  public static void main(String[] args) {
    try {
      Trace tracer = new Trace();
      XMLReader reader = XMLReaderFactory.createXMLReader();
      reader.setContentHandler(tracer);
      reader.parse(args[0]);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

Output for the Recipe Collection

start document
start element: rcp:collection
-character data, length 3
-start element: rcp:description
--character data, length 44
--character data, length 3
-end element: rcp:description
-character data, length 3
-start element: rcp:recipe
--character data, length 5
--start element: rcp:title
---character data, length 42
...
--start element: rcp:nutrition
--end element: rcp:nutrition
--character data, length 3
-end element: rcp:recipe
-character data, length 1
end element: rcp:collection
end document

A Simple Streaming Example (1/2)

public class Height extends DefaultHandler {
  int h = -1;
  int max = 0;

  public void startElement(String uri, String localName,
                           String qName, Attributes atts) {
    h++; if (h > max) max = h;
  }

  public void endElement(String uri, String localName,
                         String qName) {
    h--;
  }

  public void characters(char[] ch, int start, int length){
    if (h+1 > max) max = h+1;
  }

A Simple Streaming Example (2/2)

  public static void main(String[] args) {
    try {
      Height handler = new Height();
      XMLReader reader = XMLReaderFactory.createXMLReader();
      reader.setContentHandler(handler);
      reader.parse(args[0]);
      System.out.println(handler.max);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

Comments on The Example

ƒ This version is less intuitive (stack-like style)
ƒ The JDOM version: java.lang.OutOfMemoryError on 18MB document
ƒ The SAX version handles 1.2GB in 51 seconds

SAX May Emulate JDOM (1/2)

public void startElement(String uri, String localName,
                         String qName, Attributes atts) {
  if (localName.equals("card")) card = new Element("card",b);
  else if (localName.equals("name"))
    field = new Element("name",b);
  else if (localName.equals("title"))
    field = new Element("title",b);
  else if (localName.equals("email"))
    field = new Element("email",b);
  else if (localName.equals("phone"))
    field = new Element("phone",b);
  else if (localName.equals("logo")) {
    field = new Element("logo",b);
    field.setAttribute("uri",atts.getValue("","uri"));
  }
}

SAX May Emulate JDOM (2/2)

public void endElement(String uri, String localName,
                       String qName) {
  if (localName.equals("card")) contents.add(card);
  else if (localName.equals("cardlist")) {
    Element cardlist = new Element("cardlist",b);
    cardlist.setContent(contents);
    doc = new Document(cardlist);
  } else {
    card.addContent(field);
    field = null;
  }
}

public void characters(char[] ch, int start, int length) {
  if (field!=null)
    field.addContent(new String(ch,start,length));
}

Using Contextual Information

ƒ Check forms beyond W3C validator:
• that all form input tags are inside form tags
• that all form tags have distinct name attributes
• that form tags are not nested
ƒ This requires us to keep information about the context of the current parsing event

Contextual Information in SAX (1/3)

public class CheckForms extends DefaultHandler {
  int formheight = 0;
  HashSet formnames = new HashSet();

  Locator locator;
  public void setDocumentLocator(Locator locator) {
    this.locator = locator;
  }

  void report(String s) {
    System.out.print(locator.getLineNumber());
    System.out.print(":");
    System.out.print(locator.getColumnNumber());
    System.out.println(" ---"+s);
  }

Contextual Information in SAX (2/3)

  public void startElement(String uri, String localName,
                           String qName, Attributes atts) {
    if (uri.equals("http://www.w3.org/1999/xhtml")) {
      if (localName.equals("form")) {
        if (formheight > 0) report("nested forms");
        String name = atts.getValue("","name");
        if (formnames.contains(name))
          report("duplicate form name");
        else
          formnames.add(name);
        formheight++;
      } else
      if (localName.equals("input") ||
          localName.equals("select") ||
          localName.equals("textarea"))
        if (formheight==0) report("form field outside form");
    }
  }
Contextual Information in SAX (3/3)

  public void endElement(String uri, String localName,
                         String qName) {
    if (uri.equals("http://www.w3.org/1999/xhtml"))
      if (localName.equals("form"))
        formheight--;
  }

  public static void main(String[] args) {
    try {
      CheckForms handler = new CheckForms();
      XMLReader reader = XMLReaderFactory.createXMLReader();
      reader.setContentHandler(handler);
      reader.parse(args[0]);
    } catch (Exception e) { e.printStackTrace(); }
  }
}

SAX Filters

ƒ A SAX application may be turned into a filter
ƒ Filters may be composed (as with pipes)
ƒ A filter is an event handler that may pass events along in the chain


A SAX Filter Example (1/4)

ƒ A filter to remove processing instructions:

class PIFilter extends XMLFilterImpl {
  public void processingInstruction(String target, String data)
      throws SAXException {}
}

A SAX Filter Example (2/4)

ƒ A filter to create unique id attributes:

class IDFilter extends XMLFilterImpl {
  int id = 0;
  public void startElement(String uri, String localName,
                           String qName, Attributes atts)
      throws SAXException {
    AttributesImpl idatts = new AttributesImpl(atts);
    idatts.addAttribute("","id","id","ID",
                        new Integer(id++).toString());
    super.startElement(uri,localName,qName,idatts);
  }
}

A SAX Filter Example (3/4)

ƒ A filter to count characters:

class CountFilter extends XMLFilterImpl {
  public int count = 0;
  public void characters(char[] ch, int start, int length)
      throws SAXException {
    count = count+length;
    super.characters(ch,start,length);
  }
}

A SAX Filter Example (4/4)

public class FilterTest {
  public static void main(String[] args) {
    try {
      FilterTest handler = new FilterTest();
      XMLReader reader = XMLReaderFactory.createXMLReader();
      PIFilter pi = new PIFilter();
      pi.setParent(reader);
      IDFilter id = new IDFilter();
      id.setParent(pi);
      CountFilter count = new CountFilter();
      count.setParent(id);
      count.parse(args[0]);
      System.out.println(count.count);
    } catch (Exception e) { e.printStackTrace(); }
  }
}


Pull vs. Push

ƒ SAX is known as a push framework
• the parser has the initiative
• the programmer must react to events
ƒ An alternative is a pull framework
• the programmer has the initiative
• the parser must react to requests
ƒ XML Pull is an example of a pull framework

Contextual Information in XMLPull (1/3)

public class CheckForms2 {
  static void report(XmlPullParser xpp, String s) {
    System.out.print(xpp.getLineNumber());
    System.out.print(":");
    System.out.print(xpp.getColumnNumber());
    System.out.println(" ---"+s);
  }

  public static void main (String args[])
      throws XmlPullParserException, IOException {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(true);
    factory.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, true);

    XmlPullParser xpp = factory.newPullParser();

    int formheight = 0;
    HashSet formnames = new HashSet();

Contextual Information in XMLPull (2/3)

    xpp.setInput(new FileReader(args[0]));
    int eventType = xpp.getEventType();
    while (eventType!=XmlPullParser.END_DOCUMENT) {
      if (eventType==XmlPullParser.START_TAG) {
        if (xpp.getNamespace().equals("http://www.w3.org/1999/xhtml")
            && xpp.getName().equals("form")) {
          if (formheight>0)
            report(xpp,"nested forms");
          String name = xpp.getAttributeValue("","name");
          if (formnames.contains(name))
            report(xpp,"duplicate form name");
          else
            formnames.add(name);
          formheight++;
        } else if (xpp.getName().equals("input") ||
                   xpp.getName().equals("select") ||
                   xpp.getName().equals("textarea"))
          if (formheight==0)
            report(xpp,"form field outside form");
      }

Contextual Information in XMLPull (3/3)

      else if (eventType==XmlPullParser.END_TAG) {
        if (xpp.getNamespace().equals("http://www.w3.org/1999/xhtml")
            && xpp.getName().equals("form"))
          formheight--;
      }
      eventType = xpp.next();
    }
  }
}
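
The same pull style can be sketched with StAX (javax.xml.stream), the pull parser that ships with the JDK. This is a swapped-in API, not the XMLPull library used on the slides. The class name CheckFormsStax, the method check, and the choice to collect messages in a list instead of printing line numbers are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class CheckFormsStax {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Returns the problems found, in document order (hypothetical helper).
    static List<String> check(String xml) {
        List<String> problems = new ArrayList<>();
        Set<String> formNames = new HashSet<>();
        int formHeight = 0;   // number of currently open form elements
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The program asks for the next event; the parser only reacts.
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && XHTML.equals(r.getNamespaceURI())) {
                    String name = r.getLocalName();
                    if (name.equals("form")) {
                        if (formHeight > 0) problems.add("nested forms");
                        // add() returns false if the name was seen before
                        if (!formNames.add(r.getAttributeValue(null, "name")))
                            problems.add("duplicate form name");
                        formHeight++;
                    } else if (name.equals("input") || name.equals("select")
                            || name.equals("textarea")) {
                        if (formHeight == 0)
                            problems.add("form field outside form");
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && XHTML.equals(r.getNamespaceURI())
                        && r.getLocalName().equals("form")) {
                    formHeight--;
                }
            }
            r.close();
        } catch (Exception e) {
            throw new IllegalStateException("XML processing failed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String xml = "<html xmlns=\"" + XHTML + "\"><input/>"
                   + "<form name=\"a\"><form name=\"a\"/></form></html>";
        for (String p : check(xml)) System.out.println(p);
    }
}
```

As in the XMLPull version, the context (formHeight, formNames) lives in ordinary local variables of the loop rather than in fields of a handler object.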


Using a Pull Parser

ƒ Not that different from the push version
ƒ More direct programming style
ƒ Smaller memory footprint
ƒ Pipelining with filter chains is not available (but may be simulated in languages with higher-order functions)

Streaming Transformations

ƒ SAX allows the programming of streaming applications "by hand"
ƒ XSLT allows high-level programming of applications
ƒ A broad spectrum of these could be streamed
ƒ But XSLT does not allow streaming...
ƒ Solution: use a domain-specific language for streaming transformations

STX

ƒ STX is a variation of XSLT suitable for streaming
• some features are not allowed
• but every STX application can be streamed
ƒ The differences reflect necessary limitations in the control flow

Similarities with XSLT

ƒ template
ƒ copy
ƒ value-of
ƒ if
ƒ else
ƒ choose
ƒ when
ƒ otherwise
ƒ text
ƒ element
ƒ attribute
ƒ variable
ƒ param
ƒ with-param
ƒ Most XSLT functions


Differences with XSLT

ƒ apply-templates is the main problem:
• allows processing to continue anywhere in the tree
• requires moving back and forth in the input file
• or storing the whole document
ƒ mutable variables to accumulate information

STXPath

ƒ A subset of XPath 2.0 used by STX
ƒ STXPath expressions:
• look like restricted XPath 2.0 expressions
• evaluate to sequences of nodes and atomic values
• but they have a different semantics

STXPath Syntax

ƒ Must use abbreviated XPath 2.0 syntax
ƒ The axes following and preceding are not available
ƒ Extra node tests: cdata() and doctype()

STXPath Semantics

ƒ Evaluate the corresponding XPath 2.0 expression
ƒ Restrict the result to those nodes that are on the ancestor axis
ƒ <A>
    <B/>
    <C><D/></C>
  </A>
ƒ Evaluate count(//B) with D as the context node
ƒ With XPath the result is 1
ƒ With STXPath the result is 0
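
The two results can be reproduced with the XPath engine bundled with the JDK. This is only a sketch: the JDK implements XPath 1.0, not STXPath, so the restriction to the ancestor stack is emulated here with the ancestor-or-self axis, which is an assumption about how to mimic the STXPath rule. The class name AncestorStackDemo and the method evalAtD are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AncestorStackDemo {
    // The example document from the slide.
    static final String XML = "<A><B/><C><D/></C></A>";

    // Evaluates a numeric XPath 1.0 expression with D as the context node.
    static double evalAtD(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node d = (Node) xpath.evaluate("//D", doc, XPathConstants.NODE);
            return (Double) xpath.evaluate(expr, d, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Ordinary XPath sees the whole tree, including the sibling branch B.
        System.out.println(evalAtD("count(//B)"));                 // prints 1.0
        // Emulated STXPath view: only D and its ancestors (C, A) are visible.
        System.out.println(evalAtD("count(ancestor-or-self::B)")); // prints 0.0
    }
}
```

B has already been consumed by the time a streaming processor reaches D, which is why it cannot be counted without buffering the document.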

Transformation Sheets

ƒ STX uses transform instead of stylesheet
ƒ apply-templates is not allowed
ƒ Processing is defined by:
• process-children
• process-siblings
• process-self
ƒ Only a single occurrence of process-children is allowed in each template (to enable streaming)

A Simple STX Example

ƒ Extract comments from recipes:

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0"
               xmlns:rcp="http://www.brics.dk/ixwt/recipes">
  <stx:template match="rcp:collection">
    <comments>
      <stx:process-children/>
    </comments>
  </stx:template>
  <stx:template match="rcp:comment">
    <comment><stx:value-of select="."/></comment>
  </stx:template>
</stx:transform>

SAX Version (1/2)

public class ExtractComments extends DefaultHandler {
  boolean chars = true;

  public void startElement(String uri, String localName,
                           String qName, Attributes atts) {
    if (uri.equals("http://www.brics.dk/ixwt/recipes")) {
      if

SAX Version (2/2)

  public void characters(char[] ch, int start, int length) {
    if (chars)
      System.out.print(new String(ch, start, length));
  }

  public void endElement(String uri, String localName,
                         String qName) {
    if (uri.equals("http://www.brics.dk/ixwt/recipes")) {
      if (localName.equals("collection"))
        System.out.print("</comments>");
      if (localName.equals("comment")) {
        System.out.print("</comment>");
        chars = false;
      }
    }
  }
}

The Ancestor Stack

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0">
  <stx:template match="*">
    <stx:message select="concat(count(//*),' ',local-name())"/>
    <stx:process-children/>
  </stx:template>
</stx:transform>

Input:
<A>
  <B/>
  <B><C/></B>
  <A/>
  <B><A><C/></A></B>
</A>

Output:
1 A
2 B
2 B
3 C
2 A
2 B
3 A
4 C

Using process-siblings

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0">
  <stx:template match="*">
    <stx:copy>
      <stx:process-children/>
      <stx:process-siblings/>
    </stx:copy>
  </stx:template>
</stx:transform>

Input:
<a>
  <b><c/></b>
  <d><e/></d>
</a>

Output:
<a>
  <b>
    <c/>
    <d><e/></d>
  </b>
</a>

Mutable Variables

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0"
               xmlns:rcp="http://www.brics.dk/ixwt/recipes">
  <stx:variable name="depth" select="0"/>
  <stx:variable name="maxdepth" select="0"/>

  <stx:template match="rcp:collection">
    <stx:process-children/>
    <maxdepth><stx:value-of select="$maxdepth"/></maxdepth>
  </stx:template>

  <stx:template match="rcp:ingredient">
    <stx:assign name="depth" select="$depth + 1"/>
    <stx:if test="$depth > $maxdepth">
      <stx:assign name="maxdepth" select="$depth"/>
    </stx:if>
    <stx:process-children/>
    <stx:assign name="depth" select="$depth - 1"/>
  </stx:template>
</stx:transform>

STX Version of CheckForms (1/2)

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0"
               xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <stx:variable name="formheight" select="0"/>
  <stx:variable name="formnames" select="'#'"/>

  <stx:template match="xhtml:form">
    <stx:if test="$formheight&gt;0">
      <stx:message select="'nested forms'"/>
    </stx:if>
    <stx:if test="contains($formnames,concat('#',@name,'#'))">
      <stx:message select="'duplicate form name'"/>
    </stx:if>
    <stx:assign name="formheight" select="$formheight + 1"/>
    <stx:assign name="formnames"
                select="concat($formnames,@name,'#')"/>
    <stx:process-children/>
    <stx:assign name="formheight" select="$formheight - 1"/>
  </stx:template>

STX Version of CheckForms (2/2)

  <stx:template match="xhtml:input|xhtml:select|xhtml:textarea">
    <stx:if test="$formheight=0">
      <stx:message select="'form field outside form'"/>
    </stx:if>
    <stx:process-children/>
  </stx:template>

</stx:transform>

Groups (1/2)

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0"
               strip-space="yes">
  <stx:template match="person">
    <person><stx:process-children/></person>
  </stx:template>

  <stx:template match="email">
    <emails><stx:process-self group="foo"/></emails>
  </stx:template>

Input:
<person>
  <email/><email/><email/>
  <phone/><phone/>
</person>

Output:
<person>
  <emails>
    <email/><email/><email/>
  </emails>
  <phone/><phone/>
</person>

Groups (2/2)

  <stx:group name="foo">
    <stx:template match="email">
      <email/>
      <stx:process-siblings while="email" group="foo"/>
    </stx:template>
  </stx:group>

  <stx:template match="phone">
    <phone/>
  </stx:template>
</stx:transform>

Input:
<person>
  <email/><email/><email/>
  <phone/><phone/>
</person>

Output:
<person>
  <emails>
    <email/><email/><email/>
  </emails>
  <phone/><phone/>
</person>

Limitations of Streaming

ƒ Something we will never write with STX:

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="mirror" match="/|@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="reverse(node())"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>


STX for Recipes (1/7)

<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns"
               version="1.0"
               xmlns:rcp="http://www.brics.dk/ixwt/recipes"
               xmlns="http://www.w3.org/1999/xhtml"
               strip-space="yes">

  <stx:template match="rcp:collection">
    <html>
      <stx:process-children/>
    </html>
  </stx:template>

  <stx:template match="rcp:description">
    <head>
      <title><stx:value-of select="."/></title>
      <link href="style.css" rel="stylesheet" type="text/css"/>
    </head>
  </stx:template>

STX for Recipes (2/7)

  <stx:template match="rcp:recipe">
    <body>
      <table border="1">
        <stx:process-self group="outer"/>
      </table>
    </body>
  </stx:template>

  <stx:group name="outer">
    <stx:template match="rcp:description">
      <tr>
        <td><stx:value-of select="."/></td>
      </tr>
    </stx:template>

STX for Recipes (3/7)

    <stx:template match="rcp:recipe">
      <tr>
        <td>
          <stx:process-children/>
        </td>
      </tr>
    </stx:template>

    <stx:template match="rcp:title">
      <h1><stx:value-of select="."/></h1>
    </stx:template>

    <stx:template match="rcp:date">
      <i><stx:value-of select="."/></i>
    </stx:template>

STX for Recipes (4/7)

    <stx:template match="rcp:ingredient" >
      <ul><stx:process-self group="inner"/></ul>
    </stx:template>

    <stx:template match="rcp:preparation">
      <ol><stx:process-children/></ol>
    </stx:template>

    <stx:template match="rcp:step">
      <li><stx:value-of select="."/></li>
    </stx:template>

    <stx:template match="rcp:comment">
      <ul>
        <li type="square"><stx:value-of select="."/></li>
      </ul>
    </stx:template>


STX for Recipes (5/7)

    <stx:template match="rcp:nutrition">
      <table border="2">
        <tr>
          <th>Calories</th><th>Fat</th>
          <th>Carbohydrates</th><th>Protein</th>
          <stx:if test="@alcohol"><th>Alcohol</th></stx:if>
        </tr>
        <tr>
          <td align="right"><stx:value-of select="@calories"/></td>
          <td align="right"><stx:value-of select="@fat"/></td>
          <td align="right"><stx:value-of select="@carbohydrates"/></td>
          <td align="right"><stx:value-of select="@protein"/></td>
          <stx:if test="@alcohol">
            <td align="right"><stx:value-of select="@alcohol"/></td>
          </stx:if>
        </tr>
      </table>
    </stx:template>
  </stx:group>

STX for Recipes (6/7)

  <stx:group name="inner">
    <stx:template match="rcp:ingredient">
      <stx:choose>
        <stx:when test="@amount">
          <li>
            <stx:if test="@amount!='*'">
              <stx:value-of select="@amount"/>
              <stx:text> </stx:text>
              <stx:if test="@unit">
                <stx:value-of select="@unit"/>
                <stx:if test="number(@amount)>number(1)">
                  <stx:text>s</stx:text>
                </stx:if>
                <stx:text> of </stx:text>
              </stx:if>
            </stx:if>
            <stx:text> </stx:text>
            <stx:value-of select="@name"/>
          </li>
        </stx:when>

STX for Recipes (7/7)

        <stx:otherwise>
          <li><stx:value-of select="@name"/></li>
          <stx:process-children group="outer"/>
        </stx:otherwise>
      </stx:choose>
      <stx:process-siblings while="rcp:ingredient" group="inner"/>
    </stx:template>
  </stx:group>

</stx:transform>

XML in Programming Languages

ƒ SAX: programmers react to parsing events
ƒ JDOM: a general data structure for XML trees
ƒ JAXB: a specific data structure for XML trees
ƒ These approaches are convenient
ƒ But no compile-time guarantees:
• about validity of the constructed XML (JDOM, JAXB)
• well-formedness of the constructed XML (SAX)
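
The SAX bullet can be made concrete with a small sketch. It assumes only the SAX parser bundled with the JDK (javax.xml.parsers); the class name SaxEventDemo and the event-log format are invented for illustration. The handler reacts to start and end events as the parser reports them, without ever building a tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: logs the element events of a document in order.
class SaxEventDemo {

    /** Parses the given XML text and returns one log entry per element event. */
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start " + qName);   // called when a start tag is seen
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);     // called when the matching end tag is seen
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // prints [start person, start email, end email, start phone, end phone, end person]
        System.out.println(events("<person><email/><phone/></person>"));
    }
}
```

If such a handler wrote tags to an output stream instead of a log, nothing in the Java type system would stop it from emitting unbalanced tags, which is the missing well-formedness guarantee noted for SAX.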


Type-Safe XML Programming Languages

ƒ With XML schemas as types
ƒ Type-checking now guarantees validity
ƒ An active research area

XDuce

ƒ A first-order functional language
ƒ XML trees are native values
ƒ Regular expression types (generalized DTDs)
ƒ Arguments and results are explicitly typed
ƒ Type inference for pattern variables
ƒ Compile-time type checking guarantees:
• XML navigation is safe
• generated XML is valid

XDuce Types for Recipes (1/2)

namespace rcp = "http://www.brics.dk/ixwt/recipes"

type Collection = rcp:collection[Description,Recipe*]
type Description = rcp:description[String]
type Recipe = rcp:recipe[@id[String]?,
                         Title,
                         Date,
                         Ingredient*,
                         Preparation,
                         Comment?,
                         Nutrition,
                         Related*]
type Title = rcp:title[String]
type Date = rcp:date[String]

XDuce Types for Recipes (2/2)

type Ingredient = rcp:ingredient[@name[String],
                                 @amount[String]?,
                                 @unit[String]?,
                                 (Ingredient*,Preparation)?]
type Preparation = rcp:preparation[Step*]
type Step = rcp:step[String]
type Comment = rcp:comment[String]
type Nutrition = rcp:nutrition[@calories[String],
                               @carbohydrates[String],
                               @fat[String],
                               @protein[String],
                               @alcohol[String]?]
type Related = rcp:related[@ref[String],String]


XDuce Types of Nutrition Tables

type NutritionTable = nutrition[Dish*]
type Dish = dish[@name[String],
                 @calories[String],
                 @fat[String],
                 @carbohydrates[String],
                 @protein[String],
                 @alcohol[String]]

From Recipes to Tables (1/3)

fun extractCollection(val c as Collection) : NutritionTable =
  match c with
    rcp:collection[Description, val rs]
      -> nutrition[extractRecipes(rs)]

fun extractRecipes(val rs as Recipe*) : Dish* =
  match rs with
    rcp:recipe[@..,
               rcp:title[val t],
               Date,
               Ingredient*,
               Preparation,
               Comment?,
               val n as Nutrition,
               Related*], val rest
      -> extractNutrition(t,n), extractRecipes(rest)
  | () -> ()

From Recipes to Tables (2/3)

fun extractNutrition(val t as String, val n as Nutrition) : Dish =
  match n with
    rcp:nutrition[@calories[val calories],
                  @carbohydrates[val carbohydrates],
                  @fat[val fat],
                  @protein[val protein],
                  @alcohol[val alcohol]]
      -> dish[@name[t],
              @calories[calories],
              @carbohydrates[carbohydrates],
              @fat[fat],
              @protein[protein],
              @alcohol[alcohol]]

From Recipes to Tables (3/3)

  | rcp:nutrition[@calories[val calories],
                  @carbohydrates[val carbohydrates],
                  @fat[val fat],
                  @protein[val protein]]
      -> dish[@name[t],
              @calories[calories],
              @carbohydrates[carbohydrates],
              @fat[fat],
              @protein[protein],
              @alcohol["0%"]]

let val collection = validate load_xml("recipes.xml") with Collection
let val _ = print(extractCollection(collection))
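
For contrast, here is a rough Java counterpart of the extraction above, written against the DOM API bundled with the JDK. It is only a sketch: the names NutritionTableDemo, parse and extract are hypothetical, and the input is assumed to be valid with respect to the recipe types. Unlike the XDuce version, the Java compiler checks nothing about the shape of the input or of the generated table; the optional alcohol attribute is handled by a run-time test that falls back to "0%", mirroring the second pattern.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class: builds a nutrition table from a recipe collection.
class NutritionTableDemo {
    static final String RCP = "http://www.brics.dk/ixwt/recipes";

    static DocumentBuilder builder() throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // needed for getElementsByTagNameNS
        return f.newDocumentBuilder();
    }

    static Document parse(String xml) throws Exception {
        return builder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Returns a new document <nutrition> with one <dish> per recipe. */
    static Document extract(Document recipes) throws Exception {
        Document out = builder().newDocument();
        Element table = out.createElement("nutrition");
        out.appendChild(table);
        NodeList rs = recipes.getElementsByTagNameNS(RCP, "recipe");
        for (int i = 0; i < rs.getLength(); i++) {
            Element r = (Element) rs.item(i);
            // Assumption: every recipe has a title and a nutrition element;
            // on invalid input these lookups yield null and fail at run time.
            Element title = (Element) r.getElementsByTagNameNS(RCP, "title").item(0);
            Element n = (Element) r.getElementsByTagNameNS(RCP, "nutrition").item(0);
            Element dish = out.createElement("dish");
            dish.setAttribute("name", title.getTextContent());
            for (String a : new String[] {"calories", "fat", "carbohydrates", "protein"}) {
                dish.setAttribute(a, n.getAttribute(a)); // "" if the attribute is absent
            }
            // Run-time counterpart of the second XDuce pattern: default to "0%".
            dish.setAttribute("alcohol",
                n.hasAttribute("alcohol") ? n.getAttribute("alcohol") : "0%");
            table.appendChild(dish);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Document in = parse(
            "<rcp:collection xmlns:rcp='" + RCP + "'><rcp:recipe>"
            + "<rcp:title>Fried Eggs with Bacon</rcp:title>"
            + "<rcp:nutrition calories='517' fat='64%' carbohydrates='0%' protein='36%'/>"
            + "</rcp:recipe></rcp:collection>");
        Element dish = (Element) extract(in).getDocumentElement().getFirstChild();
        System.out.println(dish.getAttribute("name") + ", alcohol " + dish.getAttribute("alcohol"));
    }
}
```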


XDuce Guarantees

ƒ The XDuce type checker determines that:
• every function returns a valid value
• every function argument is a valid value
• every match has an exhaustive collection of patterns
• every pattern matches some value
ƒ Clearly, this will eliminate many potential errors

XACT

ƒ A Java framework (like JDOM) but:
• it is based on immutable templates, which are sequences of XML trees containing named gaps
• XML trees are constructed by plugging gaps
• it has syntactic sugar for template constants
• XML is navigated using XPath
• an analyzer can guarantee at compile time that an XML expression is valid according to a given DTD
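
The idea of immutable templates with named gaps can be illustrated with a toy model in plain Java. This is not the XACT library: the class GapTemplate is hypothetical, it works on strings rather than XML trees, it does no escaping or validation, and its choice to drop unplugged gaps when rendering is an assumption of the sketch. It only shows that plugging returns a new template and leaves the original untouched.

```java
// Hypothetical toy model of a template with named gaps written as <[NAME]>.
class GapTemplate {
    private final String text; // never modified after construction

    GapTemplate(String text) {
        this.text = text;
    }

    /** Returns a NEW template in which every gap named 'gap' is replaced by 'value'. */
    GapTemplate plug(String gap, String value) {
        return new GapTemplate(text.replace("<[" + gap + "]>", value));
    }

    /** Plugs another template; gaps inside 'value' stay open in the result. */
    GapTemplate plug(String gap, GapTemplate value) {
        return plug(gap, value.text);
    }

    /** Renders the template, dropping gaps that are still open (an assumption of this sketch). */
    String render() {
        return text.replaceAll("<\\[[A-Za-z]+\\]>", "");
    }

    public static void main(String[] args) {
        GapTemplate list = new GapTemplate("<ul><[CARDS]></ul>");
        GapTemplate x = list;
        for (String name : new String[] {"Ann", "Bob"}) {
            // Each plugged item re-opens a CARDS gap so the list can keep growing.
            x = x.plug("CARDS", new GapTemplate("<li>" + name + "</li><[CARDS]>"));
        }
        System.out.println(x.render());    // <ul><li>Ann</li><li>Bob</li></ul>
        System.out.println(list.render()); // <ul></ul>  (the original is unchanged)
    }
}
```

Re-opening the gap inside the plugged value is the same trick used to grow a list one item at a time; immutability is what lets the original template be reused safely.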

Business Cards to Phone Lists (1/2)

import dk.brics.xact.*;
import java.io.*;

public class PhoneList {
  public static void main(String[] args) throws XactException {
    String[] map = {"c", "http://businesscard.org",
                    "h", "http://www.w3.org/1999/xhtml"};
    XML.setNamespaceMap(map);

    XML wrapper = [[<h:html>
                      <h:head>
                        <h:title><[TITLE]></h:title>
                      </h:head>
                      <h:body>
                        <h:h1><[TITLE]></h:h1>
                        <[MAIN]>
                      </h:body>
                    </h:html>]];

Business Cards to Phone Lists (2/2)

    XML cardlist = XML.get("file:cards.xml",
                           "file:businesscards.dtd",
                           "http://businesscard.org");
    XML x = wrapper.plug("TITLE", "My Phone List")
                   .plug("MAIN", [[<h:ul><[CARDS]></h:ul>]]);

    XMLIterator i = cardlist.select("//c:card[c:phone]").iterator();
    while (i.hasNext()) {
      XML card = i.next();
      x = x.plug("CARDS",
                 [[<h:li>
                     <h:b><{card.select("c:name/text()")}></h:b>,
                     phone: <{card.select("c:phone/text()")}>
                   </h:li>
                   <[CARDS]>]]);
    }
    System.out.println(x);
  }
}

XML API

ƒ constant(s) builds a template constant from s
ƒ x.plug(g,y) plugs the gap g with y
ƒ x.select(p) returns a template containing the sequence of targets of the XPath expression p
ƒ x.gapify(p,g) replaces the targets of p with gaps named g
ƒ get(u,d,n) parses a template from a URL with a DTD and a namespace
ƒ x.analyze(d,n) guarantees at compile time that x is valid given a DTD and a namespace

A Highly Structured Recipe

<rcp:recipe id="117">
  <rcp:title>Fried Eggs with Bacon</rcp:title>
  <rcp:date>Fri, 10 Nov 2004</rcp:date>
  <rcp:ingredient name="fried eggs">
    <rcp:ingredient name="egg" amount="2"/>
    <rcp:preparation>
      <rcp:step>Break the eggs into a bowl.</rcp:step>
      <rcp:step>Fry until ready.</rcp:step>
    </rcp:preparation>
  </rcp:ingredient>
  <rcp:ingredient name="bacon" amount="3" unit="strip"/>
  <rcp:preparation>
    <rcp:step>Fry the bacon until crispy.</rcp:step>
    <rcp:step>Serve with the eggs.</rcp:step>
  </rcp:preparation>
  <rcp:nutrition calories="517"
                 fat="64%" carbohydrates="0%" protein="36%"/>
</rcp:recipe>

A Flattened Recipe

<rcp:recipe id="117">
  <rcp:title>Fried Eggs with Bacon</rcp:title>
  <rcp:date>Fri, 10 Nov 2004</rcp:date>
  <rcp:ingredient name="egg" amount="2"/>
  <rcp:ingredient name="bacon" amount="3" unit="strip"/>
  <rcp:preparation>
    <rcp:step>Break the eggs into a bowl.</rcp:step>
    <rcp:step>Fry until ready.</rcp:step>
    <rcp:step>Fry the bacon until crispy.</rcp:step>
    <rcp:step>Serve with the eggs.</rcp:step>
  </rcp:preparation>
  <rcp:nutrition calories="517"
                 fat="64%" carbohydrates="0%" protein="36%"/>
</rcp:recipe>

A Recipe Flattener in XACT (1/2)

public class Flatten {
  static final String rcp = "http://www.brics.dk/ixwt/recipes";
  static final String[] map = { "rcp", rcp };

  static { XML.setNamespaceMap(map); }

  public static void main(String[] args) throws XactException {
    XML collection = XML.get("file:recipes.xml",
                             "file:recipes.dtd", rcp);
    XML recipes = collection.select("//rcp:recipe");
    XML result = [[<rcp:collection>
                     <{collection.select("rcp:description")}>
                     <[MORE]>
                   </rcp:collection>]];


A Recipe Flattener in XACT (2/2)

    XMLIterator i = recipes.iterator();
    while (i.hasNext()) {
      XML r = i.next();
      result = result.plug("MORE",
        [[<rcp:recipe>
            <{r.select("rcp:title|rcp:date")}>
            <{r.select("//rcp:ingredient[@amount]")}>
            <rcp:preparation>
              <{r.select("//rcp:step")}>
            </rcp:preparation>
            <{r.select("rcp:comment|rcp:nutrition|rcp:related")}>
          </rcp:recipe>
          <[MORE]>]]);
    }
    result.analyze("file:recipes.dtd", rcp);
    System.out.println(result);
  }
}

An Error

<rcp:ingredient>
  <{r.select("rcp:title|rcp:date")}>
  <{r.select("//rcp:ingredient[@amount]")}>
  <rcp:preparation>
    <{r.select("//rcp:step")}>
  </rcp:preparation>
  <{r.select("rcp:comment|rcp:nutrition|rcp:related")}>
</rcp:ingredient>

Caught at Compile-Time

*** Invalid XML at line 31
sub-element 'rcp:ingredient' of element 'rcp:collection' not declared
required attribute 'name' missing in element 'rcp:ingredient'
sub-element 'rcp:title' of element 'rcp:ingredient' not declared
sub-element 'rcp:related' of element 'rcp:ingredient' not declared
sub-element 'rcp:nutrition' of element 'rcp:ingredient' not declared
sub-element 'rcp:date' of element 'rcp:ingredient' not declared

Essential Online Resources

ƒ http://www.jdom.org/
ƒ http://java.sun.com/xml/jaxp/
ƒ http://java.sun.com/xml/jaxb/
ƒ http://www.saxproject.org/
