
XML was designed to transport and store data.

HTML was designed to display data.


XML stands for EXtensible Markup Language.


The Difference Between XML and HTML
XML is not a replacement for HTML.
XML and HTML were designed with different goals:
- XML was designed to transport and store data, with focus on what data is
- HTML was designed to display data, with focus on how data looks
HTML is about displaying information, while XML is about carrying information.
XML is used in many aspects of web development, often to simplify data storage and sharing.

XML Makes Your Data More Available
Different applications can access your data, not only in HTML pages, but also from XML
data sources.
With XML, your data can be available to all kinds of "reading machines" (handheld
computers, voice machines, news feeds, etc), and more available for blind people, or
people with other disabilities.
XML is Used to Create New Internet Languages
A lot of new Internet languages are created with XML.
Here are some examples:
- XHTML
- WSDL for describing available web services
- WAP and WML as markup languages for handheld devices
- RSS languages for news feeds
- RDF and OWL for describing resources and ontology
- SMIL for describing multimedia for the web
If Developers Have Sense
If they DO have sense, future applications will exchange their data in XML.
The future might give us word processors, spreadsheet applications and databases that can
read each other's data in XML format, without any conversion utilities in between.
An Example XML Document
XML documents use a self-describing and simple syntax:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The first line is the XML declaration. It defines the XML version (1.0) and the encoding used
(ISO-8859-1 Latin-1/West European character set).
The next line describes the root element of the document (like saying: "this document is a
note"):
<note>
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element:
</note>
You can assume, from this example, that the XML document contains a note to Tove from
Jani.
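As a rough illustration of how a program might read this note, the sketch below parses it with Python's standard xml.etree.ElementTree module. The variable names are illustrative and not part of the tutorial.

```python
import xml.etree.ElementTree as ET

# The note document from the tutorial, as bytes so that the parser
# applies the ISO-8859-1 encoding named in the XML declaration.
NOTE_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE_XML)

# The root element is <note>; iterating over it yields its child elements.
child_tags = [child.tag for child in root]
note = {child.tag: child.text for child in root}

print(root.tag)
print(child_tags)
print(note["to"], note["from"])
```

Because the element names describe their content, the program can pick out the recipient and sender by name without any extra documentation.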


XML Documents Form a Tree Structure
XML documents must contain a root element. This element is "the parent" of all other
elements.
The elements in an XML document form a document tree. The tree starts at the root and
branches to the lowest level of the tree.
All elements can have sub elements (child elements):
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
The terms parent, child, and sibling are used to describe the relationships between elements.
Parent elements have children. Children on the same level are called siblings (brothers or
sisters).
All elements can have text content and attributes (just like in HTML).

Example:

The image above represents one book in the XML below:
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
The root element in the example is <bookstore>. All <book> elements in the document are
contained within <bookstore>.
The <book> element has 4 children: <title>, <author>, <year>, <price>.
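The parent, child, and attribute relationships described above can be explored in code. This is a minimal sketch using Python's xml.etree.ElementTree on a shortened, two-book version of the bookstore; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# A shortened bookstore with two <book> siblings under one root.
BOOKSTORE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

bookstore = ET.fromstring(BOOKSTORE_XML)
books = bookstore.findall("book")            # children of the root (siblings)

categories = [book.get("category") for book in books]   # attribute values
first_book_children = [child.tag for child in books[0]]  # child element names
title_lang = books[0].find("title").get("lang")          # attribute on a child

print(bookstore.tag, categories)
print(first_book_children, title_lang)
```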

XML Tags are Case Sensitive
XML Attribute Values Must be Quoted
All XML Elements Must Have a Closing Tag
XML Elements Must be Properly Nested
Entity References
There are 5 predefined entity references in XML:
&lt;    <    less than
&gt;    >    greater than
&amp;   &    ampersand
&apos;  '    apostrophe
&quot;  "    quotation mark
Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than character
is legal, but it is a good habit to replace it.
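A small sketch of entity escaping, using the escape and unescape helpers from Python's xml.sax.saxutils. By default these handle only &, < and >, so the quote characters are passed in explicitly; the QUOTES mapping and the sample text are illustrative assumptions.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters
# are supplied as an extra mapping.
QUOTES = {"'": "&apos;", '"': "&quot;"}

raw = 'if salary < 1000 & name == "Tove"'
escaped = escape(raw, QUOTES)

# unescape() reverses the process when given the inverse mapping.
restored = unescape(escaped, {entity: char for char, entity in QUOTES.items()})

print(escaped)
print(restored == raw)
```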

Comments in XML
The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->

White-space is Preserved in XML
HTML truncates multiple white-space characters to one single white-space:
HTML:    Hello           Tove
Output:  Hello Tove
With XML, the white-space in a document is not truncated.

XML Stores New Line as LF
In Windows applications, a new line is normally stored as a pair of characters: carriage return
(CR) and line feed (LF). In Unix applications, a new line is normally stored as an LF
character. Macintosh applications also use an LF to store a new line.
XML stores a new line as LF.
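Both behaviours can be observed with a parser. The sketch below uses Python's xml.etree.ElementTree and two made-up one-element documents: one with a run of spaces, one with a Windows-style CR+LF line break.

```python
import xml.etree.ElementTree as ET

# Multiple spaces inside element content are kept exactly as written.
spaced = ET.fromstring("<to>Hello           Tove</to>")

# A CR+LF pair in the input is handed to the application as a single LF.
windows_style = ET.fromstring("<body>line one\r\nline two</body>")

print(repr(spaced.text))
print(repr(windows_style.text))
```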

XML Naming Rules
XML elements must follow these naming rules:
- Names can contain letters, numbers, and other characters
- Names cannot start with a number or punctuation character
- Names cannot start with the letters xml (or XML, or Xml, etc)
- Names cannot contain spaces
Any name can be used, no words are reserved.
Best Naming Practices
Make names descriptive. Names with an underscore separator are nice: <first_name>,
<last_name>.
Names should be short and simple, like this: <book_title> not like this:
<the_title_of_the_book>.
Avoid "-" characters. If you name something "first-name," some software may think you
want to subtract name from first.
Avoid "." characters. If you name something "first.name," some software may think that
"name" is a property of the object "first."
Avoid ":" characters. Colons are reserved to be used for something called namespaces (more
later).
XML documents often have a corresponding database. A good practice is to use the naming
rules of your database for the elements in the XML documents.
Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your
software vendor doesn't support them.
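The four naming rules listed above can be turned into a quick check. The function below is a hypothetical helper written for this illustration, not a full implementation of the XML name grammar; it follows the simplified rules as stated here and is therefore stricter than the XML specification, which also allows a leading underscore.

```python
def follows_naming_rules(name: str) -> bool:
    """Check a candidate element name against the simplified rules above."""
    # Must not be empty, and must not start with a number or punctuation.
    if not name or not name[0].isalpha():
        return False
    # Must not start with the letters xml in any letter case.
    if name.lower().startswith("xml"):
        return False
    # Must not contain spaces (or other white-space).
    return not any(ch.isspace() for ch in name)


for candidate in ["first_name", "book_title", "1st", "xmlData", "first name"]:
    print(candidate, follows_naming_rules(candidate))
```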

XML Attributes
In HTML, attributes provide additional information about elements:
<img src="computer.gif">
<a href="demo.asp">
Attributes often provide information that is not a part of the data. In the example below, the
file type is irrelevant to the data, but can be important to the software that wants to
manipulate the element:
<file type="gif">computer.gif</file>


XML Attributes Must be Quoted
Attribute values must always be quoted. Either single or double quotes can be used. For a
person's sex, the person element can be written like this:
<person sex="female">
or like this:
<person sex='female'>
If the attribute value itself contains double quotes you can use single quotes, like in this
example:
<gangster name='George "Shotgun" Ziegler'>
or you can use character entities:
<gangster name="George &quot;Shotgun&quot; Ziegler">
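The two ways of writing an attribute value that contains double quotes carry the same data. The sketch below checks this with Python's xml.etree.ElementTree, and uses quoteattr from xml.sax.saxutils to pick a safe quoting style when generating XML. The element is written as self-closing here only so that each snippet parses as a complete document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Single quotes around a value that contains double quotes...
single_quoted = ET.fromstring("<gangster name='George \"Shotgun\" Ziegler'/>")
# ...and double quotes with the inner quotes written as &quot; entities.
entity_quoted = ET.fromstring(
    '<gangster name="George &quot;Shotgun&quot; Ziegler"/>'
)

same_value = single_quoted.get("name") == entity_quoted.get("name")

# quoteattr() returns the value already wrapped in suitable quotes.
quoted = quoteattr('George "Shotgun" Ziegler')

print(same_value)
print(quoted)
```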


XML Elements vs. Attributes
Take a look at these examples:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last, sex is an element. Both examples provide
the same information.
There are no rules about when to use attributes or when to use elements. Attributes are handy
in HTML. In XML my advice is to avoid them. Use elements instead.
Avoid XML Attributes?
Some of the problems with using attributes are:
- attributes cannot contain multiple values (elements can)
- attributes cannot contain tree structures (elements can)
- attributes are not easily expandable (for future changes)
Attributes are difficult to read and maintain. Use elements for data. Use attributes for
information that is not relevant to the data.
XML Attributes for Metadata
Sometimes ID references are assigned to elements. These IDs can be used to identify XML
elements in much the same way as the id attribute in HTML. This example demonstrates this:
<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not</body>
  </note>
</messages>
The id attributes above are for identifying the different notes. They are not a part of the note itself.
What I'm trying to say here is that metadata (data about data) should be stored as attributes,
and the data itself should be stored as elements.
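One practical benefit of keeping an identifier in an attribute is that a program can look an element up by it. The sketch below does this with the limited XPath support in Python's xml.etree.ElementTree; the document is a trimmed version of the messages example and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES_XML)

# The id attribute is metadata: it identifies a note without being note content.
note_ids = [note.get("id") for note in messages.findall("note")]

# [@id='502'] selects the <note> child whose id attribute equals 502.
reply = messages.find("note[@id='502']")
reply_sender = reply.findtext("from")

print(note_ids, reply_sender)
```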
XML Validation

XML with correct syntax is "Well Formed" XML.
XML validated against a DTD is "Valid" XML.

Well Formed XML Documents
A "Well Formed" XML document has correct XML syntax.
The syntax rules were described in the previous chapters:
- XML documents must have a root element
- XML elements must have a closing tag
- XML tags are case sensitive
- XML elements must be properly nested
- XML attribute values must be quoted
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
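A parser will reject any document that breaks one of these rules, which gives a simple way to test for well-formedness. The helper below is a hypothetical function built on Python's xml.etree.ElementTree, and each sample string deliberately violates one rule from the list.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "correct": "<note><to>Tove</to></note>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Message>hello</message>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note date=12/11/2007></note>",
    "two root elements": "<to>Tove</to><from>Jani</from>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```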


Valid XML Documents
A "Valid" XML document is a "Well Formed" XML document, which also conforms to the
rules of a Document Type Definition (DTD):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The DOCTYPE declaration in the example above is a reference to an external DTD file. The
content of the file is shown in the paragraph below.

XML DTD
The purpose of a DTD is to define the structure of an XML document. It defines the structure
with a list of legal elements:
<!DOCTYPE note
[
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
If you want to study DTD, you will find our DTD tutorial on our homepage.
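To give a feel for what the content model (to,from,heading,body) demands, the sketch below checks it by hand in Python. This is not DTD validation: the parsers in Python's standard library are non-validating, so the function and the EXPECTED_CHILDREN list are stand-ins written for this illustration, and a real project would use a validating parser instead.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for: <!ELEMENT note (to,from,heading,body)>
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_model(text: str) -> bool:
    """Check that a document is a <note> with the four children in order."""
    root = ET.fromstring(text)
    if root.tag != "note":
        return False
    children = list(root)
    # Each child must appear exactly once, in the declared order...
    in_order = [child.tag for child in children] == EXPECTED_CHILDREN
    # ...and hold only text, mimicking (#PCDATA): no nested elements.
    text_only = all(len(child) == 0 for child in children)
    return in_order and text_only


good = ("<note><to>Tove</to><from>Jani</from>"
        "<heading>Reminder</heading><body>Hi</body></note>")
swapped = ("<note><from>Jani</from><to>Tove</to>"
           "<heading>Reminder</heading><body>Hi</body></note>")

print(matches_note_model(good), matches_note_model(swapped))
```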

XML Schema
W3C supports an XML-based alternative to DTD, called XML Schema:
<xs:element name="note">

<xs:complexType>
  <xs:sequence>
    <xs:element name="to" type="xs:string"/>
    <xs:element name="from" type="xs:string"/>
    <xs:element name="heading" type="xs:string"/>
    <xs:element name="body" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

</xs:element>
If you want to study XML Schema, you will find our Schema tutorial on our homepage.

XML Errors Will Stop You
Errors in XML documents will stop your XML applications.
The W3C XML specification states that a program should stop processing an XML document
if it finds an error. The reason is that XML software should be small, fast, and compatible.
HTML browsers will display documents with errors (like missing end tags). HTML browsers
are big and incompatible because they have a lot of unnecessary code to deal with (and
display) HTML errors.
With XML, errors are not allowed.

Displaying XML with CSS
Below is a fraction of the XML file. The second line links the XML file to the CSS file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
.
.
.
</CATALOG>
Formatting XML with CSS is not the most common method.
W3C recommends using XSLT instead. See the next chapter.

Displaying XML with XSLT
With XSLT you can transform an XML document into HTML.

Displaying XML with XSLT
XSLT is the recommended style sheet language of XML.
XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS.
XSLT can be used to transform XML into HTML, before it is displayed by a browser.

Transforming XML with XSLT on the Server
In the example above, the XSLT transformation is done by the browser, when the browser
reads the XML file.
Different browsers may produce different results when transforming XML with XSLT. To
reduce this problem the XSLT transformation can be done on the server.
XSL consists of three parts:
- XSLT - a language for transforming XML documents
- XPath - a language for navigating in XML documents
- XSL-FO - a language for formatting XML documents
XSLT Uses XPath
XSLT uses XPath to find information in an XML document. XPath is used to navigate
through elements and attributes in XML documents.
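XPath-style navigation can be tried outside XSLT as well. The sketch below uses Python's xml.etree.ElementTree, which supports only a limited subset of XPath, on a two-CD catalog reduced to titles and artists; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
  </cd>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# "cd/title" steps from the <catalog> root into each <cd>, then its <title>.
titles = [title.text for title in catalog.findall("cd/title")]

# ".//artist" finds <artist> elements at any depth below the root.
artists = [artist.text for artist in catalog.findall(".//artist")]

print(titles)
print(artists)
```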
Correct Style Sheet Declaration
The root element that declares the document to be an XSL style sheet is <xsl:stylesheet> or
<xsl:transform>.
Note: <xsl:stylesheet> and <xsl:transform> are completely synonymous and either can be
used!
Link the XSL Style Sheet to the XML Document
Add the XSL style sheet reference to your XML document ("cdcatalog.xml"):
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>

</catalog>
If you have an XSLT compliant browser it will nicely transform your XML into XHTML.


<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>


XSLT <xsl:template> Element

An XSL style sheet consists of one or more sets of rules that are called templates.
A template contains rules to apply when a specified node is matched.

The <xsl:template> Element
The <xsl:template> element is used to build templates.
The match attribute is used to associate a template with an XML element. The match
attribute can also be used to define a template for the entire XML document. The value of the
match attribute is an XPath expression (i.e. match="/" defines the whole document).
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <tr>
        <td>.</td>
        <td>.</td>
      </tr>
    </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>
Since an XSL style sheet is an XML document, it always begins with the XML declaration:
<?xml version="1.0" encoding="ISO-8859-1"?>.
The next element, <xsl:stylesheet>, defines that this document is an XSLT style sheet document
(along with the version number and XSLT namespace attributes).
The <xsl:template> element defines a template. The match="/" attribute associates the template
with the root of the XML source document.

XSLT <xsl:value-of> Element

The <xsl:value-of> element is used to extract the value of a selected node.

The <xsl:value-of> Element
The <xsl:value-of> element can be used to extract the value of an XML element and add it to
the output stream of the transformation:
Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <tr>
        <td><xsl:value-of select="catalog/cd/title"/></td>
        <td><xsl:value-of select="catalog/cd/artist"/></td>
      </tr>
    </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>

Example Explained
Note: The select attribute in the example above contains an XPath expression. An XPath
expression works like navigating a file system; a forward slash (/) selects subdirectories.
The result from the example above was a little disappointing; only one line of data was
copied from the XML document to the output. In the next chapter you will learn how to use
the <xsl:for-each> element to loop through the XML elements, and display all of the records.
The <xsl:for-each> Element
The XSL <xsl:for-each> element can be used to select every XML element of a specified
node-set.
Filtering the Output
We can also filter the output from the XML file by adding a criterion to the select attribute in
the <xsl:for-each> element.
<xsl:for-each select="catalog/cd[artist='Bob Dylan']">
Legal filter operators are:
- = (equal)
- != (not equal)
- &lt; less than
- &gt; greater than
Take a look at the adjusted XSL style sheet:
Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd[artist='Bob Dylan']">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>
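The same loop-and-filter idea can be imitated outside XSLT. The sketch below is a Python analogy, not an XSLT processor: it uses the predicate support in xml.etree.ElementTree to select only the CDs by Bob Dylan and collects one (title, artist) pair per match, much as the style sheet emits one table row per match. The two-CD catalog is a shortened sample.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
  </cd>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# Counterpart of select="catalog/cd[artist='Bob Dylan']", relative to the
# <catalog> root: keep only <cd> elements whose <artist> text matches.
dylan_cds = catalog.findall("cd[artist='Bob Dylan']")

# Counterpart of the two <xsl:value-of> elements inside the loop body.
rows = [(cd.findtext("title"), cd.findtext("artist")) for cd in dylan_cds]

print(rows)
```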

The <xsl:sort> element is used to sort the output.

Where to put the Sort Information
To sort the output, simply add an <xsl:sort> element inside the <xsl:for-each> element in the
XSL file:
Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <xsl:sort select="artist"/>
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>

Note: The select attribute indicates what XML element to sort on.
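As a rough Python analogy for sorting on a child element, the sketch below orders the CD elements by their artist text before reading them out. It only illustrates the idea of a sort key: Python compares strings by code point, which may not match the ordering an XSLT processor applies. The catalog is a shortened sample, listed out of order on purpose.

```python
import xml.etree.ElementTree as ET

# Sample catalog with the artists deliberately out of alphabetical order.
CATALOG_XML = """<catalog>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
  </cd>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
  </cd>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# The key function plays the role of select="artist" on the sort element.
sorted_cds = sorted(catalog.findall("cd"), key=lambda cd: cd.findtext("artist"))
sorted_rows = [(cd.findtext("title"), cd.findtext("artist")) for cd in sorted_cds]

print(sorted_rows)
```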
XSLT <xsl:if> Element

The <xsl:if> element is used to put a conditional test against the content of the XML file.

The <xsl:if> Element
To put a conditional if test against the content of the XML file, add an <xsl:if> element to the
XSL document.
Syntax
<xsl:if test="expression">
  ...some output if the expression is true...
</xsl:if>
XSLT <xsl:choose> Element

The <xsl:choose> element is used in conjunction with <xsl:when> and <xsl:otherwise> to
express multiple conditional tests.

The <xsl:choose> Element
Syntax
<xsl:choose>
  <xsl:when test="expression">
    ... some output ...
  </xsl:when>
  <xsl:otherwise>
    ... some output ...
  </xsl:otherwise>
</xsl:choose>
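The when/otherwise structure behaves like an if/else. The sketch below is a Python analogy rather than XSLT: it labels each CD depending on whether its price is above 10. The test itself, the label texts, and the price_label function are assumptions chosen for this illustration, and the catalog is a shortened sample.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <price>10.90</price>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <price>9.90</price>
  </cd>
</catalog>"""


def price_label(cd: ET.Element) -> str:
    """Pick one of two outputs, like a when branch and an otherwise branch."""
    price = float(cd.findtext("price"))
    if price > 10:            # the branch taken when the test is true
        return "above 10"
    return "10 or below"      # the fallback, like the otherwise branch


catalog = ET.fromstring(CATALOG_XML)
labels = {cd.findtext("title"): price_label(cd) for cd in catalog.findall("cd")}

print(labels)
```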
