Database System Concepts - 5th Edition, Aug 22, 2005. 10.2 ©Silberschatz, Korth and Sudarshan
Introduction
XML Introduction (Cont.)
The ability to specify new tags, and to create nested tag structures, makes
XML a great way to exchange data, not just documents.
• Much of the use of XML has been in data exchange applications, not as a
replacement for HTML
Tags make data (relatively) self-documenting
• E.g.
<bank>
<account>
<account_number> A-101 </account_number>
<branch_name> Downtown </branch_name>
<balance> 500 </balance>
</account>
<depositor>
<account_number> A-101 </account_number>
<customer_name> Johnson </customer_name>
</depositor>
</bank>
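As a rough illustration of the self-documenting point, the fragment above can be read with Python's standard-library `xml.etree.ElementTree`. This is only a sketch: the helper name `element_to_record` is an assumption made for this example and does not come from the slides.

```python
import xml.etree.ElementTree as ET

BANK_XML = """
<bank>
  <account>
    <account_number> A-101 </account_number>
    <branch_name> Downtown </branch_name>
    <balance> 500 </balance>
  </account>
  <depositor>
    <account_number> A-101 </account_number>
    <customer_name> Johnson </customer_name>
  </depositor>
</bank>
"""

def element_to_record(element):
    """Turn an element's children into a dict keyed by tag name (hypothetical helper)."""
    return {child.tag: child.text.strip() for child in element}

root = ET.fromstring(BANK_XML)
accounts = [element_to_record(a) for a in root.findall("account")]
depositors = [element_to_record(d) for d in root.findall("depositor")]
print(accounts)
print(depositors)
```

The tag names alone are enough to label every value, which is what makes the format usable for data exchange without a separate field description.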
XML: Motivation
XML Motivation (Cont.)
Earlier generation formats were based on plain text with line headers
indicating the meaning of fields
• Similar in concept to email headers
• Do not allow for nested structures, no standard “type” language
• Tied too closely to low-level document structure (lines, spaces, etc.)
Each XML-based standard defines what are valid elements, using
• XML type specification languages to specify the syntax
– DTD (Document Type Definition)
– XML Schema
• Plus textual descriptions of the semantics
XML allows new tags to be defined as required
• However, this may be constrained by DTDs
A wide variety of tools is available for parsing, browsing and querying XML
documents/data
Comparison with Relational Data
Structure of XML Data
Example of Nested Elements
<bank-1>
<customer>
<customer_name> Hayes </customer_name>
<customer_street> Main </customer_street>
<customer_city> Harrison </customer_city>
<account>
<account_number> A-102 </account_number>
<branch_name> Perryridge </branch_name>
<balance> 400 </balance>
</account>
<account>
…
</account>
</customer>
.
.
</bank-1>
Motivation for Nesting
Structure of XML Data (Cont.)
Mixture of text with sub-elements is legal in XML.
• Example:
<account>
This account is seldom used any more.
<account_number> A-102</account_number>
<branch_name> Perryridge</branch_name>
<balance>400 </balance>
</account>
• Useful for document markup, but discouraged for data
representation
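One way to see why mixed content is awkward for data is to parse the example with Python's standard-library `xml.etree.ElementTree`. In this sketch (variable names are assumptions for illustration), the free text ends up in the parent's `text` attribute rather than under any named field.

```python
import xml.etree.ElementTree as ET

ACCOUNT_XML = """<account>
This account is seldom used any more.
<account_number> A-102</account_number>
<branch_name> Perryridge</branch_name>
<balance>400 </balance>
</account>"""

account = ET.fromstring(ACCOUNT_XML)

# Text that precedes the first subelement is stored on the parent itself.
free_text = (account.text or "").strip()

# The named subelements are the part that maps cleanly onto a record.
fields = {child.tag: (child.text or "").strip() for child in account}

print(free_text)
print(fields)
```

An application that expects records has to decide separately what to do with `free_text`, which is a reasonable argument for avoiding the mixture in data-oriented documents.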
Attributes
Attributes vs. Subelements
Namespaces
More on XML Syntax
XML Document Schema
Document Type Definition (DTD)
Element Specification in DTD
Subelements can be specified as
• names of elements, or
• #PCDATA (parsed character data), i.e., character strings
• EMPTY (no subelements) or ANY (anything can be a subelement)
Example
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT account_number (#PCDATA)>
Subelement specification may have regular expressions
<!ELEMENT bank ( ( account | customer | depositor)+)>
Notation:
– “|” - alternatives
– “,” - sequence (subelements appear in the listed order)
– “+” - 1 or more occurrences
– “*” - 0 or more occurrences
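Because a content model is a regular expression over child element names, it can be checked with an ordinary regex engine. The following is a minimal sketch under stated assumptions: it handles only element names with the operators `|`, `,`, `+`, `*` and `?` (not `#PCDATA`, `EMPTY` or `ANY`), and the function names `content_model_to_regex` and `children_match` are invented for this illustration. It is not how a real validating parser is implemented.

```python
import re

NAME = re.compile(r"[A-Za-z_][\w.-]*")

def content_model_to_regex(model):
    """Compile a DTD content model (element names only) into a Python regex."""
    # Each element name becomes a token ending in ';' so names cannot run together.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", model)
    # ',' means sequence, which is plain concatenation in a regex; whitespace is noise.
    pattern = re.sub(r"[,\s]+", "", pattern)
    return re.compile(pattern)

def children_match(model, child_tags):
    """Return True if the ordered child tag names satisfy the content model."""
    encoded = "".join(tag + ";" for tag in child_tags)
    return content_model_to_regex(model).fullmatch(encoded) is not None

bank_model = "((account | customer | depositor)+)"
depositor_model = "(customer_name, account_number)"

print(children_match(bank_model, ["account", "customer", "account"]))
print(children_match(depositor_model, ["account_number", "customer_name"]))
```

The second call swaps the order of the two subelements, so it is rejected: a sequence in a DTD fixes the order of children, unlike attributes of a relational tuple.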
Bank DTD
<!DOCTYPE bank [
<!ELEMENT bank ( ( account | customer | depositor)+)>
<!ELEMENT account (account_number, branch_name, balance)>
<!ELEMENT customer (customer_name, customer_street, customer_city)>
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT account_number (#PCDATA)>
<!ELEMENT branch_name (#PCDATA)>
<!ELEMENT balance (#PCDATA)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT customer_street (#PCDATA)>
<!ELEMENT customer_city (#PCDATA)>
]>
Attribute Specification in DTD
Bank DTD with Attributes
XML data with ID and IDREF attributes
<bank-2>
<account account_number="A-401" owners="C100 C102">
<branch_name> Downtown </branch_name>
<balance> 500 </balance>
</account>
<customer customer_id="C100" accounts="A-401">
<customer_name>Joe </customer_name>
<customer_street> Monroe </customer_street>
<customer_city> Madison</customer_city>
</customer>
<customer customer_id="C102" accounts="A-401 A-402">
<customer_name> Mary </customer_name>
<customer_street> Erin </customer_street>
<customer_city> Newark </customer_city>
</customer>
</bank-2>
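The IDREFS attribute `owners` is just a blank-separated string of identifiers, so following the references is a lookup step the application (or a query function) has to perform. A minimal Python sketch, using only the standard library and assuming the hypothetical helper name `owners_of`:

```python
import xml.etree.ElementTree as ET

BANK2_XML = """
<bank-2>
  <account account_number="A-401" owners="C100 C102">
    <branch_name> Downtown </branch_name>
    <balance> 500 </balance>
  </account>
  <customer customer_id="C100" accounts="A-401">
    <customer_name>Joe </customer_name>
    <customer_street> Monroe </customer_street>
    <customer_city> Madison</customer_city>
  </customer>
  <customer customer_id="C102" accounts="A-401 A-402">
    <customer_name> Mary </customer_name>
    <customer_street> Erin </customer_street>
    <customer_city> Newark </customer_city>
  </customer>
</bank-2>
"""

root = ET.fromstring(BANK2_XML)

# Index customers by their ID attribute so references can be resolved.
customers_by_id = {c.get("customer_id"): c for c in root.findall("customer")}

def owners_of(account):
    """Dereference the blank-separated IDREFS in the owners attribute."""
    return [customers_by_id[ref] for ref in account.get("owners").split()]

account = root.find("account")
owner_names = [o.findtext("customer_name").strip() for o in owners_of(account)]
print(owner_names)
```

Without the explicit lookup table, `owners` is only a string; nothing in the parsed tree links an account to its customers.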
Limitations of DTDs
XML Schema
XML Schema Version of Bank DTD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bank" type="BankType"/>
<xs:element name="account">
<xs:complexType>
<xs:sequence>
<xs:element name="account_number" type="xs:string"/>
<xs:element name="branch_name" type="xs:string"/>
<xs:element name="balance" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
….. definitions of customer and depositor ….
<xs:complexType name="BankType">
<xs:sequence>
<xs:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
More features of XML Schema
Querying and Transforming XML Data
Tree Model of XML Data
XPath
XPath (Cont.)
The initial “/” denotes root of the document (above the top-level tag)
Path expressions are evaluated left to right
z Each step operates on the set of instances produced by the previous
step
Selection predicates may follow any step in a path, in [ ]
• E.g. /bank-2/account[balance > 400]
returns account elements with a balance value greater than 400
/bank-2/account[balance] returns account elements containing a
balance subelement
Attributes are accessed using “@”
• E.g. /bank-2/account[balance > 400]/@account_number
returns the account numbers of accounts with balance > 400
• IDREF attributes are not dereferenced automatically (more on this
later)
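The step-by-step evaluation can be imitated in Python. The sketch below uses the standard-library `xml.etree.ElementTree`, which implements only a subset of XPath: the child-existence predicate `[balance]` is supported, but the comparison `> 400` is not, so that selection is done in Python. The accounts A-402 and A-403 are hypothetical sample data added for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<bank-2>
  <account account_number="A-401"><balance> 500 </balance></account>
  <account account_number="A-402"><balance> 300 </balance></account>
  <account account_number="A-403"/>
</bank-2>
"""

root = ET.fromstring(SAMPLE)  # root is the bank-2 element

# /bank-2/account[balance]: keep accounts that contain a balance subelement.
with_balance = root.findall("account[balance]")

# /bank-2/account[balance > 400]/@account_number: the numeric comparison is
# applied in Python because ElementTree's XPath subset has no '>' operator.
high_balance_numbers = [
    a.get("account_number")
    for a in with_balance
    if float(a.findtext("balance")) > 400
]

print([a.get("account_number") for a in with_balance])
print(high_balance_numbers)
```

Each list plays the role of the node set handed from one path step to the next.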
Functions in XPath
XPath provides several functions
• The function count() at the end of a path counts the number of
elements in the set generated by the path
E.g. /bank-2/account[count(./customer) > 2]
– Returns accounts with > 2 customers
• There is also a function for testing the position (1, 2, ..) of a node
w.r.t. its siblings
Boolean connectives “and” and “or”, and the function not(), can be used in
predicates
IDREFs can be referenced using function id()
• id() can also be applied to sets of references such as IDREFS and
even to strings containing multiple references separated by blanks
• E.g. /bank-2/account/id(@owners)
returns all customers referred to from the owners attribute of
account elements.
More XPath Features
Operator “|” used to implement union
• E.g. /bank-2/account/id(@owners) | /bank-2/loan/id(@borrower)
Gives customers with either accounts or loans
However, “|” cannot be nested inside other operators.
“//” can be used to skip multiple levels of nodes
• E.g. /bank-2//customer_name
finds any customer_name element anywhere under the
/bank-2 element, regardless of the element in which it is
contained.
A step in the path can go to parents, siblings, ancestors and
descendants of the nodes generated by the previous step, not just
to the children
• “//”, described above, is a short form for specifying “all
descendants”
• “..” specifies the parent.
doc(name) returns the root of a named document
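A small Python sketch of the descendant step and of a union, using the standard-library `xml.etree.ElementTree` (which supports `.//name` but has no `|` operator, so the union is approximated by list concatenation). The sample document is hypothetical and exists only to show a `customer_name` at two different places in the tree.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<bank-2>
  <customer customer_id="C100"><customer_name>Joe</customer_name></customer>
  <depositor><customer_name>Mary</customer_name></depositor>
</bank-2>
"""

root = ET.fromstring(SAMPLE)  # root is the bank-2 element

# /bank-2//customer_name: every customer_name below bank-2, at any depth.
names = [e.text for e in root.findall(".//customer_name")]

# Rough stand-in for "customer | depositor": concatenate two result lists.
# Unlike a real XPath union, this does not remove duplicates or restore
# document order.
either = root.findall("customer") + root.findall("depositor")

print(names)
print([e.tag for e in either])
```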
XQuery
XQuery is a general purpose query language for XML data
Currently being standardized by the World Wide Web Consortium
(W3C)
• The textbook description is based on a January 2005 draft of the
standard. The final version may differ, but major features are likely to
stay unchanged.
XQuery is derived from the Quilt query language, which itself borrows
from SQL, XQL and XML-QL
XQuery uses a
for … let … where … order by … return …
syntax
for ⇔ SQL from
where ⇔ SQL where
order by ⇔ SQL order by
return ⇔ SQL select
let allows temporary variables, and has no equivalent in SQL
FLWOR Syntax in XQuery
The for clause uses XPath expressions; the variable in the for clause ranges
over the values in the set returned by the XPath expression
Simple FLWOR expression in XQuery
• find all accounts with balance > 400, with each result enclosed in an
<account_number> .. </account_number> tag
for $x in /bank-2/account
let $acctno := $x/@account_number
where $x/balance > 400
return <account_number> { $acctno } </account_number>
• Items in the return clause are XML text unless enclosed in {}, in which
case they are evaluated
The let clause is not really needed in this query, and the selection can be
done in XPath. The query can be written as:
for $x in /bank-2/account[balance>400]
return <account_number> { $x/@account_number }
</account_number>
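For readers more familiar with imperative code, the clauses of the first query map onto a plain loop. The following Python sketch uses the standard-library `xml.etree.ElementTree`; the wrapper element `results` and the account A-402 are assumptions introduced so that the output is a single well-formed document and the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<bank-2>
  <account account_number="A-401"><balance> 500 </balance></account>
  <account account_number="A-402"><balance> 300 </balance></account>
</bank-2>
"""

root = ET.fromstring(SAMPLE)
results = ET.Element("results")  # hypothetical wrapper for the output

for x in root.findall("account"):                 # for $x in /bank-2/account
    acctno = x.get("account_number")              # let $acctno := $x/@account_number
    if float(x.findtext("balance")) > 400:        # where $x/balance > 400
        out = ET.SubElement(results, "account_number")
        out.text = acctno                         # return <account_number>...</account_number>

print(ET.tostring(results, encoding="unicode"))
```

The loop makes the roles explicit: `for` iterates, `let` binds, `where` filters, and `return` builds one output item per surviving binding.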
Joins
Joins are specified in a manner very similar to SQL
for $a in /bank/account,
$c in /bank/customer,
$d in /bank/depositor
where $a/account_number = $d/account_number
and $c/customer_name = $d/customer_name
return <cust_acct> { $c, $a } </cust_acct>
The same query can be expressed with the selections specified as
XPath selections:
for $a in /bank/account,
$c in /bank/customer,
$d in /bank/depositor[
account_number = $a/account_number and
customer_name = $c/customer_name]
return <cust_acct> { $c, $a } </cust_acct>
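The join can be read as three nested loops with a filter. A minimal Python sketch of that reading follows, using the standard-library `xml.etree.ElementTree`; the second account and the customer Hayes without a depositor entry are hypothetical data added so the join condition actually excludes something.

```python
import xml.etree.ElementTree as ET

BANK_XML = """
<bank>
  <account><account_number>A-101</account_number><balance>500</balance></account>
  <account><account_number>A-102</account_number><balance>400</balance></account>
  <customer><customer_name>Johnson</customer_name></customer>
  <customer><customer_name>Hayes</customer_name></customer>
  <depositor><account_number>A-101</account_number><customer_name>Johnson</customer_name></depositor>
</bank>
"""

bank = ET.fromstring(BANK_XML)

# One tuple per (account, customer, depositor) combination that passes the
# two equality conditions of the where clause.
cust_accts = [
    (c.findtext("customer_name"), a.findtext("account_number"))
    for a in bank.findall("account")
    for c in bank.findall("customer")
    for d in bank.findall("depositor")
    if a.findtext("account_number") == d.findtext("account_number")
    and c.findtext("customer_name") == d.findtext("customer_name")
]

print(cust_accts)
```

This is the naive nested-loop evaluation; a query processor is free to choose a better join strategy.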
Nested Queries
The following query converts data from the flat structure for bank
information into the nested structure used in bank-1
<bank-1> {
for $c in /bank/customer
return
<customer>
{ $c/* }
{ for $d in /bank/depositor[customer_name = $c/customer_name],
$a in /bank/account[account_number=$d/account_number]
return $a }
</customer>
} </bank-1>
$c/* denotes all the children of the node to which $c is bound, without the
enclosing top-level tag
$c/text() gives text content of an element without any subelements / tags
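The same restructuring can be sketched in Python to make the nesting of the two loops visible. This uses the standard-library `xml.etree.ElementTree`; the function name `nest` and the sample data are assumptions for illustration only.

```python
import copy
import xml.etree.ElementTree as ET

FLAT = """<bank>
<account><account_number>A-101</account_number><branch_name>Downtown</branch_name><balance>500</balance></account>
<account><account_number>A-102</account_number><branch_name>Perryridge</branch_name><balance>400</balance></account>
<customer><customer_name>Johnson</customer_name></customer>
<customer><customer_name>Hayes</customer_name></customer>
<depositor><account_number>A-101</account_number><customer_name>Johnson</customer_name></depositor>
<depositor><account_number>A-102</account_number><customer_name>Hayes</customer_name></depositor>
</bank>"""

def nest(bank):
    """Build a bank-1 style tree: each customer contains its accounts."""
    nested = ET.Element("bank-1")
    for c in bank.findall("customer"):                  # outer for $c
        customer = ET.SubElement(nested, "customer")
        customer.extend([copy.deepcopy(child) for child in c])   # { $c/* }
        name = c.findtext("customer_name")
        for d in bank.findall("depositor"):             # inner for $d, $a
            if d.findtext("customer_name") != name:
                continue
            for a in bank.findall("account"):
                if a.findtext("account_number") == d.findtext("account_number"):
                    customer.append(copy.deepcopy(a))   # return $a
    return nested

nested = nest(ET.fromstring(FLAT))
summary = [
    (c.findtext("customer_name"), [a.findtext("account_number") for a in c.findall("account")])
    for c in nested
]
print(summary)
```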
Sorting in XQuery
The order by clause can be used at the end of any expression. E.g. to return customers
sorted by name
for $c in /bank/customer
order by $c/customer_name
return <customer> { $c/* } </customer>
Use order by $c/customer_name descending to sort in descending order
Can sort at multiple levels of nesting (sort by customer_name, and by account_number
within each customer)
<bank-1> {
for $c in /bank/customer
order by $c/customer_name
return
<customer>
{ $c/* }
{ for $d in /bank/depositor[customer_name=$c/customer_name],
$a in /bank/account[account_number=$d/account_number]
order by $a/account_number
return <account> { $a/* } </account> }
</customer>
} </bank-1>
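In Python terms, order by corresponds to sorting the bindings on a key before the return step runs. A minimal sketch with the standard-library `xml.etree.ElementTree`; the customer names are taken from examples elsewhere in these slides, and the helper `by_name` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

NAMES = ["Johnson", "Hayes", "Mary", "Joe"]
bank = ET.fromstring(
    "<bank>"
    + "".join(f"<customer><customer_name>{n}</customer_name></customer>" for n in NAMES)
    + "</bank>"
)

def by_name(customer):
    """Sort key corresponding to: order by $c/customer_name."""
    return customer.findtext("customer_name")

customers = bank.findall("customer")
ascending = [by_name(c) for c in sorted(customers, key=by_name)]
descending = [by_name(c) for c in sorted(customers, key=by_name, reverse=True)]

print(ascending)
print(descending)
```

Sorting at a second level of nesting would apply the same idea to the list of accounts collected for each customer.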
Functions and Other XQuery Features
XSLT
XSLT Templates
Example of XSLT template with match and select part
<xsl:template match="/bank-2/customer">
<xsl:value-of select="customer_name"/>
</xsl:template>
<xsl:template match="*"/>
The match attribute of xsl:template specifies a pattern in XPath
Elements in the XML document matching the pattern are processed by the
actions within the xsl:template element
• xsl:value-of selects (outputs) specified values (here, customer_name)
For elements that do not match any template
• Attributes and text contents are output as is
• Templates are recursively applied on subelements
The <xsl:template match="*"/> template matches all
elements that do not match any other template
• Used to ensure that their contents do not get output.
If an element matches several templates, only one is used based on a
complex priority scheme/user-defined priorities
Creating XML Output
Any text or tag in the XSL stylesheet that is not in the xsl namespace
is output as is
E.g. to wrap results in new XML elements.
<xsl:template match="/bank-2/customer">
<customer>
<xsl:value-of select="customer_name"/>
</customer>
</xsl:template>
<xsl:template match="*"/>
• Example output:
<customer> Joe </customer>
<customer> Mary </customer>
Creating XML Output (Cont.)
Note: Cannot directly insert an xsl:value-of tag inside another tag
• E.g. cannot create an attribute for <customer> in the previous example
by directly using xsl:value-of
• XSLT provides a construct xsl:attribute to handle this situation
– xsl:attribute adds an attribute to the enclosing element
– E.g. <customer>
<xsl:attribute name="customer_id">
<xsl:value-of select="customer_id"/>
</xsl:attribute>
</customer>
results in output of the form
<customer customer_id="…."> ….
xsl:element is used to create output elements with computed names
Structural Recursion
Template action can apply templates recursively to the contents of a
matched element
<xsl:template match="/bank">
<customers>
<xsl:apply-templates/>
</customers>
</xsl:template>
<xsl:template match="customer">
<customer>
<xsl:value-of select="customer_name"/>
</customer>
</xsl:template>
<xsl:template match="*"/>
Example output:
<customers>
<customer> John </customer>
<customer> Mary </customer>
</customers>
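Structural recursion is not specific to XSLT. The sketch below mimics the three templates in plain Python, dispatching on the element name; it is an analogy under stated assumptions, not an XSLT processor. The names `TEMPLATES`, `apply_templates`, `ignore` and `holder`, and the extra `account` element in the input, are all hypothetical.

```python
import xml.etree.ElementTree as ET

def apply_templates(node, out):
    """Process each child with the handler registered for its tag."""
    for child in node:
        TEMPLATES.get(child.tag, ignore)(child, out)

def bank_template(node, out):
    # Plays the role of the template matching /bank: emit <customers>, then recurse.
    customers = ET.SubElement(out, "customers")
    apply_templates(node, customers)

def customer_template(node, out):
    # Plays the role of the customer template: emit <customer> with the name as text.
    ET.SubElement(out, "customer").text = node.findtext("customer_name")

def ignore(node, out):
    # Plays the role of the empty catch-all template: produce no output.
    pass

TEMPLATES = {"bank": bank_template, "customer": customer_template}

doc = ET.fromstring(
    "<bank>"
    "<customer><customer_name>John</customer_name></customer>"
    "<account><balance>500</balance></account>"
    "<customer><customer_name>Mary</customer_name></customer>"
    "</bank>"
)
holder = ET.Element("holder")  # scratch parent that collects the output
TEMPLATES[doc.tag](doc, holder)
result = ET.tostring(holder[0], encoding="unicode")
print(result)
```

The `account` element falls through to `ignore`, which is why its balance does not appear in the result.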
Joins in XSLT
Sorting in XSLT
Using an xsl:sort directive inside a template causes all elements
matching the template to be sorted
• Sorting is done before applying other templates
<xsl:template match="/bank">
<xsl:apply-templates select="customer">
<xsl:sort select="customer_name"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="customer">
<customer>
<xsl:value-of select="customer_name"/>
<xsl:value-of select="customer_street"/>
<xsl:value-of select="customer_city"/>
</customer>
</xsl:template>
<xsl:template match="*"/>
Application Program Interface
Storage of XML Data
Storage of XML in Relational Databases
Alternatives:
• String Representation
• Tree Representation
• Map to relations
String Representation
Store each top level element as a string field of a tuple in a relational
database
• Use a single relation to store all elements, or
• Use a separate relation for each top-level element type
E.g. account, customer, depositor relations
– Each with a string-valued attribute to store the element
Indexing:
• Store values of subelements/attributes to be indexed as extra fields
of the relation, and build indices on these fields
E.g. customer_name or account_number
• Some database systems support function indices, which use the
result of a function as the key value.
The function should return the value of the required
subelement/attribute
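The first indexing option can be sketched with Python's standard-library `sqlite3` and `xml.etree.ElementTree`. The table name `account_elements`, the index name and the two helper functions are assumptions made for this example; the slides do not prescribe a schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# One relation per top-level element type: the element is kept as a string,
# and the subelement to be indexed is duplicated into its own column.
conn.execute("CREATE TABLE account_elements (account_number TEXT, element TEXT)")
conn.execute("CREATE INDEX account_number_idx ON account_elements (account_number)")

def store_account(conn, xml_text):
    """Parse once at insert time to extract the value of the indexed field."""
    element = ET.fromstring(xml_text)
    key = element.findtext("account_number").strip()
    conn.execute("INSERT INTO account_elements VALUES (?, ?)", (key, xml_text))

def load_account(conn, account_number):
    """Find the string through the index, then parse it to reach inner values."""
    row = conn.execute(
        "SELECT element FROM account_elements WHERE account_number = ?",
        (account_number,),
    ).fetchone()
    return ET.fromstring(row[0]) if row else None

store_account(
    conn,
    "<account><account_number> A-101 </account_number>"
    "<branch_name> Downtown </branch_name><balance> 500 </balance></account>",
)
found = load_account(conn, "A-101")
print(found.findtext("balance").strip())
```

Locating the row is an index lookup, but reading `balance` still requires parsing the stored string, which is the drawback this representation is known for.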
String Representation (Cont.)
Benefits:
• Can store any XML data even without DTD
• As long as there are many top-level elements in a document,
strings are small compared to full document
Allows fast access to individual elements.
Drawback: Need to parse strings to access values inside the elements
• Parsing is slow.
Tree Representation
Tree representation: model XML data as tree and store using relations
nodes(id, type, label, value)
child (child_id, parent_id)
[Figure: example XML tree with nodes bank (id: 1), customer_name (id: 3), account_number (id: 7)]
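A minimal sketch of loading a document into the two relations, using Python's standard-library `sqlite3` and `xml.etree.ElementTree`. The column types, the use of "element" and "attribute" as values of `type`, the depth-first id assignment and the function name `store_tree` are assumptions; the slides give only the relation schemas.

```python
import itertools
import sqlite3
import xml.etree.ElementTree as ET

def store_tree(conn, root):
    """Store an XML tree in nodes(id, type, label, value) and child(child_id, parent_id)."""
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, type TEXT, label TEXT, value TEXT)")
    conn.execute("CREATE TABLE child (child_id INTEGER, parent_id INTEGER)")
    ids = itertools.count(1)  # ids are handed out in depth-first order

    def add_node(kind, label, value, parent_id):
        node_id = next(ids)
        conn.execute("INSERT INTO nodes VALUES (?, ?, ?, ?)", (node_id, kind, label, value))
        if parent_id is not None:
            conn.execute("INSERT INTO child VALUES (?, ?)", (node_id, parent_id))
        return node_id

    def visit(element, parent_id):
        text = (element.text or "").strip() or None
        node_id = add_node("element", element.tag, text, parent_id)
        for name, value in element.attrib.items():
            add_node("attribute", name, value, node_id)
        for sub in element:
            visit(sub, node_id)

    visit(root, None)

conn = sqlite3.connect(":memory:")
store_tree(conn, ET.fromstring(
    "<bank><account account_number='A-101'><balance>500</balance></account></bank>"
))

# Children of the account node (id 2): one join between nodes and child.
rows = conn.execute(
    "SELECT n.type, n.label, n.value FROM nodes n "
    "JOIN child c ON n.id = c.child_id WHERE c.parent_id = 2 ORDER BY n.id"
).fetchall()
print(rows)
```

Every additional level of the path costs another join against `child`, which hints at why queries over this representation tend to need many joins.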
Tree Representation (Cont.)
Mapping XML Data to Relations
Storing XML Data in Relational Systems
SQL/XML
SQL Extensions
Web Services