A database that stores XML documents. An XML database is a system that allows data to be stored in XML format. These data can then be queried, exported and serialized into the desired format.
The term Relational Database Management System (RDBMS) was coined by E. F. Codd in the early 1970s. An RDBMS has a table structure with tuples and attributes.
XML was developed in the late 1990s with the advent of the Web. XML is a rework of SGML that answers some of the problems the Web faced as it was growing. XML has a tree structure with nodes and branches.
business-to-business or web-based applications. It is suitable for structured, unstructured and semi-structured data (XML documents can be data-centric or document-centric).
Relational databases are data-centric. XML is document-centric, although it can also be data-centric. This mismatch makes converting XML into a relational database a challenge. XML databases are a solution.
Data-centric documents are those containing structured data. Data appear in a regular order. In general, relational databases are efficient enough in storing data contained in data-centric XML documents
<Memo>
  <Meeting date="23/09/2005" time="10:30AM">Finance Committee</Meeting>
  <Purpose>Discuss 2006 Budget</Purpose>
  <Location>Room 923</Location>
</Memo>
Document-centric XML documents are those characterized by irregular structure and mixed content
<Memo>
  Please can Finance Committee members come to <Location>Room 923</Location> on
  <MeetingDate>23/09/2005</MeetingDate> at <MeetingTime>10:30 AM</MeetingTime> to
  <Purpose>discuss the 2006 budget</Purpose>
</Memo>
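The difference between the two memos can be checked mechanically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser; the helper names `has_mixed_content` and `to_record` are hypothetical and introduced only for illustration. A data-centric document flattens cleanly into a record, while a document-centric one carries text between its child elements.

```python
import xml.etree.ElementTree as ET

# The two memos from the text above.
data_centric = """
<Memo>
  <Meeting date="23/09/2005" time="10:30AM">Finance Committee</Meeting>
  <Purpose>Discuss 2006 Budget</Purpose>
  <Location>Room 923</Location>
</Memo>
"""

document_centric = (
    "<Memo>Please can Finance Committee members come to "
    "<Location>Room 923</Location> on "
    "<MeetingDate>23/09/2005</MeetingDate> at "
    "<MeetingTime>10:30 AM</MeetingTime> to "
    "<Purpose>discuss the 2006 budget</Purpose></Memo>"
)


def has_mixed_content(element):
    """Return True if non-whitespace text sits next to child elements."""
    if len(element) == 0:
        return False
    if element.text and element.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in element)


def to_record(element):
    """Flatten a data-centric element into a dict of child tag -> text."""
    return {child.tag: child.text for child in element}


data_root = ET.fromstring(data_centric)
doc_root = ET.fromstring(document_centric)

data_is_mixed = has_mixed_content(data_root)   # regular structure, no stray text
doc_is_mixed = has_mixed_content(doc_root)     # prose interleaved with elements
record = to_record(data_root)                  # maps naturally onto a table row
full_text = "".join(doc_root.itertext())       # the memo read as plain prose
```

The record form of the first memo is what a relational table stores well. The second memo only makes sense when the text and the element order are kept together.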
Unstructured data:
data can be of any type; it does not necessarily follow any format or sequence; it is not predictable; examples include text, video, sound, etc.
Structured data:
data is organized in semantic chunks (entities); similar entities are grouped together (relations or classes); entities in the same group have the same descriptions (attributes); descriptions for all entities in a group (schema) have the same defined format, have a predefined length, are all present and follow the same order.
In the semi-structured model there is no separation between the data and the schema, and the amount of structure used depends on the purpose.
Semi-structured data:
data is organized in semantic entities; similar entities are grouped together; entities in the same group may not have the same attributes; the order of attributes is not necessarily important; not all attributes may be required; the size of the same attribute may differ within a group; the type of the same attribute may differ within a group.
An XML-enabled database is actually a non-XML database. There is a layer between the XML document and the database, and this layer translates data between XML documents and tables (in the case of an RDBMS, for example). It needs to support querying, updating and storing XML data. SQL/XML or XQuery can be used to retrieve and modify data in the database.
Mapping is used to translate data between an XML document and a relational database.
1. table-based mapping:
The XML document has the same structure as a relational database: the data is grouped into rows, and rows are grouped into tables.
2. object-relational mapping:
The XML document is viewed as a set of serialized objects. Objects are mapped to tables, properties are mapped to columns, and inter-object relationships are mapped to primary key / foreign key relationships.
This approach is mostly used when the stored data is well structured and can be stored in a relational database easily. The XML layer is then needed only to translate the data between XML documents and tables in the database.
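Table-based mapping can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular XML-enabled product works: it assumes a document laid out as root / table / row / column, uses Python's built-in `sqlite3` as the relational store, and treats every column as text. The `Meeting` table, its second row, and the helpers `load_tables` and `dump_table` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed layout: <root><TableName><row><Column>value</Column>...</row>...</TableName></root>
xml_doc = """
<Tables>
  <Meeting>
    <row><Committee>Finance Committee</Committee><Date>23/09/2005</Date><Location>Room 923</Location></row>
    <row><Committee>Audit Committee</Committee><Date>24/09/2005</Date><Location>Room 101</Location></row>
  </Meeting>
</Tables>
"""


def load_tables(xml_text, conn):
    """XML -> relational: one SQL table per child of the root element."""
    root = ET.fromstring(xml_text)
    for table in root:
        rows = [{col.tag: col.text for col in row} for row in table]
        if not rows:
            continue
        columns = list(rows[0].keys())
        # Tag names are trusted here; a real layer would validate them.
        col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
        conn.execute(f'CREATE TABLE "{table.tag}" ({col_defs})')
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f'INSERT INTO "{table.tag}" VALUES ({placeholders})',
            [tuple(r[c] for c in columns) for r in rows],
        )


def dump_table(conn, table_name):
    """Relational -> XML: serialize one table back into row elements."""
    cursor = conn.execute(f'SELECT * FROM "{table_name}"')
    columns = [d[0] for d in cursor.description]
    table = ET.Element(table_name)
    for values in cursor:
        row = ET.SubElement(table, "row")
        for name, value in zip(columns, values):
            ET.SubElement(row, name).text = value
    return ET.tostring(table, encoding="unicode")


conn = sqlite3.connect(":memory:")
load_tables(xml_doc, conn)

# Once loaded, the data is queried with plain SQL.
locations = [r[0] for r in conn.execute(
    'SELECT "Location" FROM "Meeting" ORDER BY "Location"')]
round_trip = dump_table(conn, "Meeting")
```

Object-relational mapping follows the same idea but adds key columns so that nested elements become rows in related tables.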
A native XML database:
Defines a (logical) model for an XML document, as opposed to the data in that document, and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order.
Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.
Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files.
Relational database structure:

Airport ID    Airport Code
123           DAL
456           CHT
678           RDU
890           JFK

Airline ID    Airline Name         Airplane
12345         American Airlines    Boeing
98765         Delta Airlines       Airbus

Element         Node Type
Flights         Root
Flight          First node
Airline         Second node
Airplane        Second node
Dest-Airport    Second node
Native XML Databases (NXDs) are not meant to replace existing databases or XML; they are intended to provide storage and manipulation of XML documents.
XML offers many of the characteristics of a database: storage in the form of XML documents, schemas in the form of DTDs and XML Schema, query languages such as XQuery and XPath, and APIs such as DOM and JDOM.
XML lacks many of the other characteristics of a DBMS, such as indexing, transactions, data integrity, triggers, normalization and updates.
NXDs store data in the form of XML documents. This is useful for semi-structured data, because storing semi-structured data in a relational database is difficult.
Advantage: retrieval is faster, as no joins are needed to retrieve a document.
Disadvantage: it is difficult to retrieve a different view of the data.
Example: retrieving a particular flight instance is fast, whereas retrieving a list of all airline companies whose flights fly to RDU airport is difficult.
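The flight example can be made concrete with a small in-memory document. This is a sketch only, assuming the Flights / Flight / Airline / Airplane structure described earlier; the `id` attribute, the flight identifiers, the tag name `DestAirport` and both helper functions are hypothetical. Fetching one flight returns a ready-made subtree, while the airline list requires visiting every flight and regrouping the values.

```python
import xml.etree.ElementTree as ET

flights = ET.fromstring("""
<Flights>
  <Flight id="AA100"><Airline>American Airlines</Airline><Airplane>Boeing</Airplane><DestAirport>RDU</DestAirport></Flight>
  <Flight id="DL200"><Airline>Delta Airlines</Airline><Airplane>Airbus</Airplane><DestAirport>JFK</DestAirport></Flight>
  <Flight id="DL300"><Airline>Delta Airlines</Airline><Airplane>Boeing</Airplane><DestAirport>RDU</DestAirport></Flight>
  <Flight id="AA400"><Airline>American Airlines</Airline><Airplane>Airbus</Airplane><DestAirport>DAL</DestAirport></Flight>
</Flights>
""")


def get_flight(root, flight_id):
    """Document-shaped access: the whole flight comes back as one subtree."""
    return root.find(f"Flight[@id='{flight_id}']")


def airlines_flying_to(root, airport_code):
    """A different view of the data: scan every flight, then deduplicate."""
    airlines = set()
    for flight in root.findall("Flight"):
        if flight.findtext("DestAirport") == airport_code:
            airlines.add(flight.findtext("Airline"))
    return sorted(airlines)


one_flight = get_flight(flights, "DL300")
one_flight_airplane = one_flight.findtext("Airplane")
rdu_airlines = airlines_flying_to(flights, "RDU")
```

In a relational design the second question is a join over indexed tables. Here it touches every stored flight unless the database maintains an index for it.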
Querying: Most NXDs support XPath and XQuery for querying. XPath is the most commonly used query language for NXDs, but it lacks functionality such as grouping, sorting and cross-document joins. XQuery overcomes these shortcomings.
Updates: XUpdate is used for updating native XML databases. It uses XPath to identify a set of nodes and then specifies whether to insert or delete these nodes, or to insert new nodes before or after them.
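The select-then-modify pattern described for XUpdate can be imitated in plain Python. This is not XUpdate syntax and not the API of any NXD. It is a sketch assuming `xml.etree.ElementTree`, whose `find` and `findall` accept a limited XPath subset, and the helpers `insert_after` and `remove_matching` are hypothetical names. The flight identifiers are invented for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<Flights>"
    "<Flight id='AA100'><Airline>American Airlines</Airline></Flight>"
    "<Flight id='DL200'><Airline>Delta Airlines</Airline></Flight>"
    "</Flights>"
)


def insert_after(parent, target, new_element):
    """Insert new_element as the sibling that follows target."""
    index = list(parent).index(target)
    parent.insert(index + 1, new_element)


def remove_matching(parent, path):
    """Delete the direct children of parent selected by path; return the count."""
    matches = parent.findall(path)
    for element in matches:
        parent.remove(element)
    return len(matches)


# Step 1: a path expression identifies the node to work relative to.
target = doc.find("Flight[@id='AA100']")

# Step 2: the update says what to do there (insert a new node after it).
new_flight = ET.Element("Flight", {"id": "DL150"})
ET.SubElement(new_flight, "Airline").text = "Delta Airlines"
insert_after(doc, target, new_flight)

# A delete is the same pattern: select by path, then remove.
removed = remove_matching(doc, "Flight[@id='DL200']")

flight_ids = [f.get("id") for f in doc.findall("Flight")]
```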
Structural indexes
They index the location of elements and attributes. They are used to resolve queries such as "Find all Airline elements". Value indexes and structural indexes are combined to resolve queries such as "Find all Airline elements whose value is American Airlines".
Full-text indexes
They index individual tokens in text and attribute values. They are used to resolve queries such as "Find all documents that contain the words American Airlines". They are used together with structural indexes for queries such as "Find all documents that contain the words American Airlines inside an Airline element".
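The index types above can be sketched with ordinary dictionaries. This is a toy illustration, not the storage layout of any real NXD: the three documents, their identifiers and the `Note` element are hypothetical, and the full-text index records only which tag a token occurred in, not the exact element instance.

```python
import re
from collections import defaultdict
import xml.etree.ElementTree as ET

documents = {
    "flight-1": "<Flight><Airline>American Airlines</Airline><Airplane>Boeing</Airplane></Flight>",
    "flight-2": "<Flight><Airline>Delta Airlines</Airline><Airplane>Airbus</Airplane></Flight>",
    "flight-3": "<Flight><Airline>Delta Airlines</Airline><Note>Codeshare with American Airlines</Note></Flight>",
}


def tokenize(text):
    """Lower-case word tokens."""
    return re.findall(r"\w+", text.lower())


structural = defaultdict(set)   # tag -> documents containing that element
value = defaultdict(set)        # (tag, exact text) -> documents
fulltext = defaultdict(set)     # token -> (document, tag it occurred in)

for doc_id, xml_text in documents.items():
    for element in ET.fromstring(xml_text).iter():
        structural[element.tag].add(doc_id)
        text = (element.text or "").strip()
        if text:
            value[(element.tag, text)].add(doc_id)
        for token in tokenize(text):
            fulltext[token].add((doc_id, element.tag))


def docs_containing(words, inside_tag=None):
    """Documents where all words occur under one tag (optionally a given tag)."""
    hits = None
    for token in tokenize(words):
        postings = {(d, t) for d, t in fulltext.get(token, set())
                    if inside_tag in (None, t)}
        hits = postings if hits is None else hits & postings
    return sorted({d for d, _ in (hits or set())})


all_airline_docs = sorted(structural["Airline"])                       # structural
american_by_value = sorted(value[("Airline", "American Airlines")])    # value + structural
anywhere = docs_containing("American Airlines")                        # full-text
inside_airline = docs_containing("American Airlines", "Airline")       # full-text + structural
```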
As with relational databases, normalization can be applied to NXDs, but XML supports multi-valued properties. NXDs can therefore be normalized even when they have multi-valued attributes, which makes the first normal form (1NF) of relational databases meaningless in the context of NXDs. Normalization is thus a non-issue for many NXDs.
Native XML databases support APIs. These generally offer an ODBC-like interface, with methods for connecting to the database and retrieving results. Results are returned in the form of an XML string, a DOM tree or an XML reader. Two commonly used APIs are:
The need to combine the features of both native and XML-enabled databases has led to the creation of a new category of databases called hybrid XML databases. Hybrid XML databases are usually relational database products extended with native XML support. They are ideal for applications that at one point stored data in relational form but now need to move to the XML world; performing such a data transformation within a single DBMS greatly simplifies the task.
XML database products (table only partly recoverable): the types listed include Relational-XML Enabled, Native XML Database, and Hybrid (XML support since version 9i); IBM appears among the vendors and Commercial among the licenses.
Ref: XML Database Products Copyright 2000-2010 by Ronald Bourret Last updated on: June 20, 2010
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML <size>100</size></title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
/bookstore/book/title
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML <size>100</size></title>
/bookstore/book[price>30]/title
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML <size>100</size></title>
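The two path expressions can be tried outside a database as well. The sketch below assumes Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath: paths are relative to the element they are called on, and comparison predicates such as `[price>30]` are not supported, so the price filter is written in Python instead. The bookstore is trimmed to titles and prices for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bookstore document: authors, years and <size> omitted.
bookstore = ET.fromstring("""
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
""")

# Equivalent of /bookstore/book/title, evaluated relative to the root element.
all_titles = [title.text for title in bookstore.findall("book/title")]

# Equivalent of /bookstore/book[price>30]/title, with the predicate done in Python.
expensive_titles = [
    book.findtext("title")
    for book in bookstore.findall("book")
    if float(book.findtext("price")) > 30
]
```

The book priced at exactly 30.00 is excluded because the comparison is strictly greater than 30.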
The output is in HTML format. The title element itself is eliminated, and only the data inside the title element is shown.
XML Databases - George Papamarkos, Lucas Zamboulis, Alexandra Poulovassilis, School of Computer Science and Information Systems, Birkbeck College, University of London (http://www.dcs.bbk.ac.uk/~sven/adm08/xmlDBs.pdf)
Native XML Databases vs. Relational Databases in Dealing with XML Documents - Gordana Pavlovic-Lazetic, Kragujevac J. Math. 30 (2007) 181-199
http://www.w3schools.com (Examples)
Keyword Search over Hybrid XML-Relational Databases - Liru Zhang, Tadashi Ohmori and Mamoru Hoshi
http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/toc.html