Views of an XML document - Syntax of XML - XML Document Structure - Namespaces - XML
Schemas - Simple XML documents - Different forms of markup that can occur in XML
documents - Document Type declarations - Creating XML DTDs
I INTRODUCTION TO SGML
Markup language refers to the traditional way of marking up a document. It determines the
structure and meaning of textual elements. It consists of codes and tags that are added to the text
to change the look or meaning of the text or document.
It is used to generate the code that is specific to a particular application. Examples are
Generalized Markup Language (GML) was created to solve some of the problems associated with
porting documents from one platform and operating system configuration to another. GML was
introduced by Dr. C. F. Goldfarb in the 1960s. It was first developed for IBM.
SGML Structure
An SGML application consists of two parts: the SGML declaration and the SGML DTD (Document Type
Definition).
SGML Declaration - The declaration part identifies the characters to be used in a document. It
provides a way to identify the objects that will be used throughout the SGML document. These
objects are called entities.
SGML DTD - In the Document Type Definition we list the element types we wish to use in
the document and indicate the structural order in which they can occur.
SGML Features
XML FEATURES
6. XML has the ability to work with HTML for data display and presentation.
7. It is a standard language used to structure and describe data so that it can be understood by
different applications.
9. XML tags are not predefined. You must define your own tags.
12. XML includes a specification for a hyperlinking scheme, which is described as a separate
language called eXtensible Linking Language (XLL).
13. Every XML document consists of data and markup. You can literally tag up your data with
your own tags.
14. XML can be used as a data interchange format. Since the XML text format is standards
based, data can be converted and then easily read by another system or application.
SGML is a very powerful, very general, standard markup language, but with
that power comes increased complexity.
XML is a subset of SGML intended to make SGML light enough for use on
the web.
As XML is a proper subset of SGML, all XML documents are valid SGML
documents, but not all SGML documents are valid XML documents.
SGML
XML
The complexity of implementing SGML's power limits its users to big companies
that need all that power. Hence XML arrived: a simplified SGML that retains most of the
inherent power of SGML in a simple, tidy, easy-to-use and easy-to-implement form.
Since XML is optimized for use on the World Wide Web, it is designed in such a
way that it has some benefits that are not found in SGML.
XML is a smaller language than SGML because the designers of XML
removed parts of the SGML specification that were not needed for web delivery.
COMPARISON OF HTML AND XML
HTML: It is used for displaying information and for formatting the document.
XML: It is designed to describe data and to focus on what the data is.
INTERNAL DTD
Internal DTDs are also known as the internal subset. This is a sample XML document with an
internal DTD:
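<?xml version="1.0"?>
<!DOCTYPE mail [
<!ELEMENT mail (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<mail>
<to>Rani</to>
<from>Ravi</from>
<heading>Reminder</heading>
<body>---</body>
</mail>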
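The internal-subset idea can be tried with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module, which is a non-validating parser: it reads past the DOCTYPE and builds the element tree, but it does not check the document against the declarations. The from, heading and body declarations and the placeholder body text are assumptions added only so that the sample is complete.

```python
import xml.etree.ElementTree as ET

# Sample document with an internal DTD subset.
# The <body> content "---" is a placeholder, not taken from the notes.
MAIL_XML = (
    '<?xml version="1.0"?>\n'
    "<!DOCTYPE mail [\n"
    "<!ELEMENT mail (to,from,heading,body)>\n"
    "<!ELEMENT to (#PCDATA)>\n"
    "<!ELEMENT from (#PCDATA)>\n"
    "<!ELEMENT heading (#PCDATA)>\n"
    "<!ELEMENT body (#PCDATA)>\n"
    "]>\n"
    "<mail><to>Rani</to><from>Ravi</from>"
    "<heading>Reminder</heading><body>---</body></mail>"
)

# ElementTree parses the document; the DTD is read but not enforced.
root = ET.fromstring(MAIL_XML)
child_tags = [child.tag for child in root]

print(root.tag)     # name of the root element
print(child_tags)   # names of the children, in document order
```

A validating parser would additionally reject the document if, for example, the to element were missing.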
EXTERNAL DTD
If the DTD is external, it must be identified in the Document Type Declaration with either the
SYSTEM or the PUBLIC keyword.
With SYSTEM, the DTD resides on the local hard disk and may not be available for use by other
applications. The external subset, if present, consists of a reference to an external entity
following the DOCTYPE keyword, as illustrated here:
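//mail.xml
<?xml version="1.0"?>
<!DOCTYPE mail SYSTEM "mail.dtd">
<mail>
<to>Rani</to>
<from>Ravi</from>
<heading>Reminder</heading>
<body>---</body>
</mail>
//mail.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT mail (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>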
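To see what a parser learns from an external reference, the sketch below assumes Python's standard xml.parsers.expat module and its StartDoctypeDeclHandler callback. The DOCTYPE line with SYSTEM "mail.dtd" is an illustrative assumption; expat is non-validating and, as configured here, does not fetch mail.dtd at all. It only reports the identifiers.

```python
import xml.parsers.expat

# Document that points at an external DTD through a SYSTEM identifier.
MAIL_XML = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE mail SYSTEM "mail.dtd">\n'
    "<mail><to>Rani</to><from>Ravi</from></mail>"
)

doctype = {}

def on_doctype(name, system_id, public_id, has_internal_subset):
    # Called once, when the parser reaches the DOCTYPE declaration.
    doctype.update(
        name=name,
        system_id=system_id,
        public_id=public_id,
        has_internal_subset=bool(has_internal_subset),
    )

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.Parse(MAIL_XML, True)

print(doctype)
```

With a PUBLIC declaration the public_id field would carry the public identifier instead of None.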
The DTD can be housed entirely in the external subset, entirely in the internal subset, or split between both.
//apples.dtd
<!ELEMENT apples (#PCDATA)>
//apples.xml
<apples color="green">12</apples>
Internal DTD
//apples.xml
<!DOCTYPE apples [
<!ELEMENT apples (#PCDATA)>
]>
<apples>12</apples>
ELEMENT TYPE DECLARATION
Every element in a valid XML document must have its element type declared in the DTD.
To validate an XML document, a validating parser needs to know three things about each
element:
1) What the element type is named
2) What elements of that type can contain (the content model)
3) What attributes an element of that type has
The element type name and its content model are declared together in what is known as an
element type declaration.
An element type declaration must start with the string <!ELEMENT, followed by the name and
the content specification.
Every element has certain allowed content. There are four general types of content
specification:
1) EMPTY content: the element may not have content
<contact>
<name>aaa</name>
<address>SJCET,pala</address>
<phone>239301</phone>
</contact>
<contact>
<name>aaa</name>
<phone>239301</phone></contact>
<fruit><apple>---</apple></fruit>
<fruit>
<apple>---</apple>
<orange>--</orange>
</fruit>
Ex2:<fruit>
<orange>--</orange></fruit>
Ex2:<fruit></fruit>
<list>---</list></para>
Ex2:<para>aaa,bbb,ccc</para>
Ex3:<para><list>---</list></para>
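A content model behaves much like a regular expression over the names of an element's children. The sketch below is a simplified illustration of that idea, not how a real validating parser is implemented. The models "(name, address?, phone)" and "(apple|orange)*" are assumptions inferred from the contact and fruit examples, and the helper names model_to_regex and conforms are hypothetical. Only element content is handled; EMPTY, ANY and #PCDATA are out of scope.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model):
    """Turn a simple element-content model into a regex over child names."""
    compact = re.sub(r"\s+", "", model)
    # Each child name becomes the token "name;" so that ?, *, + and |
    # keep their DTD meaning; a comma (sequence) becomes concatenation.
    tokens = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + m.group(0) + ";)",
        compact,
    )
    return re.compile(tokens.replace(",", ""))

def conforms(model, element):
    """Check the direct children of element against the content model."""
    children = "".join(child.tag + ";" for child in element)
    return model_to_regex(model).fullmatch(children) is not None

CONTACT_MODEL = "(name, address?, phone)"   # assumed declaration
FRUIT_MODEL = "(apple|orange)*"             # assumed declaration

full = ET.fromstring(
    "<contact><name>aaa</name><address>SJCET,pala</address>"
    "<phone>239301</phone></contact>"
)
short = ET.fromstring("<contact><name>aaa</name><phone>239301</phone></contact>")
wrong = ET.fromstring("<contact><phone>239301</phone><name>aaa</name></contact>")
basket = ET.fromstring("<fruit><apple>---</apple><orange>--</orange></fruit>")
empty_basket = ET.fromstring("<fruit></fruit>")

print(conforms(CONTACT_MODEL, full), conforms(CONTACT_MODEL, short))
print(conforms(CONTACT_MODEL, wrong))
print(conforms(FRUIT_MODEL, basket), conforms(FRUIT_MODEL, empty_basket))
```

The address element may be omitted because of the ? occurrence indicator, while swapping name and phone breaks the sequence order and is rejected.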
Attributes need to be declared in the DTD so that a validating XML parser can check that they have
been used properly in an XML document.
You can also have multiple attribute list declarations for a single element:
<!ATTLIST person email CDATA #REQUIRED>
<!ATTLIST person phone CDATA #REQUIRED>
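The two declarations for person can be inspected programmatically. This sketch assumes Python's standard xml.parsers.expat module and its AttlistDeclHandler callback, which reports each declared attribute with its element name, attribute name, type, default value and a required flag. The surrounding DOCTYPE, the EMPTY content model and the sample attribute values are assumptions added to make the document parseable.

```python
import xml.parsers.expat

PERSON_XML = (
    "<!DOCTYPE person [\n"
    "<!ELEMENT person EMPTY>\n"
    "<!ATTLIST person email CDATA #REQUIRED>\n"
    "<!ATTLIST person phone CDATA #REQUIRED>\n"
    "]>\n"
    '<person email="aaa@example.com" phone="239301"/>'
)

declared = []

def on_attlist(elname, attname, att_type, default, required):
    # One call per declared attribute, even across separate ATTLISTs.
    declared.append((elname, attname, att_type, default, bool(required)))

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(PERSON_XML, True)

for entry in declared:
    print(entry)
```

Both attributes are reported against the same element, which shows that splitting them over two ATTLIST declarations is equivalent to listing them in one.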
Each attribute in a declaration has three parts: a name, type and default value. The table below
shows the partial attribute list declaration
ATTRIBUTE TYPES
String attribute types (attribute type: description)
CDATA: CDATA stands for character data, that is, text that does not form markup.
Tokenized attribute types (attribute type: description)
NMTOKEN: The first character of an NMTOKEN value must be a letter, digit, '.', '-', '_', or ':'.
Enumerated attribute types
CDATA Example:
<?xml version="1.0"?>
<!DOCTYPE image [
<!ELEMENT image EMPTY>
<!ATTLIST image height CDATA #REQUIRED>
<!ATTLIST image width CDATA #REQUIRED>
]>
<image height="32" width="32"/>
ID Example:
<?xml version="1.0"?>
<!DOCTYPE student_name [
<!ELEMENT student_name (#PCDATA)>
<!ATTLIST student_name student_no ID #REQUIRED>
]>
<student_name student_no="a9216735">Jo Smith</student_name>
IDREF Example:
ENTITY Example:
ENTITIES Example:
NMTOKEN Example:
<?xml version="1.0"?>
<!DOCTYPE student_name [
<!ELEMENT student_name (#PCDATA)>
<!ATTLIST student_name student_no NMTOKEN #REQUIRED>
]>
<student_name student_no="9216735">Jo Smith</student_name>
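The ID example uses the value a9216735 while the NMTOKEN example uses 9216735. The reason is that an ID value must be an XML name, which cannot begin with a digit, whereas a name token may begin with any name character. The sketch below illustrates the difference with two hypothetical helpers, is_name and is_nmtoken. It is an ASCII-only simplification: the XML specification also allows many non-ASCII characters.

```python
import re

# ASCII-only approximation of the XML name character classes.
NAME_START = r"[A-Za-z_:]"
NAME_CHAR = r"[A-Za-z0-9._:\-]"

def is_nmtoken(value):
    """One or more name characters; may start with a digit."""
    return re.fullmatch(NAME_CHAR + "+", value) is not None

def is_name(value):
    """A name start character followed by name characters (needed for ID)."""
    return re.fullmatch(NAME_START + NAME_CHAR + "*", value) is not None

for candidate in ["9216735", "a9216735", "Jo Smith"]:
    print(candidate, is_nmtoken(candidate), is_name(candidate))
```

Besides being a name, an ID value must also be unique within the document, which this check does not cover.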
ATTRIBUTE DEFAULTS
An attribute list declaration includes information about whether or not a value must be supplied
for the attribute and, if not, what the XML processor should do.
2) Implied ---> The XML processor tells the application that no value was supplied. The
application can decide what best to do.
3) Fixed ---> A value is supplied in the declaration. No value need be supplied in the document,
and the XML processor will pass the specified fixed value through to the application. If a value is
supplied in the document, it must exactly match the fixed value.
#REQUIRED - The attribute must have an explicitly specified value on every occurrence of
the element in the document.
An element of type product has an attribute called name whose value can be any string of
characters, with < and & written as entity references. The value must be supplied whenever the
element is used in the document.
<!ATTLIST product name CDATA #REQUIRED>
<product name="Acmepc">
In this example the type attribute of the fruit element is declared to be required.
<!DOCTYPE fruit [
<!ELEMENT fruit EMPTY>
<!ATTLIST fruit type CDATA #REQUIRED>
]>
Valid: <fruit type="apple"/>
Invalid: <fruit/>
#IMPLIED - These are attributes that can be left unspecified if desired. The XML processor
passes the fact that the attribute was unspecified through to the XML application, which can
then choose what best to do.
Valid document:
<!DOCTYPE fruit [
<!ELEMENT fruit EMPTY>
<!ATTLIST fruit type CDATA #IMPLIED>
]>
<fruit/>
An element of type product has an attribute called color. The color attribute must be either the
string red or the string green. If the value is not supplied, it is left to the XML application to
decide what to do.
<!ATTLIST product color (red|green) #IMPLIED>
#FIXED - An attribute declaration may specify that an attribute has a fixed value. In this case
the attribute is not required, but if it occurs it must have the specified value.
An element of type product has an attribute called name with the fixed value Acmepc. Any
other value is an error.
<!ATTLIST product name CDATA #FIXED "Acmepc">
<product name="Acmepc">
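The effect of #FIXED and #IMPLIED on what the application receives can be observed with a parser that reads the internal subset. This sketch assumes Python's standard xml.parsers.expat module, which by default reports attribute values derived from declarations alongside those written in the document. The declarations shown are assumptions reconstructed from the product descriptions, and expat is non-validating, so it will not reject a wrong fixed value or a missing required attribute.

```python
import xml.parsers.expat

PRODUCT_XML = (
    "<!DOCTYPE product [\n"
    "<!ELEMENT product EMPTY>\n"
    '<!ATTLIST product name CDATA #FIXED "Acmepc">\n'
    "<!ATTLIST product color (red|green) #IMPLIED>\n"
    "]>\n"
    "<product/>"  # neither attribute is written in the document
)

seen = []

def on_start(tag, attrs):
    # attrs holds specified attributes plus defaults from the DTD.
    seen.append((tag, dict(attrs)))

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = on_start
parser.Parse(PRODUCT_XML, True)

print(seen)
```

The fixed name value is passed through even though the document omits it, while the implied color attribute is simply absent, leaving the decision to the application.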