
UNIT IV

Introduction to WYSIWYG design tools


Introduction to Dreamweaver
Website creation and maintenance
Web hosting and Publishing Concepts

WYSIWYG
Pronounced WIZ-zee-wig. Short for what you see is what you get. A WYSIWYG
application is one that enables you to see on the display screen exactly what will
appear when the document is printed. This differs, for example, from word processors
that are incapable of displaying different fonts and graphics on the display screen even
though the formatting codes have been inserted into the file. WYSIWYG is especially
popular for desktop publishing.

The actual meaning depends on the user's perspective, e.g.


In presentation programs, compound documents and web pages, WYSIWYG means
the display precisely represents the appearance of the page displayed to the end-
user, but does not necessarily reflect how the page will be printed unless the printer
is specifically matched to the editing program, as it was with the Xerox Star and
early versions of the Apple Macintosh.
In word processing and desktop publishing applications, WYSIWYG means that the
display simulates the appearance and represents the effect of fonts and line breaks
on the final pagination using a specific printer configuration, so that, for example, a
citation on page 1 of a 500-page document can accurately refer to a reference three
hundred pages later.
WYSIWYG also describes ways to manipulate 3D models in stereo-chemistry,
computer-aided design, and 3D computer graphics.

XML
Introduction
Features of Markup Languages
Difference Between HTML and XML
Advantage of HTML over other mark up languages
Drawback of HTML
XML Naming Rules
Building Block of XML Document
XML Schema
Components of XML
XML Parsers
DTDs using XML with HTML and CSS.

Introduction

Extensible Markup Language (XML) is a markup language that defines a set of rules
for encoding documents in a format that is both human-readable and machine-readable.
It is defined in the XML 1.0 Specification produced by the W3C, and several other
related specifications, all gratis open standards. The design goals of XML emphasize
simplicity, generality, and usability over the Internet. It is a textual data format with
strong support via Unicode for the languages of the world. Although the design of
XML focuses on documents, it is widely used for the representation of arbitrary data
structures, for example in web services.
Many application programming interfaces (APIs) have been developed to aid software
developers with processing XML data, and several schema systems exist to aid in the
definition of XML-based languages. As of 2009, hundreds of XML-based languages
have been developed, including RSS, Atom, SOAP, and XHTML. XML-based formats
have become the default for many office-productivity tools, including Microsoft Office
(Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple's
iWork. XML has also been employed as the base language for communication
protocols, such as XMPP.

Features of Markup Language


Here is an introduction to what is called "markup" and to markup languages. Markup
is anything added to a text document that conveys extra information. That is, if we
want to display a word in italic or in bold, we apply the corresponding markup tags
to that particular word. Such markup describes how the document should appear on
the screen or on the printed page. Markup is used extensively in electronic
documents, such as word processor files and LaTeX files, so that the marked-up words
are displayed on the screen and on the printed page in the intended format.

Markup languages are a product of the information age. They are a formalization of the
codes used to markup the content of electronic documents that is, a set of conventions
defining things such as a) what marks or elements are allowed, b) where elements may
occur, c) whether any or all of the elements must occur somewhere in a document to
which the language has been applied.

There are two types of markup and hence two types of markup languages:
procedural and generalized.
Procedural Markup Languages - Procedural markup is typified by its use in typesetting
and publishing systems, including word processors. The elements are placed right in
the flow of text, and the markup languages that define them have the following
characteristics:

1. Documents marked up with procedural markup languages contain clear instructions
for the document-rendering program, so that it produces output of the original
content in a particular format and style.

2. The formatting instructions are likely to be specific to the output medium, so the
document containing the original content interspersed with markup is not portable
across different output media.

A common procedural markup language is the Rich Text Format (RTF). It has
markup elements such as \par for paragraph, \b for bold, \i for italic and so on. This
sort of markup language is good for formatting tasks if the documents are always
destined for the printed page or any other single medium. PostScript and TeX
are among the most popular procedural markup languages.

Generalized Markup Languages (GML) - Procedural markup languages have a major
shortcoming: if we intend to extract information from the documents, procedural
markup is found wanting. To meet this challenge, a generalized markup language
marks up documents in a different way. The characteristics of these languages are:

1. The elements have logical names, rather than expressing detailed formatting
instructions. For example, an element H1 is being used to mark up text that is
intended to be a first-level header.

2. Software applications that read documents marked up using a GML are free to
present them as they see fit, using formatting rules for particular elements that are
either defined internally, or specified elsewhere. When displayed on the screen, the
H1 element could be associated with a particular combination of font size and
weight.

GML elements usually involve both start and end tags so that the original content is
fully contained inside an element. The markup gives no hint about how the document
should be presented; web browsers are free to decide how to reflect the meaning of
these elements. The most commonly used generalized markup languages are HTML for
web documents and WML for mobile content.

Generalized Markup Rule-sets - Each GML has its own elements, its own rules, and
its own particular area of application, and for the language to function properly
as a language, these must all be defined somewhere. GMLs are themselves written using
Generalized Markup Rule-sets (GMRS), also called meta-languages. The two best-known
meta-languages are the Standard Generalized Markup Language (SGML) and the
Extensible Markup Language (XML). XML is a subset of SGML, and WML is derived
from XML.

SGML is an international standard designed to integrate documents in different
proprietary formats, and to enable sharing of documents among text editing,
formatting, and retrieval subsystems. SGML has been approved by the International
Organization for Standardization (ISO). SGML is not itself used to mark up a
document; it is a meta-language used to create markup languages that suit different
application domains.

The basic design principles of SGML emphasize the importance of separating
formatting instructions from content. When Internet Explorer displays an HTML
document that contains h1 and em elements, it does so by using an internal stylesheet
that specifies how these elements are to be represented on the screen. In general, it is
possible to create several stylesheets that contain instructions for outputting the
content of a marked-up document to a variety of output devices. SGML insists that the
names of the elements describe what their content represents in the application
domain.

Document Type Definition (DTD)


A Document Type Definition (DTD) is used to define the set of valid elements for a
particular GML as well as the content model of each element. To facilitate electronic
processing, documents marked up using GMLs are highly structured, and the notion of
elements being able to contain other elements is a powerful one in SGML. For
instance, a DTD can require that one element always contains one or more child
elements, each of which in turn contains further elements.

There are a couple of rules for using SGML. The elements in SGML-based languages
are not case sensitive, unless specified explicitly in the DTD. SGML allows tag
omission, cross-element nesting and mixed-case element names in its applications.
Finally, a DTD is required for every SGML document so that its validity can be
checked. This means that a document cannot use any element that is not declared in
the DTD.

SGML, as a meta-language, is used to derive markup languages for different
communities such as publishers, academic institutions, and government organizations.
These derived languages are the applications of SGML. Hypertext Markup Language
(HTML) is one such application: its DTD defines the elements used to describe
content in a web document.
HTML has been an excellent standard for web publishing. It has been developed into a
powerful tool that provides a wealth of well-defined presentation elements for marking
up web documents. However, we cannot make use of elements that are specific to
another application domain to structure our document content. That is, we are confined
to the set of HTML elements in all application domains. The solution for this issue
came in the form of Extensible Markup Language (XML), a subset of SGML.

XML is intended to retain SGML's ability to define new sets of elements. That is, XML
is a meta-language for creating other markup languages. XML documents contain
markup that describes precisely what the marked-up content is. An XML document
uses plain text to store data, making delivery of information over the Internet
easy, fast, and independent of any particular platform. XML should support
document publishing as strongly as HTML does, while upholding the goal of separating
presentation from data; stylesheets are used for this purpose. XML also enforces
stricter rules on its applications than SGML does. This reduces the complexity of
software that processes XML documents, such as XML parsers, and ultimately paves
the way for better performance.
Another markup language, Extensible HTML (XHTML), was later introduced for web
publishing. XHTML is a reformulation of HTML 4 in XML. XHTML was designed to
avoid some undesirable features of HTML.

HTML defines a fixed set of elements that any HTML document can use. There is no
flexible mechanism to extend the set of elements as needs arise in our application
domain, for example to include newer and more informative tags. XHTML, being based
on XML, has been blessed with the feature of expanding the element set. Also, HTML
documents cannot easily be used for data processing by other software, and rendering
information in exactly the intended format is harder to achieve using HTML.

XHTML & HTML


XHTML is a reformulation of HTML as an XML application. The transitional form
preserves many of the basic presentation features of HTML 4.0 Transitional but
applies the strict syntax rules of XML to HTML. All (X)HTML documents should
follow a formal structure defined by the W3C, which is the primary organization that
defines Web standards.
W3C defined HTML as an application of the Standard Generalized Markup
Language (SGML).
SGML is a technology used to define markup languages by specifying the allowed
document structure in the form of document type definition (DTD).
The DTD defined in XML for the XHTML language is actually similar to the DTD
defined for traditional HTML.
The structure of an XHTML document is pretty much the same with the exception
of a different <!DOCTYPE> indicator and an xmlns (XML name space) attribute
added to the HTML tag so that it is possible to intermix XML more easily into the
XHTML document.
Difference between HTML & XML
HTML | XML
HTML was designed to display data, with the focus on how data looks. | XML was designed to be a software- and hardware-independent tool used to transport and store data.
HTML is a markup language. | XML provides a framework for defining markup languages.
HTML is case insensitive. | XML is case sensitive.
HTML is a presentation language. | XML is neither a programming language nor a presentation language.
HTML is used to design a web page to be rendered at the client side. | XML is used to transport data between an application and a database.
HTML has its own pre-defined tags. | In XML, the author of the document can define and invent custom tags.
HTML is not strict, i.e., there is no error if the user omits a closing tag. | In XML, every tag that is opened must be closed.
HTML does not preserve white space. | XML preserves white space.
HTML is about displaying data, thus static. | XML is about carrying information, thus dynamic.

Advantage of HTML over other markup languages


HTML is generally very well accepted across various display programs: email
readers, web browsers, word processors, spreadsheets, and so on. Many programs
know how to read, display, or output HTML, so it can be very handy for
universality.
It's easy to learn (though not easy to master).
It's universal.
It's versatile.
It's relatively compact.
It's human readable.

Drawback of HTML
In formatting, HTML is weakened by its complexity and incompatibility.
Browsers don't all adhere to a single standard, whereas other formats (.doc files,
.txt files, etc.) are more universal and/or proprietary, so they will always look
the same.
HTML is static. It's not a programming language, it's a markup language, so you
can't do things like save user input, let users log in, etc. You can fake it by inserting
JavaScript and other applications in there, but HTML files themselves aren't
dynamic.
HTML is ugly to read. Because it's often so complex and in-depth, there aren't any
adopted standards for formatting or composition. So interpreting what's going on is
often very difficult.
HTML is often not "pure". It often has JavaScript or CSS mixed in, and is
integrated into templates like PHP or ColdFusion, or even made quasi-dynamic with
SSI (Server-Side Includes).
HTML is able to specify the content and format of document, but not the structure
of it.
HTML is not extensible: it does not allow custom tags.
HTML provides only one view of data: it is difficult to write HTML that displays
the same data in different ways based on the requests made by the user.
HTML is very display-centric and not structure-oriented: HTML does give the web
designer a lot of flexibility when it comes to displaying data. It can make text
appear in bold or italics and offers features for the alignment of images. However,
it has little or no semantic structure. It represents data by layout rather than by its
meaning. Representing data by meaning rather than by layout offers a lot of
advantages when efficient searches are to be conducted by search engines.

Introduction
XML is a language used to create other markup languages to describe data in a
structured manner.
XML documents contain only data, not formatting instructions, so applications that
process XML documents must decide how to manipulate or display the document's
data.
Programmers use the Extensible Stylesheet Language (XSL) to specify rendering
instructions for different platforms. XML elements describe data, so XML processing
programs can search, sort, manipulate and render XML documents using technologies
such as XSL.
XML permits document authors to create markup for virtually any type of
information. This extensibility enables document authors to create entirely new
markup language for describing data such as mathematical formulas, chemical
molecular structure, music, news and recipes.
XML documents are highly portable. Viewing or modifying an XML document
which typically ends with the .xml filename extension does not require special
software. Any text editor which supports ASCII/ Unicode characters can open XML
documents for viewing and editing.
XML is both human readable and machine readable.
Processing an XML document requires a software program called an XML parser or
XML processor. A parser checks an XML document's syntax and enables software
programs to process marked-up data. XML parsers can support the Document
Object Model (DOM) or the Simple API for XML (SAX).
An XML document can reference a Document Type Definition (DTD) or a schema
that defines the proper structure of the XML document. When an XML document
references a DTD or a schema, some parsers called validating parsers can read the
DTD/ schema and check that the XML document follows the structure defined by
the DTD/ schema. If the XML document conforms to the DTD/ Schema, the XML
document is valid. Parsers that cannot check for document conformity against
DTDs/ Schema are nonvalidating parsers. If an XML parser can process an XML
document successfully, that XML document is well-formed.
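
The difference between a well-formed and a malformed document can be seen with a
non-validating parser. The following is a minimal sketch in Python, assuming the
standard library module xml.etree.ElementTree as the parser; the helper name
is_well_formed and the two sample strings are made up for illustration. Because this
parser is non-validating, it checks syntax only and never consults a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser can process the text, i.e. it is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for syntax errors such as a missing or mismatched end tag.
        return False


# Hypothetical sample documents (not from the notes above).
good = "<article><title>Simple XML</title></article>"
bad = "<article><title>Simple XML</article>"  # <title> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

Checking validity against a DTD or schema would need a validating parser, which this
sketch does not attempt.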

Example code
<?xml version="1.0"?>
<article>
  <title> Simple XML </title>
  <date> April 12, 2013 </date>
  <author>
    <firstName> Ruchi </firstName>
    <lastName> Kawatra </lastName>
  </author>
  <summary> XML is very easy </summary>
  <content> XML is very easy once you have learned HTML. XML is not for displaying
  information but for managing information </content>
</article>

Every XML document must contain exactly one root element, which encompasses all
other elements. Here article is the root element, and the lines that precede the
root element form the XML prolog.
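
As a rough illustration of the root element and its children, the sketch below parses
a shortened copy of the article document with Python's standard
xml.etree.ElementTree module. The variable names are illustrative, and the prolog is
written in its strictly correct form (lowercase xml, quoted version number).

```python
import xml.etree.ElementTree as ET

# Shortened copy of the article example; the first line is the XML prolog.
ARTICLE = """<?xml version="1.0"?>
<article>
  <title>Simple XML</title>
  <date>April 12, 2013</date>
  <author>
    <firstName>Ruchi</firstName>
    <lastName>Kawatra</lastName>
  </author>
  <summary>XML is very easy</summary>
</article>"""

root = ET.fromstring(ARTICLE)

# The root element encloses every other element in the document.
child_tags = [child.tag for child in root]
first_name = root.findtext("author/firstName")

print(root.tag)
print(child_tags)
print(first_name)
```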

XML Naming Rules

XML elements must follow these naming rules:


Names can contain letters, numbers, and other characters
Names cannot start with a number or punctuation character
Names cannot start with the letters xml (or XML, or Xml, etc)
Names cannot contain spaces.
Any name can be used, no words are reserved.
XML is case-sensitive.
Attribute values must be given in quotation marks.
XML Namespaces
XML allows document authors to create custom elements. This extensibility can
result in naming collisions among elements in an XML document.
An XML namespace is a collection of element and attribute names. Each
namespace has a unique name that provides a means for document authors to
unambiguously refer to elements with the same name.
Example
<fruit> Water Melon </fruit>
and
<fruit> Grapes </fruit>
In both cases the element has the same name, fruit. However, in the first case the
fruit is a summer fruit and in the second case it is a winter fruit.
Namespaces can differentiate these two elements:
<summer:fruit> Water Melon </summer:fruit>
and
<winter:fruit> Grapes </winter:fruit>

Both summer and winter are namespace prefixes. A document author places a
namespace prefix and colon (:) before an element or attribute name to specify the
namespace for that element or attribute.
Each namespace prefix has a corresponding Uniform Resource Identifier (URI) that
uniquely identifies the namespace. A URI is simply a string of text for
differentiating names. A URI can refer to a document, a resource or anything on
web either by name or address.
Example of namespace

<?xml version="1.0"?>
<text:directory xmlns:text="urn:bca4:textInfo"
                xmlns:image="urn:bca4:Imageinfo">
  <text:file filename="book.xml">
    <text:description> Books for Reference </text:description>
  </text:file>
  <image:file filename="smiley.jpg">
    <image:description> Smiling Face </image:description>
    <image:size width="200" height="100" />
  </image:file>
</text:directory>

Document authors can create their own namespace prefixes using virtually any
name except the reserved namespace prefix xml.
Document authors must provide a unique URI to ensure that a namespace is unique.
In the above code, urn:bca4:textInfo and urn:bca4:Imageinfo are the URIs for the text
and image namespace prefixes respectively. Document authors commonly use
Uniform Resource Locators (URLs) as namespace URIs because the domain names in
URLs must be unique.
For instance, we can use the code
<text:directory xmlns:text="http://www.bca4.com/text-data"
                xmlns:image="http://www.bca4.com/image-data">
The parser does not visit these URLs; they simply represent a unique series of
characters for differentiating names.
To eliminate the need to place namespace prefixes in each element, document
authors may specify a default namespace for an element and its children. We
declare a default namespace by using the attribute xmlns without a prefix and
specifying the namespace URI. Once this default namespace is in place, the element
on which it is declared and its children do not need namespace prefixes to be part
of the default namespace.

<?xml version="1.0"?>
<directory xmlns="urn:bca4:textInfo" xmlns:image="urn:bca4:Imageinfo">
  <file filename="book.xml">
    <description> Books for Reference </description>
  </file>
  <image:file filename="smiley.jpg">
    <image:description> Smiling Face </image:description>
    <image:size width="200" height="100" />
  </image:file>
</directory>

The second file element uses the namespace prefix image to indicate that it is in
the urn:bca4:Imageinfo namespace, not the default namespace.
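
How a parser resolves prefixes and the default namespace can be sketched in Python
with the standard xml.etree.ElementTree module. The sketch assumes the directory
element declares urn:bca4:textInfo as its default namespace (xmlns="...") and quotes
all attribute values; the lookup prefixes t and img are arbitrary names chosen only
for the search calls, since only the URI matters.

```python
import xml.etree.ElementTree as ET

DIRECTORY = """<directory xmlns="urn:bca4:textInfo" xmlns:image="urn:bca4:Imageinfo">
  <file filename="book.xml">
    <description>Books for Reference</description>
  </file>
  <image:file filename="smiley.jpg">
    <image:description>Smiling Face</image:description>
    <image:size width="200" height="100"/>
  </image:file>
</directory>"""

root = ET.fromstring(DIRECTORY)

# ElementTree expands each name to "{namespace-URI}local-name".
tags = [child.tag for child in root]

# Prefixes used for searching are local to this mapping; only the URIs must match.
ns = {"t": "urn:bca4:textInfo", "img": "urn:bca4:Imageinfo"}
text_file = root.find("t:file", ns)
image_file = root.find("img:file", ns)
image_width = image_file.find("img:size", ns).get("width")

print(root.tag)
print(tags)
print(text_file.get("filename"), image_file.get("filename"), image_width)
```

Both children are named file, yet they remain distinguishable because their expanded
names carry different namespace URIs.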

Building blocks of XML Document


The building blocks of XML are elements and attributes.
XML Elements:
An element describes the data that it contains. Elements can contain other
elements, text and attributes. When an element definition consists of additional
elements or attributes, it is a complex type. A basic element definition consists of a
name and a data type. The following example shows how to define, in an XML Schema,
an element named quantity with the integer data type.
<xs:element name="quantity" type="xs:integer" />
Instances of the above declaration:
<quantity> 55 </quantity> <!-- valid -->
<quantity> Fifty-five </quantity> <!-- invalid -->
The two tags (start tag and end tag), taken together with the content between
them, constitute an XML element.
One of the beauties of XML is that it can be extended without breaking
applications.
Elements are referred to by their name, or element type.
Eg, <name> Ruchi Kawatra </name> <!-- name is used in the start/end tag pair -->
However, the actual element instance is both tags plus the content nested
between them. Elements can have text content, which is called Parsed
Character Data or PCDATA, or they can have other elements as their content. For
example, we can alter the name element to contain more information:
<name>
  <first> Ruchi </first>
  <last> Kawatra </last>
</name>
In the above example, we have 3 elements: a name element, which has as its content
the first element and the last element. The first and last elements contain PCDATA,
which represents the actual name of the person being stored in the name element.
An empty element can be written in two ways:
<name></name>
or <name />

XML Attribute:
An attribute is a named simple-type definition that cannot contain other elements.
Attributes can also be assigned an optional default value, and they must appear at
the bottom of complex-type definitions. If multiple attributes are declared, they may
occur in any order.
The code below declares the attribute CustDiscount (customer discount).

<xs:element name="CustInfo">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="CustomerName" type="xs:string" />
      <xs:element name="CustOrderNumber" type="xs:positiveInteger" />
      <xs:element name="OrderTotal" type="xs:number" />
    </xs:sequence>
    <xs:attribute name="CustDiscount" type="xs:number" />
  </xs:complexType>
</xs:element>

XML Tags
Tags are used to mark up elements. A start tag like <element-name> marks the
beginning of an element and an end tag like </element-name> marks the end of an
element.
Entities
Entities are variables used to define common text. Entity references are references
to entities. Entities are expanded when a document is parsed by an XML parser.
The following entities are predefined in XML:
Entity reference   Character
&lt;               <
&gt;               >
&amp;              &
&quot;             "
&apos;             '
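
The predefined entities can be tried out with a short Python sketch. It assumes the
standard library helpers xml.sax.saxutils.escape and unescape for writing entity
references and xml.etree.ElementTree for reading them back; the sample strings are
made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "Internet & World Wide <Web>"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)

# Quotation marks are only replaced when an extra mapping is supplied.
attr_safe = escape('say "hi"', {'"': "&quot;", "'": "&apos;"})

# The parser expands the entity references again while reading the document.
title = ET.fromstring(f"<title>{escaped}</title>")

print(escaped)
print(attr_safe)
print(title.text)
```

The parsed text equals the original string, which shows that entity references are
only a transport encoding for the markup characters.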

XML Components
A typical XML system consists of three types of files:
XML data is your data, plus XML tags that describe the meaning and structure of
your data.
XML schemas define rules for what can and cannot reside in your data files. For
example, a schema could ensure that users can't enter words into a date field.
XML transforms enable the use of data in a variety of programs or files. For
example, one transform could add sales data to a workbook, while another
transform could insert the same data into a document.

Some other components are


XML Base
Override the default URI of a document or any part of a document starting at a
given element.
Stylesheets in XML
Associate an XSLT transformation with an XML document, for example so that a
Web browser will format it.
XLink
A vocabulary for hypertext in XML.
xml:id
Identify an XML attribute or element as containing a name that can be used as a
unique identifier within a document.
XInclude
Include all or part of other text or XML documents, or duplicate part of the current
XML document.
XPointer
This is a framework for different ways to point into XML documents, and is used
by XLink.
XForms
A more powerful cousin to HTML forms.
XML Events, XHTML Modularization
Specifications primarily relating to the use of XML in Web browsers or other
DOM-based systems
XML Fragments
Listed here only for completeness; this specification is not in widespread use.

XML Parser
An XML parser converts an XML document into an XML DOM object which can
be manipulated with JavaScript.
All modern browsers have a built-in XML parser.
The following code parses an XML document into an XML DOM object:
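// Use XMLHttpRequest where available; fall back to ActiveX in old Internet Explorer.
var xmlhttp;
if (window.XMLHttpRequest)
{ xmlhttp = new XMLHttpRequest();
}
else
{ xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.open("GET", "books.xml", false); // false = synchronous request
xmlhttp.send();
var xmlDoc = xmlhttp.responseXML; // the parsed XML DOM object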

The following code fragment parses an XML string into an XML DOM object:
var txt = "<bookstore> <book>";
// A bare & is not well-formed XML, so it is written as &amp;.
txt = txt + "<title> Internet &amp; World Wide </title>";
txt = txt + "<author> H. M. Deitel </author>";
txt = txt + "<year> 2006 </year>";
txt = txt + "</book> </bookstore>";
var parser, xmlDoc;
if (window.DOMParser)
{ parser = new DOMParser();
  xmlDoc = parser.parseFromString(txt, "text/xml");
}
else // old Internet Explorer
{ xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async = false;
  xmlDoc.loadXML(txt);
}
An XML parser exposes an application programming interface (API): a set of
routines, protocols and tools for building software applications. A good API makes
it easier to develop a program by providing all the building blocks, which a
programmer then puts together.
Parsers form the foundation of any XML processing program. They provide a way to
access data in an XML document. Without a parser, XML documents are static text.
A parser brings documents to life, enabling a program to take action on the data
held within the structure.
An XML parser, also called an XML processor, determines the content and structure
of an XML document by combining the XML document and its DTD.
An XML parser builds a tree structure from XML documents. XML parsers exist in two
varieties: validating (enforcing DTD rules) and non-validating (ignoring DTD
rules).
[Figure: XML document + XML DTD (optional) -> XML parser -> XML application]

An XML parser provides an API through which a program accepts pieces of an XML
document.
Two standard APIs available are:
- DOM (Document Object Model)
- SAX (Simple API for XML)
Parsers are the engine behind XML and form the foundation of almost any
XML-related program.
A parser provides the programmer with an API for interacting with XML documents.
An API defines what an implementation must follow: as long as you write your
application to a particular API, any parser that implements that API will provide
the desired result.
Parsers do the following:
Validation
Assist with well-formedness checking
Building of a document tree (DOM only)
Checking the application for errors

Most parsers can handle files and streams.


Parsers ensure that a document meets all the basic requirements of well-formed
XML, such as naming convention, hierarchical structure of elements, string quoting
and entity expansion.
A parser performs the following basic functions:
Read the document.
Enforce XML syntax (well-formedness).
Enforce the document schema (validating parsers only).
Translate character encodings.
Provide methods for managing the XML document.
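
The two standard APIs, DOM and SAX, can be contrasted with a short Python sketch
that reads the same document both ways. It assumes the standard library modules
xml.dom.minidom (a DOM implementation that builds the whole tree in memory) and
xml.sax (an event-based parser). The class name TitleHandler and the sample document
are illustrative only.

```python
import xml.dom.minidom
import xml.sax

BOOKS = (
    "<bookstore><book>"
    "<title>Internet &amp; World Wide Web</title>"
    "<author>H. M. Deitel</author><year>2006</year>"
    "</book></bookstore>"
)

# DOM: parse everything into a tree first, then navigate the tree.
dom = xml.dom.minidom.parseString(BOOKS)
dom_titles = [
    "".join(child.data for child in node.childNodes)  # join the text nodes
    for node in dom.getElementsByTagName("title")
]


# SAX: the parser calls back into a handler as it meets each piece of the document.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.titles.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self.in_title = False


handler = TitleHandler()
xml.sax.parseString(BOOKS.encode("utf-8"), handler)

print(dom_titles)
print(handler.titles)
```

DOM is convenient when the program needs random access to the tree; SAX keeps
memory use low because no tree is built, which is why tree building is listed as
DOM only.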
Document Type Definition
A DTD enables an XML parser to verify whether an XML document is valid (i.e.,
its elements contain the proper attributes in the proper sequence).
A DTD allows independent user groups to check document structure and to exchange
data in a standardized format. A DTD expresses the set of rules for document
structure using an EBNF (Extended Backus-Naur Form) grammar.
A DTD has two main types of declarations: element type declarations and attribute
list declarations.
An element type declaration defines three characteristics:
The element type's name, also called the generic identifier.
Whether start and end tags are required, are forbidden or may be omitted (SGML
DTDs only; XML does not allow tag omission).
The element type's content model, or what content it can enclose.
Element type declarations begin with the keyword ELEMENT.

DTD for the document letter.xml

<!ELEMENT letter (contact+, salutation, paragraph+, closing, signature)>
<!ELEMENT contact (name, address1, address2, city, state, zip, phone, flag)>
<!ATTLIST contact type CDATA #IMPLIED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address1 (#PCDATA)>
<!ELEMENT address2 (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT flag EMPTY>
<!ATTLIST flag gender (M | F) "M">
<!ELEMENT salutation (#PCDATA)>
<!ELEMENT closing (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT signature (#PCDATA)>

The general form of an element type declaration is:

<!ELEMENT name content_model>
In the letter DTD above, the first element type declaration defines the rules for
the element letter: it contains one or more contact elements, one salutation element,
one or more paragraph elements, one closing element and one signature element.
An empty element can be declared as
<!ELEMENT br EMPTY>
In a traditional (SGML) DTD,
<!ELEMENT BR - O EMPTY>
Tag minimization is declared by two parameters, one for the start tag and one for
the end tag. Each parameter takes one of two values: a hyphen indicates that the
tag is required; an uppercase O indicates that it may be omitted. The combination
of O for the end tag and the content model EMPTY means the end tag is forbidden.
Most HTML and XHTML elements enclose content. If a content model is declared,
it is enclosed within parentheses and known as a model group.
<!ELEMENT OPTION - O (#PCDATA)>
In the letter DTD above, the declaration of contact specifies that the element
contains the child elements name, address1, address2, city, state, zip, phone and
flag. The DTD requires exactly one occurrence of each of these elements, in that
order.

OCCURRENCE INDICATOR
It is a special symbol that qualifies the element type or model group to which it is
appended, indicating how many times it occurs.
The plus sign (+) specifies that the DTD allows one or more occurrences of an
element (at least one).
The asterisk (*) indicates an optional element that can occur any number of
times.
The question mark (?) indicates an optional element that can occur at most
once. If an element does not have an occurrence indicator, the DTD allows exactly
one occurrence.

Logical Connector
A logical connector is a special symbol indicating how the content units it connects
relate to each other. There are three logical connectors and one grouping connector.
| means or -> one and only one of the connected content units must occur.
& means and -> all of the connected content units must occur, in any order (SGML
DTDs only; XML DTDs do not allow &).
, means sequence -> the connected content units must occur in the specified order.
( ) -> used to group content units together.
Eg, <!ELEMENT dl (dt|dd)+>
Thus, the content model for a definition list says that the <dl> tag must contain
either a <dt> or a <dd> tag and can contain any number of additional <dt> or <dd>
tags.
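
A content model built from names, ( ), | , the comma and the occurrence indicators behaves much like a regular expression over the list of child element names. The sketch below, in Python, makes that idea concrete. The functions model_to_regex and children_match are hypothetical helpers written for these notes; they ignore #PCDATA, EMPTY, ANY and the SGML & connector, and they are not a substitute for a validating parser.

```python
import re


def model_to_regex(model: str):
    """Translate a simple DTD content model into a compiled regex.

    Each element name becomes the group "(?:name )", a comma becomes plain
    concatenation, and | ( ) + * ? keep their regex meaning.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[()|,+*?]", model):
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # sequence is concatenation in a regex
        elif token in ")|+*?":
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(model: str, children: list) -> bool:
    """Check a list of child element names against a content model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None


print(children_match("(dt|dd)+", ["dt", "dd", "dd"]))                   # True
print(children_match("(dt|dd)+", []))                                   # False
print(children_match("(to,from,heading)", ["from", "to", "heading"]))   # False
```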

Attribute Declaration
All attribute declarations begin with the keyword ATTLIST, followed by the
element name, attribute name, attribute type, and default data information.
<!ATTLIST element-name attribute-name attribute-type default-data>
The HTML DTD is similar, as shown below

<!ATTLIST bdo
%coreattrs;
%events;
lang %LanguageCode; #IMPLIED
xml:lang %LanguageCode; #IMPLIED
dir (ltr | rtl) #REQUIRED
>
Commonly repeated attributes and values in the HTML and XHTML DTDs are factored
out into parameter entities such as %coreattrs;, which expands to the id, class, style
and title attributes.
Keyword #IMPLIED specifies that the attribute is optional. If the parser finds a
contact element without a type attribute, for example, the application can choose
an arbitrary value for the attribute or ignore the attribute, and the document will
still be valid.
Other default declarations include #REQUIRED and #FIXED. Keyword
#REQUIRED specifies that the attribute must be present in the element, and
keyword #FIXED specifies that the attribute must have the given fixed value.
Eg, <!ATTLIST address zip CDATA #FIXED "110001">
If the attribute zip is present, it must have the value 110001 for the document to be
valid. If it is omitted, the parser supplies that value.
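
The effect of a #FIXED default can be observed with Python's standard xml.etree.ElementTree module. This is a sketch under two assumptions: the address document below is invented for illustration, and the declaration is placed in an internal DTD so that the parser can read it. The standard-library parser is non-validating, so it fills in the declared value but would not reject a document that gives zip a different value.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the zip attribute is declared but not written
# on the address element itself.
DOC = """<!DOCTYPE address [
<!ELEMENT address (#PCDATA)>
<!ATTLIST address zip CDATA #FIXED "110001">
]>
<address>New Delhi</address>"""

root = ET.fromstring(DOC)

# The parser supplies the declared value for the missing attribute.
zip_value = root.get("zip")
print(zip_value)
```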

XML Keywords
Keyword CDATA specifies that the attribute value is character data. The parser does
not interpret the value as markup (apart from expanding entity references) and
passes it to the application as text.
Keyword #PCDATA specifies that the element can contain parsed character data
(i.e., text). Parsed character data should not contain markup characters, such as less
than (<), greater than (>) and ampersand (&). The document author should replace
any markup character with its corresponding entity reference, such as &lt;, &gt; or &amp;.
Keyword EMPTY specifies that the element does not contain any data. Attributes
commonly carry the data that the empty element describes. Eg, the gender attribute
of the empty element flag.
ID specifies a document-wide unique identifier.
IDREF specifies a reference to a document-wide identifier.
NAME specifies an alphabetic character string plus a hyphen and a period.
NUMBER specifies a character string containing decimal numbers.
NMTOKEN specifies an alphanumeric character string plus a hyphen and period.
NAME and NUMBER are SGML attribute types used in the HTML DTD. XML DTDs do
not define them.
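
Replacing markup characters with entity references, as described for #PCDATA above, does not have to be done by hand. As a small sketch, Python's standard xml.sax.saxutils module provides escape and unescape for exactly these three characters. The sample string is made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

raw = "if a < b && b > c"

# escape() replaces &, < and > with &amp;, &lt; and &gt;
escaped = escape(raw)
print(escaped)   # if a &lt; b &amp;&amp; b &gt; c

# unescape() reverses the replacement
restored = unescape(escaped)
print(restored == raw)   # True
```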
Parameter Entities
An entity is a macro that allows a short name to be associated with replacement
text. Parameter entities define replacement text used in DTD declarations.
Syntactically, a parameter entity is distinguished by the percent (%) symbol.
<!ENTITY % name "replacement text">
It is used in DTDs as follows
<!ENTITY % coreattrs
"id ID #IMPLIED
class CDATA #IMPLIED
style %StyleSheet; #IMPLIED
title %Text; #IMPLIED"
>
A default attribute value must be given in quotes.
Eg, enctype %ContentType; "application/x-www-form-urlencoded"
Comments
DTDs contain comments written the same way as in HTML
<!-- this is a comment -->

INTERNAL DTD
Within the body of the document type declaration we can declare all the elements
and their attributes. All the DTD declarations are placed between the square
brackets.
<?xml version="1.0"?>
<!DOCTYPE roottag [
<!ELEMENT roottag (to,from,heading)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
]>
<roottag>
<to> Hello </to>
<from> Dear </from>
<heading> XML </heading>
</roottag>
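
A document with an internal DTD can be loaded with Python's standard xml.etree.ElementTree module, as sketched below. Note the assumption: this parser only checks that the document is well formed and does not validate it against the DTD, so the order of the child elements is compared by hand with the sequence to, from, heading declared for roottag.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE roottag [
<!ELEMENT roottag (to,from,heading)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
]>
<roottag>
<to> Hello </to>
<from> Dear </from>
<heading> XML </heading>
</roottag>"""

root = ET.fromstring(DOC)

# Names of the child elements in document order.
child_names = [child.tag for child in root]

# Manual check against the declared content model (to,from,heading).
follows_model = child_names == ["to", "from", "heading"]
print(child_names, follows_model)
```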

EXTERNAL DTD
A document type definition consists of the declarations within the document type
declaration, which include entity, element and attribute declarations. With an
external DTD we do not have to copy all of these declarations into every
document. Instead, each document can refer to an external DTD file using the
following syntax.
<!DOCTYPE roottag SYSTEM "filename.dtd">
Eg, <?xml version="1.0"?>
<!DOCTYPE roottag SYSTEM "abc.dtd">
<roottag>
<to> Hello </to>
<from> Dear </from>
<heading> XML </heading>
</roottag>
Save the file as abc.xml
Create another file named abc.dtd
<!ELEMENT roottag (to,from,heading)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
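
The reference to the external DTD is stored in the document type declaration, and a program can read it back. The sketch below uses Python's standard xml.dom.minidom module. It assumes, as is the case for this standard-library parser, that the file abc.dtd is only recorded and not fetched, so the example runs even when abc.dtd does not exist.

```python
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<!DOCTYPE roottag SYSTEM "abc.dtd">
<roottag>
<to> Hello </to>
<from> Dear </from>
<heading> XML </heading>
</roottag>"""

doc = minidom.parseString(DOC)
doctype = doc.doctype

# The name in the DOCTYPE must match the root element of the document.
print(doctype.name)                 # roottag
print(doc.documentElement.tagName)  # roottag

# The system identifier is the file named after the SYSTEM keyword.
print(doctype.systemId)             # abc.dtd
```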

Why use a DTD?


XML provides an application-independent way of sharing data. With a DTD,
independent groups of people can agree to use a common DTD for interchanging
data. An application can use the DTD to verify that the data it receives from the
outside world is valid. We can also use a DTD to verify our own data.

XML Schema
An XML Schema describes the structure of an XML document. An XML schema is
a description of a type of XML document, typically expressed in terms of
constraints on the structure and content of documents of that type, above and
beyond the basic syntactical constraints imposed by XML itself. These constraints
are generally expressed using some combination of grammatical rules governing the
order of elements, Boolean predicates that the content must satisfy, data types
governing the content of elements and attributes, and more specialized rules such as
uniqueness and referential integrity constraints.

There are languages developed specifically to express XML schemas. The
Document Type Definition (DTD) language, which is native to the XML
specification, is a schema language of relatively limited capability, but it
also has other uses in XML aside from the expression of schemas. Two more
expressive XML schema languages in widespread use are XML Schema (capital S)
and RELAX NG.
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XML Schema Documents
Programs cannot manipulate DTDs in the same manner as XML documents, because
DTDs are not themselves written in XML. These and other limitations have led to
the development of schemas.
Unlike DTDs, schemas do not use EBNF grammar. Instead, schemas use XML syntax
and are actually XML documents that programs can manipulate.
Like DTDs, schemas require validating parsers.
A DTD describes an XML document's structure, not the content of its elements. Eg,
<quantity> 5 </quantity> contains character data.
If the document that contains quantity references a DTD, an XML parser can
validate the document to confirm that this element contains PCDATA content, but
the parser cannot validate that the content is numeric. DTDs do not provide such a
capability. The parser therefore also considers markup such as
<quantity> Great </quantity> to be valid.
XML Schema enables schema authors to specify that element quantity's data must
be numeric. In validating the XML document against this schema, the parser can
determine that 5 conforms and Great does not. An XML document that conforms to
a schema document is schema valid, and a document that does not conform is
invalid.
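
The difference between the two quantity elements can be sketched in a few lines of Python. This is only a rough stand-in for what a schema-validating parser does, not an implementation of any XML Schema data type: the function name is_numeric is hypothetical, and Python's Decimal accepts some spellings (for example exponents) that a schema type may not.

```python
from decimal import Decimal, InvalidOperation


def is_numeric(text: str) -> bool:
    """Rough check that element content is a finite decimal number."""
    try:
        return Decimal(text.strip()).is_finite()
    except InvalidOperation:
        return False


# A DTD accepts both of these as #PCDATA; a numeric type accepts only the first.
print(is_numeric(" 5 "))      # True
print(is_numeric(" Great "))  # False
```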

Code for book.xml


<deitel:books xmlns:deitel="http://www.deitel.com/booklist">
<book>
<title> XML How to Program </title>
</book>
<book>
<title> C How to Program </title>
</book>
<book>
<title> C++ How to Program </title>
</book>
<book>
<title> Perl How to Program </title>
</book>
</deitel:books>
Code of book.xsd -> XML schema document
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:deitel="http://www.deitel.com/booklist"
targetNamespace="http://www.deitel.com/booklist">
<element name="books" type="deitel:BooksType"/>
<complexType name="BooksType">
<sequence>
<element name="book" type="deitel:SingleBookType" minOccurs="1"
maxOccurs="unbounded"/>
</sequence>
</complexType>
<complexType name="SingleBookType">
<sequence>
<element name="title" type="string"/>
</sequence>
</complexType>
</schema>

There are two programs: the first shows a schema-valid XML document named
book.xml, and the second shows the corresponding XML Schema document named
book.xsd, which defines the structure for book.xml.
Although schema authors can use virtually any filename extension, schemas
commonly use the .xsd extension.
In the first code, the books element must have the namespace prefix deitel because the
books element is part of the http://www.deitel.com/booklist namespace.
XML Schema documents always use the standard namespace URI
http://www.w3.org/2001/XMLSchema.
The root element schema contains elements that define the XML document structure. In
the second code, the xmlns:deitel attribute binds the URI
http://www.deitel.com/booklist to the namespace prefix deitel. The targetNamespace
attribute specifies the namespace of the XML vocabulary that this schema defines.
The element tag defines an element to be included in the XML document structure.
That is, it specifies the actual elements that can be used to mark up data. Attributes
name and type specify the element's name and data type respectively.

Possible data types include XML Schema-defined types, eg, string and double, and
user-defined types, eg, BooksType.
XML Schema provides a large number of built-in simple types, such as date for
dates, int for integers, double for floating-point numbers and time for times. An
element's data type indicates the data that the element may contain.
The two categories of data types in XML Schema are simple types and complex types.
Simple types cannot contain attributes or child elements, whereas complex types can.
In the above example, books is defined as an element of data type deitel:BooksType.
BooksType is a user-defined type in the http://www.deitel.com/booklist namespace
and therefore must have the namespace prefix deitel.
We have also used the element complexType to define BooksType as a complex type
that has a child element named book.
The sequence element allows a programmer to specify the sequential order in which
child elements must appear.
Attribute minOccurs="1" specifies that elements of type BooksType must contain
a minimum of one book element.
Attribute maxOccurs, with the value unbounded, specifies that elements of type
BooksType may have any number of book child elements.
There is also a complex type called SingleBookType, in which we define the element
title to be of the simple type string.
The closing schema tag marks the end of the XML Schema document.
Every simple type defines a restriction on a built-in schema data type, eg, string,
float or time, or a restriction of a user-defined data type. A restriction limits the
possible values an element can hold.
Complex types are divided into two groups: those with simple content and those with
complex content. Both can contain attributes, but only complex content can have
child elements.
Complex types with simple content must extend or restrict a base type. Complex types
with complex content do not have this limitation.
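
The rules that book.xsd places on book.xml can be checked by hand with Python's standard xml.etree.ElementTree module, as sketched below. This is not schema validation (the standard library has no XML Schema validator). It only illustrates three points from the text: books lives in the deitel namespace, at least one book child is required, and each book holds one title. The document is a shortened copy of book.xml.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<deitel:books xmlns:deitel="http://www.deitel.com/booklist">
<book><title> XML How to Program </title></book>
<book><title> C How to Program </title></book>
</deitel:books>"""

NS = "http://www.deitel.com/booklist"
root = ET.fromstring(BOOK_XML)

# ElementTree writes a namespaced name as {namespace-URI}local-name.
in_namespace = root.tag == "{" + NS + "}books"

# minOccurs="1", maxOccurs="unbounded": one or more book children.
books = root.findall("book")
enough_books = len(books) >= 1

# SingleBookType: each book contains exactly one title.
one_title_each = all(len(b.findall("title")) == 1 for b in books)

titles = [b.findtext("title").strip() for b in books]
print(in_namespace, enough_books, one_title_each, titles)
```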

Mathematical Markup Language


XML allows authors to create their own tags to describe data precisely. Many
people and organizations in various fields of study have created many different
kinds of XML for structuring data. Some of these markup languages are MathML,
SVG, WML, XUL and PDML.
MathML is the one the W3C developed for describing mathematical
notations and expressions.
One application that can parse and render MathML is the W3C's Amaya
browser/editor, which can be downloaded from
www.w3.org/Amaya/User/BinDist.html
MathML markup describes mathematical expressions for display.
MathML is divided into two types of markup: content markup and presentation
markup.
Content markup provides tags that embody mathematical concepts.
Content MathML allows programmers to write mathematical notations specific to
different areas of mathematics. For eg, multiplication has one meaning in set theory
and another in linear algebra.
Programmers can take content MathML markup, discern the mathematical context and
evaluate the marked-up mathematical operations.
Presentation MathML is directed towards formatting and displaying mathematical
notations.
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Code 1 -->
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
"http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow>
<mn> 2 </mn>
<mo> + </mo>
<mn> 3 </mn>
<mo> = </mo>
<mn> 5 </mn>
</mrow>
</math>
<?xml version="1.0"?>
<!-- Code 2 -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head> <title> Calculus MathML Example </title> </head>
<body>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow>
<msubsup>
<mo>&Integral;</mo>
<mn>0</mn>
<mrow>
<mn> 1 </mn>
<mo> - </mo>
<mi> y </mi>
</mrow>
</msubsup>

<msqrt>
<mrow>
<mn> 4 </mn>
<mo>&InvisibleTimes; </mo>
<msup>
<mi> x </mi>
<mn> 2 </mn>
</msup>
<mo> + </mo>
<mi> y </mi>
</mrow>
</msqrt>
<mo> &delta; </mo>
<mi> x </mi>
</mrow>
</math>
</body>
</html>
The mrow element is a container element for expressions that contain more than
one element. In Code 1, the mrow element contains five children.
The mn element marks up a number.
The mo element marks up an operator.
The entity reference &InvisibleTimes; indicates a multiplication operation without
explicit symbolic representation, i.e., no multiplication sign between 4 and x.
The msup element represents a superscript.
The msub element represents a subscript.
To display variables such as x and y, use the identifier element mi.
To display a fraction, use the element mfrac.
The entity &Integral; represents the integral symbol.
The element msubsup specifies the superscript and subscript of the integral symbol.
Element msqrt represents a square root expression.
The entity &delta; represents a lowercase delta symbol. Delta is an operator.
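
The structure of the first MathML example (2 + 3 = 5) can be inspected with Python's standard xml.etree.ElementTree module. The sketch below is a reading aid only: it leaves out the DOCTYPE line of the original, and the prefix m is just a local label chosen here for the MathML namespace.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

CODE_1 = """<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow>
<mn> 2 </mn> <mo> + </mo> <mn> 3 </mn> <mo> = </mo> <mn> 5 </mn>
</mrow>
</math>"""

math = ET.fromstring(CODE_1)
mrow = math.find("m:mrow", {"m": MATHML_NS})

# Element kinds in order: numbers (mn) and operators (mo).
kinds = [child.tag.split("}")[1] for child in mrow]

# The expression that the markup describes.
expression = " ".join(child.text.strip() for child in mrow)

print(len(mrow), kinds, expression)
```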
Displaying XML using Style Sheets
There are two types of style sheets for displaying XML: Cascading Style Sheets
(CSS) and Extensible Stylesheet Language (XSL).
A style sheet is a set of instructions that tells a browser how to display a particular
type of XML element.
A CSS without an XML document is like a beautiful container without any content
in it.
Code:
<?xml version="1.0"?>
<!DOCTYPE play SYSTEM "play.dtd">
<?xml-stylesheet type="text/css" href="play.css"?>
<play>
<fm> Radio 92.5 </fm>
<title> Mirchi </title>
</play>
Save the file as play.xml and open it in Internet Explorer.
@import is CSS syntax, so it belongs inside a style sheet rather than in the XML
document itself. A style sheet can load play.css with
@import url(play.css);

File play.css
play
{display:block;
font-family: Arial;
color: green;
text-align: center}
fm
{display:block;
font-family: "Times New Roman";
color: blue;
text-align: left}
title
{display:block;
font-family: sans-serif;
color: yellow;
text-align: right}

File play.dtd
<!ELEMENT play (fm,title)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT fm (#PCDATA)>

Block-level elements are separated from other block-level elements, generally by
a line break. In HTML, <p>, <blockquote>, <h1> and <hr> are all examples of
block-level elements. The declaration that gives an XML element this behaviour is
display:block.
The display property tells the rendering engine how to display the element. It also
makes it possible not to display an element that is part of the XML
document (display:none).
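
The link between play.xml and play.css is carried by the xml-stylesheet processing instruction. As a sketch, Python's standard xml.dom.minidom module can read that instruction back. The DOCTYPE line is left out here to keep the example short, and the regular expression used to pick out href is a simplification that assumes double-quoted values.

```python
import re
from xml.dom import minidom

PLAY_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="play.css"?>
<play>
<fm> Radio 92.5 </fm>
<title> Mirchi </title>
</play>"""

doc = minidom.parseString(PLAY_XML)

# Processing instructions before the root element are children of the document.
pis = [node for node in doc.childNodes
       if node.nodeType == node.PROCESSING_INSTRUCTION_NODE]

target = pis[0].target   # name of the processing instruction
data = pis[0].data       # everything after the name, as one string

# Pick the style sheet file out of the pseudo-attribute href="...".
href = re.search(r'href="([^"]*)"', data).group(1)
print(target, href)
```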

Embedding XML in an HTML file, say hello.html


<html>
<body>
<xml id="xmldoc">
<articles>
<article>
<title> XML </title>
<date> Apr 30 </date>
</article>
<article>
<title> Java Script </title>
<date> May 1 </date>
</article>
</articles>
</xml>
<table border="1" datasrc="#xmldoc">
<thead>
<tr>
<th> Sample </th>
<th> Date </th>
</tr>
</thead>
<tr>
<td> <span datafld="title"> </span> </td>
<td> <span datafld="date"> </span> </td>
</tr>
</table>
</body>
</html>
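
The xml element (a "data island") and the datasrc and datafld attributes are Internet Explorer features, so other browsers ignore the binding. What the binding does is repeat the table row once per article and fill each cell from the named field. The Python sketch below imitates that step outside the browser. The function table_rows is a hypothetical helper written for these notes, and it uses only the standard library.

```python
import xml.etree.ElementTree as ET
from html import escape

ARTICLES = """<articles>
<article><title> XML </title><date> Apr 30 </date></article>
<article><title> Java Script </title><date> May 1 </date></article>
</articles>"""


def table_rows(xml_text: str) -> list:
    """Build one HTML table row per article element."""
    root = ET.fromstring(xml_text)
    rows = []
    for article in root.findall("article"):
        title = article.findtext("title", default="").strip()
        date = article.findtext("date", default="").strip()
        rows.append(f"<tr><td>{escape(title)}</td><td>{escape(date)}</td></tr>")
    return rows


for row in table_rows(ARTICLES):
    print(row)
```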
