
Database management systems (DBMS)

1.1. What is a Database?

A brief definition might be:

A store of information, held over a period of time, in computer-readable form.

Let us examine the parts of this definition in more detail.

1.1.1. A store of information .....
Typical examples of information stored for some practical purpose are:

Information collected for the sake of making a statistical analysis, e.g. the national
census, or a survey of cracks in a stretch of motorway.
Textual material required for information retrieval e.g. technical abstracts, statutory or
other regulations. Currently there is some interest in extending such data bases in the
direction of intelligent knowledge-based systems (IKBS) where rules for interpretation
by an expert user are included along with the information itself.
Operational and administrative information required for running an organisation. In a
commercial concern this will take the form of stock records, personnel records,
customer records, among others.

The three main examples given here are of STATIC, GROWING and DYNAMIC
databases respectively.

Note that nothing was said in the definition about the quantity of information being
held. Although many of the benefits associated with using a database are due to
economies of scale, a small database may be very worthwhile (for instance to the
secretary of the local sports club) if the information is to be processed frequently and in
a repetitive manner.

1.1.2. ..... held over a period of time.

This part of the definition goes without saying in most people's minds but it is worth
dwelling on it for a minute. Because of the investment involved in setting up a
database, the expectation must be that it will continue to be useful, over years rather
than months. But the relationship with time varies from one type of information to
another.

Census information is collected on a particular date and stored as a snapshot of the state
of affairs when the survey was taken. Information from later observations will be kept
quite separately, but appropriate comparisons may be made provided that the
framework remains consistent.
Bibliographic or other textual databases are accumulated over time - new material is
added periodically but probably very little will be removed. When designing such a
database it will be important to estimate and allow for the expected rate of growth, and
perhaps to ensure that the more recent information is given some priority.
An organisational database may not change very drastically in size, but it will be
subject to frequent updating (deletions, amendments, insertions) following relevant
actions within the organisation itself. Ensuring the accuracy, efficiency and security of
this process is the main concern of many database designers and administrators.

1.1.3. ..... in computer-readable form.

Information (often referred to in this context as data) has been processed by computer
for over 30 years, using a variety of storage media. Some form of magnetic disc is
likely to be used, since discs currently provide the most cost-effective way of holding
large quantities of data while allowing fast access to any individual item. Other
methods are obviously under development, notably optical storage - CD-ROM - which
as yet does not give enough scope for updating in most database applications.

Database handling techniques grew out of earlier and simpler file processing
techniques. A file consists of an ordered collection of records; a database consists of
two or more related files which we may wish to process together in various different
ways. It will store not only the individual records containing the numbers or words
needed for some application, but auxiliary information which will allow those records
to be accessed more quickly, or which will link related records or data items together. A
database designer may be required to choose how much and what sort of auxiliary
information to store, using his knowledge of how the database will be used.

Computer storage and processing implies the use of software: in the current context a
database management system (DBMS), whose function is to
store and retrieve information as required by applications programs or users sitting at
terminals, using the facilities provided by the computer operating system. It is one of a
number of software layers making computer facilities available to users with perhaps
comparatively little technical expertise.

1.2. Summary of DBMS functions.

1.2.1. Data definition.

This includes describing:

RELATIONSHIPS between records of different types
Extra information to make searching efficient, e.g. INDEXES.
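
As a minimal sketch of data definition, the fragment below uses Python's built-in sqlite3 module as a stand-in DBMS. The table, column and index names (customer, sales_order, idx_order_customer) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.executescript("""
    -- One record type per table; each field has a name and a type.
    CREATE TABLE customer (
        customer_no INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- The foreign key records a RELATIONSHIP between the two record types.
    CREATE TABLE sales_order (
        order_no    INTEGER PRIMARY KEY,
        customer_no INTEGER NOT NULL REFERENCES customer(customer_no),
        order_date  TEXT NOT NULL
    );
    -- Extra information to make searching by customer efficient.
    CREATE INDEX idx_order_customer ON sales_order(customer_no);
""")

# The definitions are themselves stored by the DBMS and can be listed.
defined = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'index') ORDER BY name")]
print(defined)
```

Other systems use different syntax for the same three kinds of description, so this should be read as one possible notation rather than a standard.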

1.2.2. Data entry and validation.

Validation may include:


In an interactive data entry system, errors should be detected immediately - some can
be prevented altogether by keyboard monitoring - and recovery and re-entry permitted.
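
One way a DBMS can detect bad input at the moment of entry is through declared constraints. The sketch below assumes a hypothetical sports-club member table and uses Python's sqlite3 module; the range chosen for age is an arbitrary example, not a rule from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member (
        member_no INTEGER PRIMARY KEY,
        surname   TEXT NOT NULL CHECK (length(surname) > 0),
        age       INTEGER NOT NULL CHECK (age BETWEEN 0 AND 120)
    )
""")

def enter_member(member_no, surname, age):
    """Try to store one record; return None on success or an error message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO member VALUES (?, ?, ?)",
                         (member_no, surname, age))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

first_try = enter_member(1, "Smith", 230)   # rejected: age out of range
second_try = enter_member(1, "Smith", 23)   # corrected re-entry is accepted
stored = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]
```

The rejected attempt leaves nothing behind in the table, so the operator can simply correct the value and enter the record again.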

1.2.3. Updating.

Updating involves:

At the same time any background data such as indexes or pointers from one record to
another must be changed to maintain consistency. Updating may take place
interactively, or by submission of a file of transaction records; handling these may
require a program of some kind to be written, either in a conventional programming
language (a host language, e.g. COBOL or C) or in a language supplied by the DBMS
for constructing command files.
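
Updating from a file of transaction records might be sketched as follows, with Python acting as the host language and sqlite3 as the DBMS. The stock table and the (action, part_no, quantity) record layout are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (part_no TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.executemany("INSERT INTO stock VALUES (?, ?)", [("P1", 10), ("P2", 5)])
conn.commit()

# Stand-in for a transaction file: (action, part_no, quantity).
transactions = [
    ("insert", "P3", 7),
    ("amend",  "P1", 12),
    ("delete", "P2", None),
]

def apply_transactions(records):
    """Apply every record in one transaction: all of them or none."""
    with conn:
        for action, part_no, qty in records:
            if action == "insert":
                conn.execute("INSERT INTO stock VALUES (?, ?)", (part_no, qty))
            elif action == "amend":
                conn.execute("UPDATE stock SET qty = ? WHERE part_no = ?", (qty, part_no))
            elif action == "delete":
                conn.execute("DELETE FROM stock WHERE part_no = ?", (part_no,))
            else:
                raise ValueError(f"unknown action: {action}")

apply_transactions(transactions)
stock_after = conn.execute("SELECT part_no, qty FROM stock ORDER BY part_no").fetchall()
```

The index that supports the primary key is kept consistent by the DBMS itself; the program never touches it.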

1.2.4. Data retrieval on the basis of selection criteria.

For this purpose most systems provide a QUERY LANGUAGE with which the
characteristics of the required records may be specified. Query languages differ
enormously in power and sophistication but a standard which is becoming increasingly
common is based on the so-called RELATIONAL operations. These allow:

selection of records on the basis of particular field values.

selection of particular fields from records to be displayed.
linking together records from two different files on the basis of matching field values.

Arbitrary combinations of these operators on the files making up a database can answer
a very large number of queries without requiring users to process one record at a time.
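
The three operations can be illustrated in SQL, run here through Python's sqlite3 module. The employee and department tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, dept_no INTEGER);
    CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
    INSERT INTO department VALUES (10, 'Sales'), (20, 'Research');
    INSERT INTO employee VALUES (1, 'Adams', 10), (2, 'Baker', 20), (3, 'Clark', 10);
""")

# Selection: choose records on the basis of a field value.
selected = conn.execute(
    "SELECT * FROM employee WHERE dept_no = 10 ORDER BY emp_no").fetchall()

# Projection: choose particular fields from each record.
projected = conn.execute("SELECT name FROM employee ORDER BY emp_no").fetchall()

# Join: link records from two files on matching field values,
# combined here with a selection and a projection in one query.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_no = d.dept_no
    WHERE d.dept_name = 'Sales'
    ORDER BY e.emp_no
""").fetchall()
```

In each case the result is itself a table, which is what allows the operations to be combined freely.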

1.2.5. Report definition.

Most systems provide facilities for describing how summary reports from the database
are to be created and laid out on paper. These may include obtaining:

over particular CONTROL FIELDS. Also specification of PAGE and LINE LAYOUT,
HEADINGS, PAGE-NUMBERING, and other narrative to make the report
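
A small report with summary figures over a control field could be sketched like this. It assumes that the summaries wanted are counts and totals, and it uses a hypothetical sale table with region as the control field; the layout function format_report is likewise invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (region TEXT, amount INTEGER);
    INSERT INTO sale VALUES ('North', 100), ('North', 250), ('South', 75);
""")

# 'region' plays the part of the control field: one summary line per value.
rows = conn.execute("""
    SELECT region, COUNT(*), SUM(amount)
    FROM sale GROUP BY region ORDER BY region
""").fetchall()

def format_report(rows):
    """Lay the summary out with a heading, column titles and a grand total."""
    lines = ["SALES BY REGION", f"{'Region':<10}{'Count':>6}{'Total':>8}"]
    for region, count, total in rows:
        lines.append(f"{region:<10}{count:>6}{total:>8}")
    lines.append(f"{'ALL':<10}{sum(r[1] for r in rows):>6}{sum(r[2] for r in rows):>8}")
    return "\n".join(lines)

report = format_report(rows)
print(report)
```

A report generator supplied with a DBMS would normally let the same layout be described declaratively instead of being programmed.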

1.2.6. Security.

This has several aspects:

Ensuring that only those authorised to do so can see and modify the data, generally by
some extension of the password principle.
Ensuring the consistency of the database where many users are accessing and updating
it simultaneously.
Ensuring the existence and INTEGRITY of the database after hardware or software
failure. At the very least this involves making provision for back-up and re-loading.
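
The back-up and re-loading step can be sketched with the backup method of Python's sqlite3 connections. Both databases are held in memory here purely for convenience, and the account table is hypothetical; a real installation would copy to separate physical storage.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER)")
live.execute("INSERT INTO account VALUES (1, 500)")
live.commit()

# Take a back-up copy while the live database is in a known good state.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate a failure that destroys the live data.
live.execute("DELETE FROM account")
live.commit()

# Re-load: copy the back-up over the damaged database.
backup.backup(live)
restored = live.execute("SELECT acc_no, balance FROM account").fetchall()
```

Anything entered after the back-up was taken would be lost in this simple scheme, which is why it is described above as the very least that must be provided.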

1.3. Why have a database (and a DBMS)?

An organisation uses a computer to store and process information because it hopes for
speed, accuracy, efficiency, economy etc. beyond what could be achieved using clerical
methods. The objectives of using a DBMS must in essence be the same although the
justifications may be more indirect.

Early computer applications were based on existing clerical methods and stored
information was partitioned in much the same way as manual files. But the computer's
processing speed gave a potential for RELATING data from different sources to
produce valuable management information, provided that some standardisation could
be imposed over departmental boundaries. The idea emerged of the integrated database
as a central resource. Data is captured as close as possible to its point of origin and
transmitted to the database, then extracted by anyone within the organisation who
requires it. However many provisos have become attached to this idea in practice, it
still provides possibly the strongest motivation for the introduction of a DBMS in large
organisations. The idea is that any piece of information is entered and stored just once,
eliminating duplications of effort and the possibility of inconsistency between different
departmental records.

Other advantages relate to the task of running a conventional Data Processing (DP)
department. Organisational requirements change over time, and applications programs
laboriously developed need to be periodically adjusted. A DBMS gives some protection
against change by taking care of basic storage and retrieval functions in a standard way,
leaving the applications developer to concentrate on specific organisational
requirements. Changes in one of these areas need not have repercussions elsewhere. In
general a DBMS is a substantial piece of software, the result of many man-years of
effort. Because its development costs are spread over a number of purchasers it can
probably provide more facilities than would be economic in a one-off product.

The points discussed above are probably most relevant to the larger organisation using
a DBMS for its administrative functions - the environment in which the idea of
databases first originated. In other contexts the convenience of a DBMS may be the
primary consideration. The purchaser of a small business computer needs all the
software to run it in package form, written so that the minimum of expertise is required
to use it. The same applies to departments (e.g. Research & Development) with special
needs which cannot be satisfied by a large centralised system. When comparing
database management systems it is obvious that some are designed in the expectation
that professional DP staff will be available to run them, while others are aimed at the
total novice.

There are of course costs associated with adopting a DBMS. Actual monetary costs
vary widely from, for instance, a large multi-user Oracle system to a small PC-based
filing system. In the first case the charge will cover support, some training, extensive
documentation and the provision of periodical upgrades to the software; in the second
case the purchaser will be on his own with the manual. But there is also a tendency for
the cost of software to reflect the cost of the hardware on which it is run!

Probably the main cost associated with acquiring a DBMS is due to the work involved
in designing and implementing systems to use it. In order to provide a general and
powerful set of facilities for its users any DBMS imposes restraints on the way
information can be described and accessed, and demands familiarity with the DATA
MODEL which it supports and the command language which it provides to define and
manipulate data. Data models still in use are HIERARCHICAL (tree-structured),
NETWORK and RELATIONAL (tabular). Of these the last is the current favourite,
providing a good basis for high-level query languages and giving scope for the
exploitation of special-purpose hardware in efficient large-scale data handling.

This course will concentrate on the RELATIONAL model.

1.4. Data Base Project Development.

The conventional SYSTEMS LIFE CYCLE consists of:

Analysis
Design
Development
Implementation
Maintenance

In practice these phases are not always sharply distinguished; for small projects it may
not be necessary to go formally through every one. The move from one phase to the
next is essentially a move from the general to the specific. At each stage, particularly
where a DBMS is involved, we shall be concerned both with information and with
processes to be performed using that information.
1.4.1. Analysis
The outputs from this stage should be:

A CONCEPTUAL DATA MODEL describing the information which is used within the
organisation but not in computer-related terms. This level of data analysis will be
considered in more detail later. One of the problems with any systems design in a large
organisation is that it must proceed in a piecemeal manner - it is impossible to create a
totally new GLOBAL system in one fell swoop, and each sub-system must dovetail
with others which may be at quite a different stage of development. The conceptual
data model provides a context within which more detailed design specifications can be
produced, and should help in maintaining consistency from one application area to
another.
A CONCEPTUAL PROCESS MODEL describing the functions of the organisation in terms of
events (e.g. a purchase, a payment, a booking) and the processes which must be performed within
the organisation to handle them. This may lead to a more detailed functional specification -
describing the organisational requirements which must be satisfied, but not how they are to be met.

1.4.2. Design
This stage should produce:

A LOGICAL DATA MODEL: a description of the data to be stored in the database, using
the conventions prescribed by the particular DBMS to be used. This is sometimes
referred to as a SCHEMA and some DBMSs also give facilities for defining
SUB-SCHEMAS, or partitions of the overall schema. Logical data models supported by
present day DBMSs will be considered later.

A SYSTEM SPECIFICATION, describing in some detail what the proposed system should
do. This will now refer to COMPUTER PROCESSES, but probably in terms of INPUT
and OUTPUT MESSAGES rather than internal logic, describing, for instance, the
effect of selecting an item from a menu, or any option within a command driven
system. Program modules are defined in terms of the screen displays and/or reports
which they generate. Note that the data referred to here has a temporary existence, in
contrast with what is stored in the database itself.

1.4.3. Development.
Specification of the database itself must now come down another level, to decisions
about PHYSICAL DATA STORAGE in particular files on particular devices. For this a
knowledge of the computer operating system, as well as the DBMS, is required.
Conventional program development - coding, testing, debugging etc. may also be done.
If a totally packaged system has been purchased this may not be necessary - it will
simply be a matter of discovering how to use the command and query language already
supplied to store and retrieve data, generate reports and other outputs. Even here an
element of testing and debugging may be involved, since it is unlikely that the new user
of a system will get it exactly right the first time. It is certainly inadvisable for this sort
of experimentation to take place using a live database!

1.4.4. Implementation.
This puts the work of the previous three phases into everyday use. It involves such
things as loading the database with live rather than test data, staff training, probably the
introduction of new working practices. It is not unusual to have an old and a new
system running side by side for a while so that some back-up is available if the new
system fails unexpectedly.

1.4.5. Maintenance.
Systems once implemented generally require further work done on them as time goes
by, either to correct original design faults or to accommodate changes in user
requirements or operating constraints. One of the objectives of using a DBMS is to
reduce the impact of such changes - for example the data can be physically re-arranged
without affecting the logic of the programs which use it. Some DBMSs provide utility
programs to re-organise the data when either its physical or logical design must be
changed.


The relational model.

The relational model consists of three components:

1. A STRUCTURAL component consisting of a set of TABLES (also called
RELATIONS).
2. A MANIPULATIVE component consisting of a set of high-level
operations which act upon and produce whole tables.
3. A SET OF RULES for maintaining the INTEGRITY of the
database.

The terminology associated with relational database theory originates
from the branch of mathematics called set theory, although there are
widely used synonyms for these precise, mathematical terms.

Data structures are composed of two components which represent a
model of the situation being considered. These are (i) ENTITY
TYPES - i.e. data group types, and (ii) the RELATIONSHIPS
between the entity types.
Entity types are represented by RELATIONS or BASE TABLES.
These two terms are interchangeable - a RELATION is the
mathematical term for a TABLE.
A base table is loosely defined as an un-ordered collection of zero,
one or more TUPLES (ROWS) each of which consists of one or more
un-ordered ATTRIBUTES (COLUMNS). All tuples are made up of
exactly the same set of attributes.

For the remainder of this discussion we shall use the more widely
known terminology: TABLE for relation, ROW for tuple and COLUMN
for attribute.


Each column is drawn from a DOMAIN, that is, a set of values from
which the actual values are taken (e.g. a set of car model names).
More than one column in a table may draw its values from the same
domain.
A column entry in any row is SINGLE-VALUED, i.e. it contains
exactly one item only (e.g. a surname). Repeating groups, i.e.
columns which contain sets of values rather than a single value, are
not allowed.
Each row of a table is uniquely identified by a PRIMARY KEY
composed of one or more columns. This implies that a table may not
contain duplicate rows.
Note that, in general, a column, or group of columns, that uniquely
identifies a row in a table is called a CANDIDATE KEY. There may
be more than one candidate key for a particular table; one of these
will be chosen as the primary key.
The ENTITY INTEGRITY RULE of the model states that no
component of the primary key may contain a NULL value.
A column, or combination of columns, that matches the primary key
of another table is called a FOREIGN KEY.
The REFERENTIAL INTEGRITY RULE of the model states that, for
every foreign key value in a table there must be a corresponding
primary key value in another table in the database.
Only two kinds of table may be defined in a SQL schema: BASE
TABLES and VIEWS.
Other tables, UNNAMED RELATIONS, may be derived from these
by means of relational operations such as JOINS.
All tables are LOGICAL ENTITIES. Of these only base tables
physically exist in that there exist physically stored records, and
possible physical access paths such as indexes, in one or more stored
files, that directly support the table in physical storage. Although
standard techniques such as HASHING, INDEXING, etc. will be
used for implementation efficiency, the user of the data base should
require no knowledge of previously defined access paths.
Views and the results of all operations on tables - unnamed relations -
are tables that exist as LOGICAL DEFINITIONS, in terms of a view
definition, or a [SELECT .. FROM .. WHERE .. ORDER BY] statement.
The term UPDATE has two meanings:
(i) as a SQL operation in its own right which causes one or more
columns in a table to be altered; in this context it will always be
shown in upper-case letters - UPDATE.
(ii) as a generic term used to include the SQL operations INSERT,
DELETE and UPDATE; in this context it will always be shown in
lower-case letters.
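
The key and integrity rules of the relational model described in this section can be tried out with a small sketch, again using Python's sqlite3 module. The car and car_model tables, the sample values and the helper rejected are hypothetical. Note that SQLite needs NOT NULL stated explicitly on a text primary key and needs foreign key checking switched on; other DBMSs may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request
conn.executescript("""
    CREATE TABLE car_model (
        model_name TEXT NOT NULL PRIMARY KEY      -- primary key, never NULL
    );
    CREATE TABLE car (
        reg_no     TEXT NOT NULL PRIMARY KEY,
        model_name TEXT NOT NULL REFERENCES car_model(model_name)  -- foreign key
    );
    INSERT INTO car_model VALUES ('Escort');
    INSERT INTO car VALUES ('A123 XYZ', 'Escort');
    -- A view is stored only as a definition, not as rows.
    CREATE VIEW escorts AS SELECT reg_no FROM car WHERE model_name = 'Escort';
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_row = rejected("INSERT INTO car VALUES ('A123 XYZ', 'Escort')")   # primary key
null_key      = rejected("INSERT INTO car VALUES (NULL, 'Escort')")         # entity integrity
dangling_ref  = rejected("INSERT INTO car VALUES ('B456 ABC', 'Unknown')")  # referential integrity
view_rows     = conn.execute("SELECT reg_no FROM escorts").fetchall()
```

All three faulty insertions are refused, and the view returns its row by evaluating its definition against the base table at the time of the query.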
Data Base Management System - DBMS

What is a DBMS? What are the different types of DBMS? Compare three types of DBMS.

A database management system (DBMS) is computer software designed for the purpose of managing
databases. Typical examples of DBMSs include Oracle, DB2, Microsoft Access, Microsoft SQL Server,
PostgreSQL, MySQL, FileMaker and Sybase Adaptive Server Enterprise. DBMSs are typically used by
database administrators in the creation of database systems.

A DBMS is a complex set of software programs that controls the organization, storage, management, and
retrieval of data in a database. A DBMS includes:

1. A modeling language to define the schema of each database hosted in the DBMS, according to the
DBMS data model.

The four most common types of organizations are the hierarchical, network, relational and object
models. Inverted lists and other methods are also used.

A given database management system may provide one or more of the four models. The optimal
structure depends on the natural organization of the application's data, and on the application's
requirements (which include transaction rate (speed), reliability, maintainability, scalability, and cost).

The dominant model in use today is the ad hoc one embedded in SQL, despite the objections of purists
who believe this model is a corruption of the relational model, since it violates several of its fundamental
principles for the sake of practicality and performance. Many DBMSs also support the Open Database
Connectivity API that supports a standard way for programmers to access the DBMS.

2. Data structures (fields, records, files and objects) optimized to deal with very large amounts of data
stored on a permanent data storage device (which implies relatively slow access compared to volatile
main memory).

3. A database query language and report writer to allow users to interactively interrogate the database,
analyze its data and update it according to the user's privileges on the data.

It also controls the security of the database.

Data security prevents unauthorized users from viewing or updating the database. Using passwords, users
are allowed access to the entire database or subsets of it called subschemas. For example, an employee
database can contain all the data about an individual employee, but one group of users may be authorized
to view only payroll data, while others are allowed access to only work history and medical data.

If the DBMS provides a way to interactively enter and update the database, as well as interrogate it, this
capability allows for managing personal databases. However, it may not leave an audit trail of actions or
provide the kinds of controls necessary in a multi-user organization. These controls are only available
when a set of application programs is customized for each data entry and updating function.

4. A transaction mechanism, that ideally would guarantee the ACID properties, in order to ensure data
integrity, despite concurrent user accesses (concurrency control), and faults (fault tolerance).
It also maintains the integrity of the data in the database.

The DBMS can maintain the integrity of the database by not allowing more than one user to update the
same record at the same time. The DBMS can help prevent duplicate records via unique index
constraints; for example, no two customers with the same customer numbers (key fields) can be entered
into the database. See ACID properties for more information (Redundancy avoidance).
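
A minimal sketch of these two points, using Python's sqlite3 module: a unique index refuses a duplicate customer number, and because the two insertions are grouped into one transaction the failure of the second undoes the first as well. The customer table and names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_no INTEGER, name TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_customer_no ON customer(customer_no)")
conn.execute("INSERT INTO customer VALUES (100, 'Adams')")
conn.commit()

try:
    with conn:  # one transaction: both inserts succeed or neither is kept
        conn.execute("INSERT INTO customer VALUES (101, 'Baker')")
        conn.execute("INSERT INTO customer VALUES (100, 'Clark')")  # duplicate key
    outcome = "committed"
except sqlite3.IntegrityError:
    outcome = "rolled back"

names = [r[0] for r in conn.execute("SELECT name FROM customer ORDER BY customer_no")]
```

This shows only the atomicity part of the ACID properties; concurrency control between several users is not exercised by a single connection.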

The DBMS accepts requests for data from the application program and instructs the operating system to
transfer the appropriate data.

When a DBMS is used, information systems can be changed much more easily as the organization's
information requirements change. New categories of data can be added to the database without disruption
to the existing system.

Organizations may use one kind of DBMS for daily transaction processing and then move the detail onto
another computer that uses another DBMS better suited for random inquiries and analysis. Overall
systems design decisions are performed by data administrators and systems analysts. Detailed database
design is performed by database administrators.

Database servers are specially designed computers that hold the actual databases and run only the DBMS
and related software. Database servers are usually multiprocessor computers, with RAID disk arrays used
for stable storage. Connected to one or more servers via a high-speed channel, hardware database
accelerators are also used in large volume transaction processing environments.

DBMSs are found at the heart of most database applications. Sometimes DBMSs are built around a
private multitasking kernel with built-in networking support although nowadays these functions are left
to the operating system.

Features and Abilities Of DBMS

One can characterize a DBMS as an "attribute management system" where attributes are small chunks of
information that describe something. For example, "color" is an attribute of a car. The value of the
attribute may be a color such as "red", "blue", "silver", etc. Lately databases have been modified to
accept large or unstructured (pre-digested or pre-categorized) information as well, such as images and
text documents. However, the main focus is still on descriptive attributes.

DBMSs roll together frequently-needed services or features of attribute management. This allows one to
get powerful functionality "out of the box" rather than program each from scratch or add and integrate
them incrementally. Such features include:

Query ability

Querying is the process of requesting attribute information from various perspectives and combinations
of factors. Example: "How many 2-door cars in Texas are green?"
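
The example question can be put to a DBMS as a single SQL query. The sketch below, using Python's sqlite3 module, assumes a hypothetical car table with doors, state and color columns and a handful of invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (reg_no TEXT PRIMARY KEY, doors INTEGER, state TEXT, color TEXT);
    INSERT INTO car VALUES
        ('AAA111', 2, 'Texas', 'green'),
        ('BBB222', 4, 'Texas', 'green'),
        ('CCC333', 2, 'Texas', 'red'),
        ('DDD444', 2, 'Ohio',  'green'),
        ('EEE555', 2, 'Texas', 'green');
""")

# "How many 2-door cars in Texas are green?"
green_two_door_texas = conn.execute(
    "SELECT COUNT(*) FROM car WHERE doors = ? AND state = ? AND color = ?",
    (2, "Texas", "green"),
).fetchone()[0]
```

Changing the perspective, for instance to count cars per color within one state, needs only a different query and no new program.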

Backup and replication

Copies of attributes need to be made regularly in case primary disks or other equipment fails. A periodic
copy of attributes may also be created for a distant organization that cannot readily access the original.
DBMSs usually provide utilities to facilitate the process of extracting and disseminating attribute sets.

When data is replicated between database servers, so that the information remains consistent throughout
the database system and users cannot tell or even know which server in the DBMS they are using, the
system is said to exhibit replication transparency.

Rule enforcement

Often one wants to apply rules to attributes so that the attributes are clean and reliable. For example, we
may have a rule that says each car can have only one engine associated with it (identified by Engine
Number). If somebody tries to associate a second engine with a given car, we want the DBMS to deny
such a request and display an error message. However, with changes in the model specification such as,
in this example, hybrid gas-electric cars, rules may need to change. Ideally such rules should be able to
be added and removed as needed without significant data layout redesign.
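
The one-engine-per-car rule might be expressed as a unique index, which can later be dropped without redesigning the table. This is a sketch using Python's sqlite3 module; the engine table, the index name one_engine_per_car and the helper add_engine are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE engine (engine_number TEXT PRIMARY KEY, car_reg_no TEXT NOT NULL);
    -- The rule: a given car may appear against at most one engine.
    CREATE UNIQUE INDEX one_engine_per_car ON engine(car_reg_no);
    INSERT INTO engine VALUES ('E-001', 'AAA111');
""")

def add_engine(engine_number, car_reg_no):
    """Ask the DBMS to associate an engine with a car; report the outcome."""
    try:
        with conn:
            conn.execute("INSERT INTO engine VALUES (?, ?)", (engine_number, car_reg_no))
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"request denied: {exc}"

before = add_engine("E-002", "AAA111")          # denied while the rule is in force
conn.execute("DROP INDEX one_engine_per_car")   # rule withdrawn, e.g. for hybrid cars
after = add_engine("E-002", "AAA111")           # the same request is now accepted
```

The data layout is untouched by the change of rule; only the constraint is removed.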


Security

Often it is desirable to limit who can see or change which attributes or groups of attributes. This may be
managed directly by individual, or by the assignment of individuals and privileges to groups, or (in the
most elaborate models) through the assignment of individuals and groups to roles which are then granted
privileges.
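
Restricting which attributes a group of users sees, as in the payroll example given earlier, is often done by defining views. The sketch below uses Python's sqlite3 module with hypothetical table, view and column names. SQLite has no user accounts, so the step of granting each view to a group or role is not shown and would depend on the DBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
        work_history TEXT, medical_notes TEXT
    );
    INSERT INTO employee VALUES (1, 'Adams', 30000, 'Joined 1990', 'None');
    -- Each view exposes only the columns one group of users needs.
    CREATE VIEW payroll_view AS SELECT emp_no, name, salary FROM employee;
    CREATE VIEW personnel_view AS
        SELECT emp_no, name, work_history, medical_notes FROM employee;
""")

def columns_of(relation):
    """List the column names a query on the given table or view returns."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]

payroll_columns = columns_of("payroll_view")
personnel_columns = columns_of("personnel_view")
```

A user confined to payroll_view has no way of naming the medical_notes column, whatever query is written.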


Computation

There are common computations requested on attributes such as counting, summing, averaging, sorting,
grouping, cross-referencing, etc. Rather than have each computer application implement these from
scratch, they can rely on the DBMS to supply such calculations.

Change and access logging

Often one wants to know who accessed what attributes, what was changed, and when it was changed.
Logging services allow this by keeping a record of access occurrences and changes.
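
One common way of recording changes is a trigger that writes to a log table. The sketch below uses Python's sqlite3 module with hypothetical table and trigger names. It records what was changed and when; recording who made the change, or logging read access, would need facilities that SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (reg_no TEXT PRIMARY KEY, color TEXT);
    CREATE TABLE change_log (
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
        reg_no TEXT, old_color TEXT, new_color TEXT
    );
    -- The trigger records every change of color without the application's help.
    CREATE TRIGGER log_color_change AFTER UPDATE OF color ON car
    BEGIN
        INSERT INTO change_log (reg_no, old_color, new_color)
        VALUES (OLD.reg_no, OLD.color, NEW.color);
    END;
    INSERT INTO car VALUES ('AAA111', 'red');
    UPDATE car SET color = 'blue' WHERE reg_no = 'AAA111';
""")

log = conn.execute("SELECT reg_no, old_color, new_color FROM change_log").fetchall()
```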

Automated optimization

If there are frequently occurring usage patterns or requests, some DBMSs can adjust themselves to
improve the speed of those interactions. In some cases the DBMS will merely provide tools to monitor
performance, allowing a human expert to make the necessary adjustments after reviewing the statistics.

Meta-data repository

Metadata (also spelled meta-data) is information about information. For example, a listing that describes
what attributes are allowed to be in data sets is called "meta-information".
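
Most DBMSs let the metadata be queried like any other data. As an illustration, SQLite exposes column descriptions through PRAGMA table_info, used here from Python; the car table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (reg_no TEXT PRIMARY KEY, color TEXT NOT NULL, doors INTEGER)")

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
column_info = conn.execute("PRAGMA table_info(car)").fetchall()
allowed_attributes = [(name, decl_type) for _, name, decl_type, *_ in column_info]
```

Other systems offer the same kind of listing through catalogue tables or an information schema.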

Databases have been in use since the earliest days of electronic computing. Unlike modern systems
which can be applied to widely different databases and needs, the vast majority of older systems were
tightly linked to the custom databases in order to gain speed at the expense of flexibility. Originally
DBMSs were found only in large organizations with the computer hardware needed to support large data
sets.


What is a Database?
A database management system, or DBMS, gives the user access to their data and helps them transform
the data into information. Such database management systems include dBase, Paradox, IMS, and Oracle.
These systems allow users to create, update, and extract information from their databases. Compared to a
manual filing system, the biggest advantages of a computerized database system include speed and accuracy.

A database is a structured collection of data. Data refers to the characteristics of people, things, and
events. Oracle stores each data item in its own field. For example, a person's first name, date of birth, and
their postal code are each stored in separate fields. The name of a field usually reflects its contents. A
postal code field might be named POSTAL-CODE or PSTL_CD. Each DBMS has its own rules for
naming the data fields.


Relational database
A relational database is a collection of data items organized as a set of formally-described
tables from which data can be accessed or reassembled in many different ways without having to
reorganize the database tables. The relational database was invented by E. F. Codd at IBM in 1970. The
standard user and application program interface to a relational database is the structured query language
(SQL). SQL statements are used both for interactive queries for information from a relational database
and for gathering data for reports. In addition to being relatively easy to create and access, a relational
database has the important advantage of being easy to extend. After the original database creation, a new
data category can be added without requiring that all existing applications be modified.
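
The claim about extension can be checked with a small sketch in Python's sqlite3 module. The customer table is hypothetical, and the function existing_application stands in for a program written before the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer VALUES (1, 'Adams');
""")

def existing_application():
    """Stands in for a program written before the schema was extended."""
    return conn.execute("SELECT customer_no, name FROM customer").fetchall()

before = existing_application()
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")  # new data category
after = existing_application()
```

The program is unaffected because it names the columns it needs; one that relied on SELECT * would see the extra column and might need attention.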

A relational database is a set of tables containing data fitted into predefined categories. Each table (which
is sometimes called a relation) contains one or more data categories in columns. Each row contains a
unique instance of data for the categories defined by the columns. For example, a typical business order
entry database would include a table that described a customer with columns for name, address, phone
number, and so forth. Another table would describe an order: product, customer, date, sales price, and so
forth. A user of the database could obtain a view of the database that fitted the user's needs. For example,
a branch office manager might like a view or report on all customers that had bought products after a
certain date. A financial services manager in the same company could, from the same tables, obtain a
report on accounts that needed to be paid.
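
The order entry example might look as follows in SQL, run through Python's sqlite3 module. The column names, the paid flag and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE sales_order (
        order_no INTEGER PRIMARY KEY,
        customer_no INTEGER REFERENCES customer(customer_no),
        product TEXT, order_date TEXT, sales_price INTEGER, paid INTEGER
    );
    INSERT INTO customer VALUES (1, 'Adams', '555-0101'), (2, 'Baker', '555-0102');
    INSERT INTO sales_order VALUES
        (1, 1, 'Widget', '2001-01-15', 40, 1),
        (2, 2, 'Gadget', '2001-03-02', 90, 0);
""")

# Branch office manager: customers who bought after a certain date.
recent_buyers = conn.execute("""
    SELECT DISTINCT c.name FROM customer AS c
    JOIN sales_order AS o ON o.customer_no = c.customer_no
    WHERE o.order_date > ? ORDER BY c.name
""", ("2001-02-01",)).fetchall()

# Financial services manager: accounts that still need to be paid.
unpaid = conn.execute("""
    SELECT c.name, o.sales_price FROM customer AS c
    JOIN sales_order AS o ON o.customer_no = c.customer_no
    WHERE o.paid = 0 ORDER BY o.order_no
""").fetchall()
```

Both reports come from the same two tables; neither required the tables to be reorganized.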

When creating a relational database, you can define the domain of possible values in a data column and
further constraints that may apply to that data value. For example, a domain of possible customers could
allow up to ten possible customer names but be constrained in one table to allowing only three of these
customer names to be specifiable. The definition of a relational database results in a table of metadata or
formal descriptions of the tables, columns, domains, and constraints.

Database Models: Hierarchical, Network, Relational, Object-Oriented

Hierarchical Model
The hierarchical data model organizes data in a tree structure. There is a hierarchy of parent and child data
segments. This structure implies that a record can have repeating information, generally in the child data
segments. Data is held in a series of records, each of which has a set of field values attached to it. All the instances
of a specific record are collected together as a record type. These record types are the equivalent of tables in the relational
model, with the individual records being the equivalent of rows. To create links between these record types,
the hierarchical model uses parent-child relationships. These are a 1:N mapping between record types, represented
by trees, a structure "borrowed" from mathematics much as the relational model borrows set theory. For example, an
organization might store information about an employee, such as name, employee number, department, salary.
The organization might also store information about an employee's children, such as name and date of birth. The
employee and children data forms a hierarchy, where the employee data represents the parent segment and the
children data represents the child segment. If an employee has three children, then there would be three child
segments associated with one employee segment. In a hierarchical database the parent-child relationship is one
to many. This restricts a child segment to having only one parent segment. Hierarchical DBMSs were popular
from the late 1960s, with the introduction of IBM's Information Management System (IMS) DBMS, through the
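The employee and children example can be sketched as a parent segment that owns a list of child segments. This is only an illustration in Python; the class and field names are assumptions, and a real hierarchical DBMS such as IMS stores and navigates segments quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class ChildSegment:
    name: str
    date_of_birth: str


@dataclass
class EmployeeSegment:
    name: str
    employee_number: int
    department: str
    salary: float
    # One parent, many children: the 1:N parent-child relationship.
    children: list = field(default_factory=list)

    def add_child(self, name, date_of_birth):
        """Attach a new child segment to this (single) parent segment."""
        child = ChildSegment(name, date_of_birth)
        self.children.append(child)
        return child
```

Because each child segment lives inside exactly one employee segment, a child cannot have two parents in this structure.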

Network Model
The popularity of the network data model coincided with the popularity of the hierarchical data model. Some
data were more naturally modeled with more than one parent per child. So, the network model permitted the
modeling of many-to-many relationships in data. In 1971, the Conference on Data Systems Languages
(CODASYL) formally defined the network model. The basic data modeling construct in the network model is
the set construct. A set consists of an owner record type, a set name, and a member record type. A member
record type can have that role in more than one set, hence the multiparent concept is supported. An owner
record type can also be a member or owner in another set. The data model is a simple network, and link and
intersection record types (called junction records by IDMS) may exist, as well as sets between them. Thus, the
complete network of relationships is represented by several pairwise sets; in each set one record type is the
owner (at the tail of the relationship arrow) and one or more record types are members (at the head of the
relationship arrow). Usually, a set defines a 1:M relationship, although 1:1 is permitted. The CODASYL
network model is based on mathematical set theory.
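The set construct can be sketched as follows. This is a toy model in Python, not the CODASYL data definition language; the class name, the set names and the record identifiers are all hypothetical. It shows how a member record gains more than one parent by taking part in more than one set.

```python
from collections import defaultdict


class NetworkSet:
    """One named set type: each owner record owns zero or more member records."""

    def __init__(self, name, owner_type, member_type):
        self.name = name
        self.owner_type = owner_type
        self.member_type = member_type
        self._members = defaultdict(list)  # owner record -> list of member records
        self._owner_of = {}                # member record -> its single owner in this set

    def connect(self, owner, member):
        """Link a member to an owner; within one set a member has only one owner (1:M)."""
        if member in self._owner_of:
            raise ValueError(f"{member!r} already has an owner in set {self.name!r}")
        self._owner_of[member] = owner
        self._members[owner].append(member)

    def members(self, owner):
        return list(self._members.get(owner, []))

    def owner(self, member):
        return self._owner_of[member]
```

With two sets, say SUPPLIER to SHIPMENT and PART to SHIPMENT, a shipment record acts as an intersection record with two parents, which is how a many-to-many relationship between suppliers and parts is represented.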

Relational Model
(RDBMS - relational database management system) A database based on the relational model developed by
E.F. Codd. A relational database allows the definition of data structures, storage and retrieval operations and
integrity constraints. In such a database the data and relations between them are organised in tables. A table is a
collection of records and each record in a table contains the same fields.

Properties of Relational Tables:

Values Are Atomic
Each Row is Unique
Column Values Are of the Same Kind
The Sequence of Columns is Insignificant
The Sequence of Rows is Insignificant
Each Column Has a Unique Name

Certain fields may be designated as keys, which means that searches for specific values of that field will use
indexing to speed them up. Where fields in two different tables take values from the same set, a join operation
can be performed to select related records in the two tables by matching values in those fields. Often, but not
always, the fields will have the same name in both tables. For example, an "orders" table might contain
(customer-ID, product-code) pairs and a "products" table might contain (product-code, price) pairs so to
calculate a given customer's bill you would sum the prices of all products ordered by that customer by joining
on the product-code fields of the two tables. This can be extended to joining multiple tables on multiple fields.
Because these relationships are only specified at retrieval time, relational databases are classed as dynamic
database management systems. The RELATIONAL database model is based on relational algebra.
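The bill calculation described above can be sketched with a join on the product-code fields. The example uses Python's sqlite3 module; the sample data is invented and prices are held as whole pence to keep the arithmetic exact.

```python
import sqlite3


def build_demo():
    """Create the "orders" and "products" tables with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (product_code TEXT PRIMARY KEY, price INTEGER NOT NULL);
        CREATE TABLE orders   (customer_id TEXT NOT NULL, product_code TEXT NOT NULL);
        INSERT INTO products VALUES ('P1', 250), ('P2', 1000);
        INSERT INTO orders VALUES ('C1', 'P1'), ('C1', 'P2'), ('C1', 'P1'), ('C2', 'P2');
    """)
    return conn


def customer_bill(conn, customer_id):
    """Sum the prices of all products ordered by one customer."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(p.price), 0)
        FROM orders AS o
        JOIN products AS p ON p.product_code = o.product_code
        WHERE o.customer_id = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0]
```

The relationship between the two tables is stated only inside the query, which is the sense in which it is specified at retrieval time.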

Object/Relational Model
Object/relational database management systems (ORDBMSs) add new object storage capabilities to the
relational systems at the core of modern information systems. These new facilities integrate management of
traditional fielded data, complex objects such as time-series and geospatial data and diverse binary media such
as audio, video, images, and applets. By encapsulating methods with data structures, an ORDBMS server can
execute complex analytical and data manipulation operations to search and transform multimedia and other
complex objects.

As an evolutionary technology, the object/relational (OR) approach has inherited the robust transaction- and
performance-management features of its relational ancestor and the flexibility of its object-oriented cousin.
Database designers can work with familiar tabular structures and data definition languages (DDLs) while
assimilating new object-management possibilities. Query and procedural languages and call interfaces in
ORDBMSs are familiar: SQL3, vendor procedural languages, and ODBC, JDBC, and proprietary call
interfaces are all extensions of RDBMS languages and interfaces. And the leading vendors are, of course, quite
well known: IBM, Informix, and Oracle.

Object-Oriented Model
Object DBMSs add database functionality to object programming languages. They bring much more than
persistent storage of programming language objects. Object DBMSs extend the semantics of the C++, Smalltalk
and Java object programming languages to provide full-featured database programming capability, while
retaining native language compatibility. A major benefit of this approach is the unification of the application
and database development into a seamless data model and language environment. As a result, applications
require less code, use more natural data modeling, and code bases are easier to maintain. Object developers can
write complete database applications with a modest amount of additional effort.

According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of object-
oriented programming language (OOPL) systems and persistent systems. The power of the OODB comes from
the seamless treatment of both persistent data, as found in databases, and transient data, as found in executing

In contrast to a relational DBMS where a complex data structure must be flattened out to fit into tables or joined
together from those tables to form the in-memory structure, object DBMSs have no performance overhead to
store or retrieve a web or hierarchy of interrelated objects. This one-to-one mapping of object programming
language objects to database objects has two benefits over other storage approaches: it provides higher
performance management of objects, and it enables better management of the complex interrelationships
between objects. This makes object DBMSs better suited to support applications such as financial portfolio risk
analysis systems, telecommunications service applications, world wide web document structures, design and
manufacturing systems, and hospital patient record systems, which have complex relationships between data.

Semistructured Model
In the semistructured data model, the information that is normally associated with a schema is contained within the
data, which is sometimes called "self-describing". In such a database there is no clear separation between the data
and the schema, and the degree to which it is structured depends on the application. In some forms of
semistructured data there is no separate schema, in others it exists but only places loose constraints on the data.
Semi-structured data is naturally modelled in terms of graphs which contain labels which give semantics to its
underlying structure. Such databases subsume the modelling power of recent extensions of flat relational
databases, to nested databases which allow the nesting (or encapsulation) of entities, and to object databases
which, in addition, allow cyclic references between objects.
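Self-describing data can be illustrated with nested Python dictionaries, where the labels travel with the values and any "schema" has to be read off the data itself. The sample records and function names below are assumptions made for this sketch.

```python
def label_paths(node, prefix=()):
    """Yield every label path found in a nested dict/list structure."""
    if isinstance(node, dict):
        for label, child in node.items():
            path = prefix + (label,)
            yield path
            yield from label_paths(child, path)
    elif isinstance(node, list):
        # Lists carry no label of their own; descend into each element.
        for child in node:
            yield from label_paths(child, prefix)


def inferred_schema(data):
    """Collect the distinct label paths: a loose schema recovered from the data."""
    return sorted({".".join(path) for path in label_paths(data)})


# Two records of the "same kind" that do not share a structure.
people = [
    {"name": "Ada", "email": "ada@example.org"},
    {"name": {"first": "Bob", "last": "Ng"}, "phone": ["555-0100", "555-0101"]},
]
```

Calling inferred_schema(people) returns the label paths actually present, including both the plain and the nested form of name, which a fixed relational schema would not allow side by side.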

Semistructured data has recently emerged as an important topic of study for a variety of reasons. First, there are
data sources such as the Web, which we would like to treat as databases but which cannot be constrained by a
schema. Second, it may be desirable to have an extremely flexible format for data exchange between disparate
databases. Third, even when dealing with structured data, it may be helpful to view it as semistructured for the
purposes of browsing.

Associative Model
The associative model divides the real-world things about which data is to be recorded into two sorts:
Entities are things that have discrete, independent existence. An entity’s existence does not depend on any other
thing. Associations are things whose existence depends on one or more other things, such that if any of those
things ceases to exist, then the thing itself ceases to exist or becomes meaningless.
An associative database comprises two data structures:
1. A set of items, each of which has a unique identifier, a name and a type.
2. A set of links, each of which has a unique identifier, together with the unique identifiers of three other things,
that represent the source, verb and target of a fact that is recorded about the source in the database. Each
of the three things identified by the source, verb and target may be either a link or an item.
For more information see: The Associative Model of Data
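The two data structures can be sketched as below. This is an illustrative Python model only; the class names, method names and sample facts are assumptions, not taken from any associative database product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: int
    name: str
    type: str


@dataclass(frozen=True)
class Link:
    id: int
    source: int  # identifier of an item or of another link
    verb: int
    target: int


class AssociativeStore:
    def __init__(self):
        self.items = {}
        self.links = {}
        self._next_id = 1

    def _new_id(self):
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def add_item(self, name, type_):
        item = Item(self._new_id(), name, type_)
        self.items[item.id] = item
        return item.id

    def add_link(self, source, verb, target):
        # Each of source, verb and target must already exist as an item or a link.
        for ref in (source, verb, target):
            if ref not in self.items and ref not in self.links:
                raise KeyError(f"unknown identifier {ref}")
        link = Link(self._new_id(), source, verb, target)
        self.links[link.id] = link
        return link.id

    def describe(self, ref):
        """Render an item by name, or a link as a nested (source verb target) phrase."""
        if ref in self.items:
            return self.items[ref].name
        link = self.links[ref]
        parts = (link.source, link.verb, link.target)
        return "(" + " ".join(self.describe(p) for p in parts) + ")"
```

Because a link may itself be the source of another link, a fact such as "a flight arrives at an airport" can in turn be qualified by a further fact about when it does so.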

Entity-Attribute-Value (EAV) data model

The best way to understand the rationale of EAV design is to understand row modeling (of which EAV is a
generalized form). Consider a supermarket database that must manage thousands of products and brands, many
of which have a transitory existence. Here, it is intuitively obvious that product names should not be hard-coded
as names of columns in tables. Instead, one stores product descriptions in a Products table: purchases/sales of
individual items are recorded in other tables as separate rows with a product ID referencing this table.
Conceptually an EAV design involves a single table with three columns, an entity (such as an olfactory receptor
ID), an attribute (such as species, which is actually a pointer into the metadata table) and a value for the attribute
(e.g., rat). In EAV design, one row stores a single fact. In a conventional table that has one column per attribute,
by contrast, one row stores a set of facts. EAV design is appropriate when the number of parameters that
potentially apply to an entity is vastly more than those that actually apply to an individual entity.
For more information see: The EAV/CR Model of Data
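A three-column EAV table, and the step of reassembling one entity's facts into a conventional row, might be sketched like this. The example uses sqlite3 and stores the attribute as plain text rather than as a pointer into a metadata table, which is a simplification; the 'tissue' attribute and all identifiers are hypothetical.

```python
import sqlite3


def build_eav():
    """One row per fact: (entity, attribute, value)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE eav (
            entity    TEXT NOT NULL,
            attribute TEXT NOT NULL,
            value     TEXT,
            PRIMARY KEY (entity, attribute)
        );
        INSERT INTO eav VALUES
            ('receptor-1', 'species', 'rat'),
            ('receptor-1', 'tissue',  'olfactory epithelium'),
            ('receptor-2', 'species', 'mouse');
    """)
    return conn


def entity_as_row(conn, entity):
    """Gather all facts about one entity into a dict, like one conventional row."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(rows.fetchall())
```

Attributes that do not apply to an entity simply have no row, so sparse data costs nothing in storage, at the price of extra work when a conventional row view is needed.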

Context Model
The context data model combines features of all the above models. It can be considered as a collection of
object-oriented, network and semistructured models or as some kind of object database. In other words this is a
flexible model: you can use any type of database structure, depending on the task. Such a data model has been
implemented in DBMS ConteXt.

The fundamental unit of information storage of ConteXt is a CLASS. A Class contains METHODS and describes an
OBJECT. The Object contains FIELDS and PROPERTY. A field may be composite, in which case the field
contains SubFields etc. The property is a set of fields that belongs to a particular Object (similar to an AVL
database). In other words, fields are the permanent part of an Object but Property is its variable part.
The header of Class contains the definition of the internal structure of the Object, which includes the description
of each field, such as their type, length, attributes and name. Context data model has a set of predefined types as
well as user defined types. The predefined types include not only character strings, texts and digits but also
pointers (references) and aggregate types (structures).
A context model comprises three main data types: REGULAR, VIRTUAL and REFERENCE. A regular (local)
field can be ATOMIC or COMPOSITE. The atomic field has no inner structure. In contrast, a composite field
may have a complex structure, and its type is described in the header of Class. The composite fields are divided
into STATIC and DYNAMIC. The type of a static composite field is stored in the header and is permanent.
Description of the type of a dynamic composite field is stored within the Object and can vary from Object to

Like a NETWORK database, apart from the fields containing the information directly, a context database has
fields storing a place where this information can be found, i.e. a POINTER (link, reference) which can point to an
Object in this or another Class. Because the main addressed unit of a context database is an Object, the pointer is
made to an Object instead of a field of this Object. The pointers are divided into STATIC and DYNAMIC. All
pointers that belong to a particular static pointer type point to the same Class (albeit, possibly, to different
Objects). In this case, the Class name is an integral part of that pointer type. A dynamic pointer type
describes pointers that may refer to different Classes. The Class, which may be linked through a pointer, can
reside on the same or any other computer on the local area network. There is no hierarchy between Classes and
the pointer can link to any Class, including its own.

In contrast to pure object-oriented databases, a context database is not so closely coupled to the programming language
and doesn't support methods directly. Instead, method invocation is partially supported through the concept of
VIRTUAL fields.

A VIRTUAL field is like a regular field: it can be read or written into. However, this field is not physically
stored in the database, and it does not have a type described in the schema. A read operation on a virtual field
is intercepted by the DBMS, which invokes a method associated with the field, and the result produced by that
method is returned. If no method is defined for the virtual field, the field will be blank. A METHOD is a
subroutine written in C++ by an application programmer. Similarly, a write operation on a virtual field invokes
an appropriate method, which can change the value of the field. The current value of virtual fields is
maintained by a run-time process; it is not preserved between sessions. In object-oriented terms, virtual fields
represent just two public methods: reading and writing. Experience shows, however, that this is often enough in
practical applications. From the DBMS point of view, virtual fields provide a transparent interface to such
methods via an application written by the application programmer.
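The interception of reads and writes on a virtual field can be imitated in a few lines. The text says ConteXt methods are written in C++; the sketch below uses Python instead, purely to show the idea, and every class, function and field name in it is an assumption.

```python
class ContextObject:
    """Toy object with stored (regular) fields and computed (virtual) fields."""

    def __init__(self, **regular_fields):
        self._stored = dict(regular_fields)
        self._virtual = {}  # field name -> (reader, writer); either may be None

    def define_virtual(self, name, reader=None, writer=None):
        self._virtual[name] = (reader, writer)

    def read(self, name):
        if name in self._virtual:
            reader, _ = self._virtual[name]
            # No method defined for the field: it reads as blank.
            return reader(self) if reader else ""
        return self._stored[name]

    def write(self, name, value):
        if name in self._virtual:
            _, writer = self._virtual[name]
            if writer:
                writer(self, value)
            return  # nothing is stored under the virtual field's own name
        self._stored[name] = value


def read_full_name(obj):
    return f"{obj.read('first')} {obj.read('last')}"


def write_full_name(obj, value):
    first, _, last = value.partition(" ")
    obj.write("first", first)
    obj.write("last", last)
```

After person.define_virtual("full_name", read_full_name, write_full_name), reading full_name runs the reader and writing it updates the two regular fields, while full_name itself is never stored.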

A context database that does not have composite or pointer fields and property is essentially RELATIONAL.
With static composite and pointer fields, a context database becomes OBJECT-ORIENTED. If the context
database has only Property, it is an ENTITY-ATTRIBUTE-VALUE database. With dynamic
composite fields, a context database becomes what is now known as a SEMISTRUCTURED database. If the
database has all available types, it is a ConteXt database!
Database - database models
A database model is a theory or specification describing how a database is structured and used. Several
such models have been suggested.
Common models include:

• Hierarchical model
• Network model
• Relational model
• Entity-relationship
• Object-Relational model
• Object model

Other models include:

• Associative
• Concept-oriented
• Entity-Attribute-Value
• Multi-dimensional model
• Semi-structured
• Star schema
• XML database

A computer database is a structured collection of records or data that is stored in a computer
system so that a computer program or person using a query language can consult it to answer
queries. The records retrieved in answer to queries are information that can be used to make
decisions. The computer program used to manage and query a database is known as a database
management system (DBMS). The properties and design of database systems are included in the
study of information science.

A typical query could be to answer questions such as, "How many hamburgers with two or more
beef patties were sold in the month of March in New Jersey?". To answer such a question, the
database would have to store information about hamburgers sold, including number of patties, sales
date, and the region.

The term "database" originated within the computing discipline. Although its
meaning has been broadened by popular use, even to include non-electronic databases, this article is
about computer databases. Database-like collections of information existed well before the
Industrial Revolution in the form of ledgers, sales receipts and other business-related collections of

The central concept of a database is that of a collection of records, or pieces of information.

Typically, for a given database, there is a structural description of the type of facts held in that
database: this description is known as a schema. The schema describes the objects that are
represented in the database, and the relationships among them. There are a number of different
ways of organizing a schema, that is, of modelling the database structure: these are known as
database models (or data models). The model in most common use today is the relational model,
which in layman's terms represents all information in the form of multiple related tables each
consisting of rows and columns (the true definition uses mathematical terminology). This model
represents relationships by the use of values common to more than one table. Other models such as
the hierarchical model and the network model use a more explicit representation of relationships.
The term database refers to the collection of related records, and the software should be referred to
as the database management system or DBMS. When the context is unambiguous, however, many
database administrators and programmers use the term database to cover both meanings.

Many professionals consider a collection of data to constitute a database only if it has certain
properties: for example, if the data is managed to ensure its integrity and quality, if it allows shared
access by a community of users, if it has a schema, or if it supports a query language. However,
there is no definition of these properties that is universally agreed upon.

Database management systems are usually categorized according to the data model that they
support: relational, object-relational, network, and so on. The data model will tend to determine the
query languages that are available to access the database. A great deal of the internal engineering of
a DBMS, however, is independent of the data model, and is concerned with managing factors such
as performance, concurrency, integrity, and recovery from hardware failures. In these areas there
are large differences between products.

Database models
Various techniques are used to model data structure.

Most database systems are built around one particular data model, although it is increasingly
common for products to offer support for more than one model. For any one logical model various
physical implementations may be possible, and most products will offer the user some level of
control in tuning the physical implementation, since the choices that are made have a significant
effect on performance. An example is the relational model: all serious implementations of the
relational model allow the creation of indexes which provide fast access to rows in a table if the
values of certain columns are known.

Flat model

The flat (or table) model consists of a single, two-dimensional array of data elements, where all
members of a given column are assumed to be similar values, and all members of a row are
assumed to be related to one another.

Hierarchical model

In a hierarchical model, data is organized into a tree-like structure, implying a single upward link in
each record to describe the nesting, and a sort field to keep the records in a particular order in each
same-level list.

Network model

The network model tends to store records with links to other records. Associations are tracked via
"pointers". These pointers can be node numbers or disk addresses. Most network databases tend to
also include some form of hierarchical model.

Relational model

Three key terms are used extensively in relational database models: relations, attributes, and
domains. A relation is a table with columns and rows. The named columns of the relation are called
attributes, and the domain is the set of values the attributes are allowed to take.

The basic data structure of the relational model is the table, where information about a particular
entity (say, an employee) is represented in columns and rows (also called tuples). Thus, the
"relation" in "relational database" refers to the various tables in the database; a relation is a set of
tuples. The columns enumerate the various attributes of the entity (the employee's name, address or
phone number, for example), and a row is an actual instance of the entity (a specific employee) that
is represented by the relation. As a result, each tuple of the employee table represents various
attributes of a single employee.

All relations (and, thus, tables) in a relational database have to adhere to some basic rules to qualify
as relations. First, the ordering of columns is immaterial in a table. Second, there can't be identical
tuples or rows in a table. And third, each tuple will contain a single value for each of its attributes.

A relational database contains multiple tables, each similar to the one in the "flat" database model.
One of the strengths of the relational model is that, in principle, any value occurring in two
different records (belonging to the same table or to different tables), implies a relationship among
those two records. Yet, in order to enforce explicit integrity constraints, relationships between
records in tables can also be defined explicitly, by identifying or non-identifying parent-child
relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also have a
designated single attribute or a set of attributes that can act as a "key", which can be used to
uniquely identify each tuple in the table.

A key that can be used to uniquely identify a row in a table is called a primary key. Keys are
commonly used to join or combine data from two or more tables. For example, an Employee table
may contain a column named Location which contains a value that matches the key of a Location
table. Keys are also critical in the creation of indices, which facilitate fast retrieval of data from
large tables. Any column can be a key, or multiple columns can be grouped together into a
compound key. It is not necessary to define all the keys in advance; a column can be used as a key
even if it was not originally intended to be one.
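Primary keys, a compound key, an index and a key-based join can be sketched together as follows. The Employee and Location tables follow the example in the text; the assignment table, the column names and the sample rows are assumptions added for illustration.

```python
import sqlite3


def build_employee_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE location (
            location_id INTEGER PRIMARY KEY,
            city        TEXT NOT NULL
        );
        CREATE TABLE employee (
            employee_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            location    INTEGER REFERENCES location(location_id)
        );
        -- Compound key: neither column alone identifies a row.
        CREATE TABLE assignment (
            employee_id INTEGER NOT NULL,
            project     TEXT NOT NULL,
            PRIMARY KEY (employee_id, project)
        );
        -- An index added later, on a column not originally intended as a key.
        CREATE INDEX employee_by_name ON employee(name);

        INSERT INTO location VALUES (1, 'Leeds'), (2, 'York');
        INSERT INTO employee VALUES (100, 'Ada', 2), (101, 'Bob', 1);
    """)
    return conn


def city_of(conn, employee_name):
    """Join employee.location to the key of the location table."""
    row = conn.execute(
        "SELECT l.city FROM employee AS e "
        "JOIN location AS l ON l.location_id = e.location "
        "WHERE e.name = ?",
        (employee_name,),
    ).fetchone()
    return row[0] if row else None
```

The index does not change what the query returns; it only gives the DBMS a faster access path when rows are looked up by name.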

Relational operations

Users (or programs) request data from a relational database by sending it a query that is written in a
special language, usually a dialect of SQL. Although SQL was originally intended for end-users, it
is much more common for SQL queries to be embedded into software that provides an easier user
interface. Many web sites, such as Wikipedia, perform SQL queries when generating pages.

In response to a query, the database returns a result set, which is just a list of rows containing the
answers. The simplest query is just to return all the rows from a table, but more often, the rows are
filtered in some way to return just the answer wanted. Often, data from multiple tables are
combined into one, by doing a join. There are a number of relational operations in addition to join.
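The difference between returning every row and returning a filtered answer can be sketched with the hamburger question posed earlier. The table layout and the sample sales are invented for this example, which uses Python's sqlite3 module.

```python
import sqlite3


def build_sales_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hamburger_sale (
            sale_id   INTEGER PRIMARY KEY,
            patties   INTEGER NOT NULL,
            sale_date TEXT NOT NULL,     -- ISO 8601, e.g. 2024-03-10
            region    TEXT NOT NULL
        );
        INSERT INTO hamburger_sale (patties, sale_date, region) VALUES
            (1, '2024-03-03', 'New Jersey'),
            (2, '2024-03-10', 'New Jersey'),
            (3, '2024-03-21', 'New Jersey'),
            (2, '2024-04-01', 'New Jersey'),
            (2, '2024-03-15', 'New York');
    """)
    return conn


def all_sales(conn):
    """The simplest query: the result set is every row of the table."""
    return conn.execute("SELECT * FROM hamburger_sale").fetchall()


def multi_patty_sales(conn, region, year_month):
    """Count sales with two or more patties in one region and one month (YYYY-MM)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM hamburger_sale "
        "WHERE patties >= 2 AND region = ? AND substr(sale_date, 1, 7) = ?",
        (region, year_month),
    ).fetchone()
    return row[0]
```

The WHERE clause is where the filtering happens; the database, not the application, discards the rows that do not match.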

Normal forms

Relations are classified based upon the types of anomalies to which they're vulnerable. A database
that's in the first normal form is vulnerable to all types of anomalies, while a database that's in the
domain/key normal form has no modification anomalies. Normal forms are hierarchical in nature.
That is, the lowest level is the first normal form, and the database cannot meet the requirements for
higher level normal forms without first having met all the requirements of the lesser normal form.

Object database models

In recent years, the object-oriented paradigm has been applied to database technology, creating a
new programming model known as object databases. These databases attempt to bring the database
world and the application programming world closer together, in particular by ensuring that the
database uses the same type system as the application program. This aims to avoid the overhead
(sometimes referred to as the impedance mismatch) of converting information between its
representation in the database (for example as rows in tables) and its representation in the
application program (typically as objects). At the same time, object databases attempt to introduce
the key ideas of object programming, such as encapsulation and polymorphism, into the world of

A variety of ways have been tried for storing objects in a database. Some products have
approached the problem from the application programming end, by making the objects manipulated
by the program persistent. This also typically requires the addition of some kind of query language,
since conventional programming languages do not have the ability to find objects based on their
information content. Others have attacked the problem from the database end, by defining an
object-oriented data model for the database, and defining a database programming language that
allows full programming capabilities as well as traditional query facilities.

Post-relational database models

Several products have been identified as post-relational because the data model incorporates
relations but is not constrained by the Information Principle, requiring that all information is
represented by data values in relations. Products using a post-relational data model typically
employ a model that actually pre-dates the relational model. These might be identified as a directed
graph with trees on the nodes.

Examples of models that could be classified as post-relational are PICK aka MultiValue, and

Fuzzy databases

It is possible to develop fuzzy relational databases. Basically, a fuzzy database is a database using
fuzzy logic, for example with fuzzy attributes, which may be defined as attributes of an item, row or
object in a database that allow fuzzy information (imprecise or uncertain data) to be stored. There are
many forms of adding flexibility in fuzzy databases. The simplest technique is to add a fuzzy
membership degree to each record, i.e. an attribute in the range [0,1]. However, there are other
kinds of databases allowing fuzzy values to be stored in fuzzy attributes using fuzzy sets (including
fuzzy spatial datatypes), possibility distributions or fuzzy degrees associated to some attributes and
with different meanings (membership degree, importance degree, fulfillment degree...). Sometimes,
the expression “fuzzy databases” is used for classical databases with fuzzy queries or with other
fuzzy aspects, such as constraints.

The first fuzzy relational database, FRDB, appeared in Maria Zemankova's dissertation. Afterwards,
other models arose, such as the Buckles-Petry model, the Prade-Testemale model, the Umano-Fukami
model and the GEFRED model by J.M. Medina, M.A. Vila et al. In the context of fuzzy
databases, some fuzzy querying languages have been defined, notably SQLf by P. Bosc et
al. and FSQL by J. Galindo et al. These languages define some structures in order to include
fuzzy aspects in the SQL statements, like fuzzy conditions, fuzzy comparators, fuzzy constants,
fuzzy constraints, fuzzy thresholds, linguistic labels and so on.
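The simplest technique, a membership degree in the range [0,1] attached to each record and a threshold applied at query time, can be sketched as below. This is plain Python rather than SQLf or FSQL; the linguistic label "tall", its breakpoints and the record layout are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class FuzzyRecord:
    name: str
    height_cm: float
    membership: float = 1.0  # degree in [0, 1] to which the record satisfies the query


def tall_degree(height_cm, low=160.0, high=190.0):
    """Piecewise-linear membership function for the linguistic label 'tall'."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


def fuzzy_select(records, degree_fn, threshold):
    """Keep records whose degree meets the threshold, best matches first."""
    result = []
    for record in records:
        degree = degree_fn(record.height_cm)
        if degree >= threshold:
            result.append(FuzzyRecord(record.name, record.height_cm, degree))
    return sorted(result, key=lambda r: r.membership, reverse=True)
```

Unlike a crisp WHERE clause, the result carries a degree for each row, so the caller can rank the answers or tighten the threshold.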


We shall be illustrating the theory and principles of relational database systems with reference to the
Oracle proprietary Relational Database Management System (RDBMS) and Structured Query
Language (SQL), the most commonly used language for manipulating the structure of
the database and the data held within it.
The relational model
The relational model consists of three components:

1. A STRUCTURAL component -- a set of TABLES (also called RELATIONS).
2. A MANIPULATIVE component consisting of a set of high-level operations which act upon
and produce whole tables.
3. A SET OF RULES for maintaining the INTEGRITY of the database.

The terminology associated with relational database theory originates from the branch of
mathematics called set theory although there are widely used synonyms for these precise,
mathematical terms.

• Data structures are composed of two components which represent a model of the situation
being considered. These are (i) ENTITY TYPES - i.e. data group types, and (ii) the
RELATIONSHIPS between the entity types.
• Entity types are represented by RELATIONS or BASE TABLES. These two terms are
interchangeable - a RELATION is the mathematical term for a TABLE.
• A base table is loosely defined as an un-ordered collection of zero, one or more TUPLES
(ROWS) each of which consists of one or more un-ordered ATTRIBUTES (COLUMNS).
All tuples are made up of exactly the same set of attributes. For the remainder of this
discussion we shall use the more widely known terminology:
• Each column is drawn from a DOMAIN, that is, a set of values from which the actual
values are taken (e.g. a set of car model names). More than one column in a table may draw
its values from the same domain.
• A column entry in any row is SINGLE-VALUED, i.e. it contains exactly one item only (e.g.
a surname). Repeating groups, i.e. columns which contain sets of values rather than a single
value, are not allowed.
• Each row of a table is uniquely identified by a PRIMARY KEY composed of one or more
columns. This implies that a table may not contain duplicate rows.
• Note that, in general, a column, or group of columns, that uniquely identifies a row in a
table is called a CANDIDATE KEY. There may be more than one candidate key for a
particular table; one of these will be chosen as the primary key.
• The ENTITY INTEGRITY RULE of the model states that no component of the primary key
may contain a NULL value.
• A column, or combination of columns, that matches the primary key of another table is
called a FOREIGN KEY.
• The REFERENTIAL INTEGRITY RULE of the model states that, for every foreign key
value in a table there must be a corresponding primary key value in another table in the
• Only two kinds of table may be defined in a SQL schema; BASE TABLES and VIEWS.
These are called NAMED RELATIONS. Other tables, UNNAMED RELATIONS, may be
derived from these by means of relational operations such as JOINS and PROJECTIONS.
• All tables are LOGICAL ENTITIES. Of these only base tables physically exist in that there
exist physically stored records, and possible physical access paths such as indexes, in one or
more stored files, that directly support the table in physical storage. Although standard
techniques such as HASHING, INDEXING, etc. will be used for implementation efficiency,
the user of the data base should require no knowledge of previously defined access paths.
• Views and the results of all operations on tables - unnamed relations - are tables that exist as
LOGICAL DEFINITIONS, in terms of a view definition, or a [SELECT .. FROM ..
WHERE .. ORDER BY] sequence.
• The term UPDATE has two meanings:
o as a SQL operation in its own right which causes one or more columns in a table to
be altered; in this context it will always be shown in upper-case letters - UPDATE.

o as a generic term used to include the SQL operations INSERT, DELETE and
UPDATE; in this context it will always be shown in lower-case letters.
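The entity integrity and referential integrity rules listed above can be seen in action in the sketch below. The text works with Oracle; this example swaps in SQLite through Python's sqlite3 module so that it is self-contained, and the department and employee tables are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3


def build_integrity_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE department (
            dept_no TEXT NOT NULL PRIMARY KEY,   -- entity integrity: no NULL in the key
            name    TEXT NOT NULL
        );
        CREATE TABLE employee (
            emp_no  INTEGER NOT NULL PRIMARY KEY,
            dept_no TEXT NOT NULL REFERENCES department(dept_no)   -- foreign key
        );
        INSERT INTO department VALUES ('D1', 'Sales');
    """)
    return conn


def violates_integrity(conn, sql, params=()):
    """Return True if the statement is rejected by an integrity rule."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True
```

A NULL or duplicate department number breaks the primary key rules, and an employee row pointing at a department that does not exist breaks referential integrity; all three are refused by the DBMS itself rather than by application code.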
Network Topologies

Topology refers to the shape of a network, or the network's layout. How different nodes in a network are
connected to each other and how they communicate are determined by the network's topology.

Topologies are either physical or logical. Below are diagrams of the five most common network topologies.

Physical Topology when in the context of networking refers to the physical layout of
the devices connected to the network, including the location and cable installation.

Logical Topology refers to the way it actually operates (transfers data) as opposed
to its layout.
Network topologies are categorized into the following basic types:

• Star Topology
• Ring Topology
• Bus Topology
• Tree Topology
• Mesh Topology
• Hybrid Topology

More complex networks can be built as hybrids of two or more of the above basic topologies.

Mesh Topology: Devices are connected with many redundant interconnections
between network nodes. In a true mesh topology every node has a connection to
every other node in the network.

Star Topology: All devices are connected to a central hub. Nodes communicate across
the network by passing data through the hub.

Bus Topology: All devices are connected to a central cable, called the bus or backbone.

Ring Topology: All devices are connected to one another in the shape of a closed loop,
so that each device is connected directly to two other devices, one on either side of it.

Tree Topology: A hybrid topology. Groups of star-configured networks are connected to
a linear bus backbone.


An Overview of Computer Network Topology

In Computer Networking “topology” refers to the layout or design of the connected devices. Network Topologies can be physical
or logical.

Physical Topology means the physical design of a network including the devices, location and cable installation.

Logical Topology refers to how data actually moves through a network, as opposed to its physical design.

Topologies are either physical or logical. Physical topologies deal with how the devices on a network are wired together. Logical
topology deals with how information is passed from one device on a network to another.
Computer network topologies fall into the following categories:
• bus
• star
• ring
• mesh
• tree
Hybrid networks are more complex networks built from two or more of the topologies mentioned above.
Bus Topology
Bus topology uses a common backbone to connect all the devices in a network in a linear shape. A single cable functions
as the shared communication medium for all the devices, which attach to it with an interface connector. A device that
wants to communicate sends a broadcast message to all the devices attached to the shared cable, but only the intended recipient
actually accepts and processes that message.

A bus network works well only with a limited number of devices. Performance issues are likely to occur if more than 12-15
computers are added to a bus network. Additionally, if the backbone cable fails, the whole network becomes unusable and no
communication is possible among the computers.

Ring Topology
In a ring network, every computer or device has two adjacent neighbors for communication. All
messages travel in the same direction, whether clockwise or anticlockwise. Damage to any cable
or device can result in the breakdown of the whole network. Ring topology has now become almost obsolete.
FDDI, SONET or Token Ring technology can be used to implement a ring topology. Ring topologies can be found in office,
school or small buildings.

Star Topology

In the computer networking world the most commonly used topology in LANs is the star topology. Star topologies can be
implemented in homes, offices or even across a building. All the computers in a star topology are connected to a central device such as a
hub, switch or router. These devices differ in functionality.

Computers in a network are usually connected to the hub, switch or router with Unshielded Twisted Pair (UTP) or Shielded
Twisted Pair cables.

Compared to the bus topology, a star network requires more devices and cables to complete a network. The failure of a single node
or cable in a star network won't take down the entire network.

However, if the central connecting device (hub, switch or router) fails for any reason, the whole network can
come down or collapse.

Tree Topology

Tree topologies integrate multiple star topologies together
onto a bus. Only the hub devices connect directly to the tree bus, and each hub functions as the root of a tree of network
devices. This bus/star hybrid combination supports future expansion of the network much better than a bus or star alone.

Mesh Topology

Mesh topology works on the concept of routes. In a mesh topology, a message sent to a destination can take any of several possible routes,
such as the shortest or easiest one, to reach it. In the previous topologies, star and bus, messages are usually broadcast to every computer,
especially in a bus topology. Similarly, in a ring topology a message can travel in only one direction, i.e. clockwise or anticlockwise.
The Internet employs the mesh topology, and each message finds its route to its destination. Routers find the routes for
messages and deliver them to their destinations. A topology in which every device connects to every other device is called a
full mesh topology, unlike a partial mesh, in which some devices are connected to others only indirectly.
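
To make the difference in cabling between these topologies concrete, here is a minimal sketch that builds the point-to-point links of a star, a ring and a full mesh and counts them. The function names and the numbering of nodes from 0 are assumptions made for illustration. A bus is left out because its devices share one cable instead of using point-to-point links.

```python
from itertools import combinations


def star_links(n):
    """Node 0 is the hub; every other node has one link to it."""
    return {(0, i) for i in range(1, n)}


def ring_links(n):
    """Each node links to its clockwise neighbour, closing the loop (n >= 3)."""
    return {tuple(sorted((i, (i + 1) % n))) for i in range(n)}


def full_mesh_links(n):
    """Every node has a direct link to every other node."""
    return set(combinations(range(n), 2))


for n in (5, 10, 20):
    print(n, len(star_links(n)), len(ring_links(n)), len(full_mesh_links(n)))
```

The full mesh needs n(n-1)/2 links, which is why partial meshes are common once the number of devices grows.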


What is Network Topology?

The physical topology of a network refers to the configuration of cables, computers, and other
peripherals. Physical topology should not be confused with logical topology which is the method used to
pass information between workstations. Logical topology was discussed in the Protocol chapter.

Main Types of Network Topologies

In networking, the term "topology" refers to the layout of connected devices on a network. This article
introduces the standard topologies of computer networking.

One can think of a topology as a network's virtual shape or structure. This shape does not necessarily
correspond to the actual physical layout of the devices on the network. For example, the computers on
a home LAN may be arranged in a circle in a family room, but it would be highly unlikely to find an
actual ring topology there.

Network topologies are categorized into the following basic types:

Star Topology
Ring Topology
Bus Topology
Tree Topology
Mesh Topology
Hybrid Topology

More complex networks can be built as hybrids of two or more of the above basic topologies.

Star Topology

Many home networks use the star topology. A star network features a central connection point called a
"hub" that may be a hub, switch or router. Devices typically connect to the hub with Unshielded
Twisted Pair (UTP) Ethernet.

Compared to the bus topology, a star network generally requires more cable, but a failure in any star
network cable will only take down one computer's network access and not the entire LAN. (If the hub
fails, however, the entire network also fails.)
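
The failure behaviour of a star can be checked with a small reachability search. This is only a sketch: the node names (hub, pc1 to pc4) are hypothetical, and a broken cable to one computer is modelled as that computer being marked as failed.

```python
from collections import deque


def star_links(hosts, hub="hub"):
    """One link from the hub to every host."""
    return [(hub, host) for host in hosts]


def reachable(links, start, failed=frozenset()):
    """Breadth-first search: nodes that start can still reach."""
    if start in failed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, b in links:
            if node not in (a, b):
                continue
            other = b if node == a else a
            if other not in failed and other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


links = star_links(["pc1", "pc2", "pc3", "pc4"])
# One computer down: the rest of the LAN still works.
print(sorted(reachable(links, "pc2", failed={"pc1"})))
# Hub down: pc2 can reach nobody but itself.
print(sorted(reachable(links, "pc2", failed={"hub"})))
```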

See the illustration of Star Network Topology.

Advantages of a Star Topology

Easy to install and wire.

No disruptions to the network when connecting or removing devices.
Easy to detect faults and to remove parts.

Disadvantages of a Star Topology

Requires more cable length than a linear topology.

If the hub or concentrator fails, nodes attached are disabled.
More expensive than linear bus topologies because of the cost of the concentrators.

The protocols used with star configurations are usually Ethernet or LocalTalk. Token Ring uses a similar topology, called
the star-wired ring.

Star-Wired Ring

A star-wired ring topology may appear (externally) to be the same as a star topology. Internally, the MAU of a star-wired
ring contains wiring that allows information to pass from one device to another in a circle or ring (See fig. 3). The Token
Ring protocol uses a star-wired ring topology.
Ring Topology

In a ring network, every device has exactly two neighbors for communication purposes. All messages travel through a
ring in the same direction (either "clockwise" or "counterclockwise"). A failure in any cable or device breaks the loop and
can take down the entire network.

To implement a ring network, one typically uses FDDI, SONET, or Token Ring technology. Ring topologies are found in
some office buildings or school campuses.

See the illustration of Ring Topology.

Bus Topology

Bus networks (not to be confused with the system bus of a computer) use a common backbone to
connect all devices. A single cable, the backbone, functions as a shared communication medium that
devices attach or tap into with an interface connector. A device wanting to communicate with another
device on the network sends a broadcast message onto the wire that all other devices see, but only the
intended recipient actually accepts and processes the message.

Ethernet bus topologies are relatively easy to install and don't require much cabling compared to the
alternatives. 10Base-2 ("ThinNet") and 10Base-5 ("ThickNet") both were popular Ethernet cabling
options many years ago for bus topologies. However, bus networks work best with a limited number of
devices. If more than a few dozen computers are added to a network bus, performance problems will
likely result. In addition, if the backbone cable fails, the entire network effectively becomes unusable.

See the illustration of Bus Network Topology.

Advantages of a Linear Bus Topology

Easy to connect a computer or peripheral to a linear bus.
Requires less cable length than a star topology.

Disadvantages of a Linear Bus Topology

Entire network shuts down if there is a break in the main cable.

Terminators are required at both ends of the backbone cable.
Difficult to identify the problem if the entire network shuts down.
Not meant to be used as a stand-alone solution in a large building.

Tree Topology

Tree topologies integrate multiple star topologies together onto a bus. In its simplest form, only hub
devices connect directly to the tree bus, and each hub functions as the "root" of a tree of devices. This
bus/star hybrid approach supports future expandability of the network much better than a bus (limited
in the number of devices due to the broadcast traffic it generates) or a star (limited by the number of
hub connection points) alone.

See the illustration of Tree Network Topology.

Advantages of a Tree Topology

Point-to-point wiring for individual segments.

Supported by several hardware and software vendors.

Disadvantages of a Tree Topology

Overall length of each segment is limited by the type of cabling used.

If the backbone line breaks, the entire segment goes down.
More difficult to configure and wire than other topologies.

Mesh Topology
Mesh topologies involve the concept of routes. Unlike each of the previous topologies, messages sent on a mesh network can take
any of several possible paths from source to destination. (Recall that even in a ring, although two cable paths exist, messages can
only travel in one direction.) Some WANs, most notably the Internet, employ mesh routing.

A mesh network in which every device connects to every other is called a full mesh. As shown in the
illustration below, partial mesh networks also exist in which some devices connect only indirectly to
others.
See the illustration of Mesh Network Topology.

Hybrid Topology

A combination of any two or more network topologies. Note 1: Instances can occur where two basic
network topologies, when connected together, can still retain the basic network character, and
therefore not be a hybrid network. For example, a tree network connected to a tree network is still a
tree network. Therefore, a hybrid network occurs only when two basic networks are connected and
the resulting network topology fails to meet one of the basic topology definitions. For example, two
star networks connected together exhibit hybrid network topologies. Note 2: A hybrid topology always
occurs when two different basic network topologies are connected.

5-4-3 Rule

A consideration in setting up a tree topology using Ethernet protocol is the 5-4-3 rule. One aspect of
the Ethernet protocol requires that a signal sent out on the network cable reach every part of the
network within a specified length of time. Each concentrator or repeater that a signal goes through
adds a small amount of time. This leads to the rule that between any two nodes on the network there
can only be a maximum of 5 segments, connected through 4 repeaters/concentrators. In addition, only
3 of the segments may be populated (trunk) segments if they are made of coaxial cable. A populated
segment is one which has one or more nodes attached to it. In Figure 4, the 5-4-3 rule is adhered to.
The furthest two nodes on the network have 4 segments and 3 repeaters/concentrators between them.

This rule does not apply to other network protocols or Ethernet networks where all fiber optic cabling
or a combination of a fiber backbone with UTP cabling is used. If there is a combination of fiber optic
backbone and UTP cabling, the rule simply becomes a 7-6-5 rule.
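
The 5-4-3 rule can be written as a simple check on the path between two nodes. In this sketch a path is described, as an assumption for illustration, by a list with one entry per segment: True if the segment is populated, False if it only links repeaters. The number of repeaters is taken to be one fewer than the number of segments.

```python
def obeys_5_4_3(path):
    """Check one node-to-node path against the 5-4-3 rule.

    path: list of booleans, one per segment, True when the segment is populated.
    """
    segments = len(path)
    repeaters = max(segments - 1, 0)  # one repeater between adjacent segments
    populated = sum(path)
    return segments <= 5 and repeaters <= 4 and populated <= 3


# Five segments, four repeaters, three populated segments: allowed.
print(obeys_5_4_3([True, False, True, False, True]))
# Six segments in a row: too many.
print(obeys_5_4_3([True, False, False, True, False, True]))
# Five segments but four of them populated: too many populated segments.
print(obeys_5_4_3([True, True, False, True, True]))
```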

Considerations When Choosing a Topology:

Money. A linear bus network may be the least expensive way to install a network; you do not have to
purchase concentrators.
Length of cable needed. The linear bus network uses shorter lengths of cable.
Future growth. With a star topology, expanding a network is easily done by adding another concentrator.
Cable type. The most common cable in schools is unshielded twisted pair, which is most often used
with star topologies.

Other definition of Network Topology

A network consists of multiple computers connected using some type of interface, each having one or
more interface devices such as a Network Interface Card (NIC) and/or a serial device for PPP
networking. Each computer is supported by network software that provides the server or client
functionality. The hardware used to transmit data across the network is called the media. It may
include copper cable, fiber optic, or wireless transmission. The standard cabling used for the purposes
of this document is 10Base-T category 5 Ethernet cable. This is twisted copper cabling which appears
at the surface to look similar to TV coaxial cable. It is terminated on each end by a connector that
looks much like a phone connector. Its maximum segment length is 100 meters.

In a server based network, there are computers set up to be primary providers of services such as
file service or mail service. The computers providing the service are called servers and the
computers that request and use the service are called client computers.

In a peer-to-peer network, various computers on the network can act both as clients and servers.
For instance, many Microsoft Windows based computers will allow file and print sharing. These
computers can act both as a client and a server and are also referred to as peers. Many networks are
combination peer-to-peer and server based networks. The network operating system uses a network
data protocol to communicate on the network to other computers. The network operating system
supports the applications on that computer. A Network Operating System (NOS) includes Windows NT,
Novell Netware, Linux, Unix and others.

LAN Protocol
Introduction http://www.inetdaemon.com/tutorials/lan/define_network.shtml

What Is a LAN?
A LAN is a high-speed data network that covers a relatively small geographic area. It typically connects workstations,
personal computers, printers, servers, and other devices. LANs offer computer users many advantages, including shared
access to devices and applications, file exchange between connected users, and communication between users via electronic
mail and other applications.

LAN Topologies
LAN topologies define the manner in which network devices are organized. Four common LAN topologies exist: bus, ring,
star, and tree. These topologies are logical architectures, but the actual devices need not be physically organized in these
configurations. Logical bus and ring topologies, for example, are commonly organized physically as a star.

Of the three most widely used LAN implementations, Ethernet/IEEE 802.3 networks (including 100BaseT) implement a
bus topology.

A bus topology is a linear LAN architecture in which transmissions from network stations propagate the length of the
medium and are received by all other stations.

Figure 2-3 Some Networks Implement a Local Bus Topology

A ring topology is a LAN architecture that consists of a series of devices connected to one another by unidirectional
transmission links to form a single closed loop. Both Token Ring/IEEE 802.5 and FDDI networks implement a ring topology.
Figure 2-4 depicts a logical ring topology.

Figure 2-4 Some Networks Implement a Logical Ring Topology

A star topology is a LAN architecture in which the endpoints on a network are connected to a common central hub, or
switch, by dedicated links. Logical bus and ring topologies are often implemented physically in a star topology, which is
illustrated in Figure 2-5.

A tree topology is a LAN architecture that is identical to the bus topology, except that branches with multiple nodes are
possible in this case. Figure 2-5 illustrates a logical tree topology.
Figure 2-5 A Logical Tree Topology Can Contain Multiple Nodes

LAN Protocols and the OSI Reference Model

LAN protocols function at the lowest two layers of the OSI reference model, as discussed in Chapter 1, "Internetworking
Basics," between the physical layer and the data link layer. Figure 2-2 illustrates how several popular LAN protocols map to
the OSI reference model.

Figure 2-2 Popular LAN Protocols Mapped to the OSI Reference Model

LAN Media-Access Methods

Media contention occurs when two or more network devices have data to send at the same time. Because multiple devices
cannot talk on the network simultaneously, some type of method must be used to allow one device access to the network
media at a time. This is done in two main ways: carrier sense multiple access collision detect (CSMA/CD) and token
passing.

In networks using CSMA/CD technology such as Ethernet, network devices contend for the network media. When a device
has data to send, it first listens to see if any other device is currently using the network. If not, it starts sending its data.
After finishing its transmission, it listens again to see if a collision occurred. A collision occurs when two devices send
data simultaneously. When a collision happens, each device waits a random length of time before resending its data. In
most cases, a collision will not occur again between the two devices. Because of this type of network contention, the busier
a network becomes, the more collisions occur. This is why performance of Ethernet degrades rapidly as the number of
devices on a single network increases.
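
The effect of contention can be seen in a toy simulation. The sketch below makes several simplifying assumptions that are not in the text above: time is divided into slots, every frame takes exactly one slot, every station has one frame ready at the start, and the random wait after a collision follows the binary exponential backoff used by Ethernet (a random number of slots below 2^k after the k-th collision, with k capped at 10).

```python
import random


def simulate_csma_cd(num_stations, seed=0, max_slots=10_000):
    """Return (slots_used, collisions) until every station has sent one frame."""
    rng = random.Random(seed)
    wait = [0] * num_stations      # slots each station must still wait
    attempts = [0] * num_stations  # collisions each station has suffered
    pending = set(range(num_stations))
    slots = 0
    collisions = 0
    while pending and slots < max_slots:
        slots += 1
        ready = [s for s in pending if wait[s] == 0]
        for s in pending:
            if wait[s] > 0:
                wait[s] -= 1
        if len(ready) == 1:
            # Exactly one sender: the frame gets through.
            pending.remove(ready[0])
        elif len(ready) > 1:
            # Two or more senders in the same slot: collision, all back off.
            collisions += 1
            for s in ready:
                attempts[s] += 1
                wait[s] = rng.randrange(2 ** min(attempts[s], 10))
    return slots, collisions


for n in (1, 5, 20, 50):
    print(n, simulate_csma_cd(n, seed=1))
```

Running it for growing numbers of stations shows the number of collisions rising with the number of contending devices.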

In token-passing networks such as Token Ring and FDDI, a special network frame called a token is passed around the
network from device to device. When a device has data to send, it must wait until it has the token and then sends its data.
When the data transmission is complete, the token is released so that other devices may use the network media. The main
advantage of token-passing networks is that they are deterministic. In other words, it is easy to calculate the maximum
time that will pass before a device has the opportunity to send data. This explains the popularity of token-passing networks
in some real-time environments such as factories, where machinery must be capable of communicating at a determinable
interval.
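
The deterministic bound mentioned above can be sketched as a one-line calculation. The parameter names are assumptions for illustration: each station may hold the token for at most max_hold_time, and handing the token to the next station takes token_pass_time. In the worst case a station has just released the token, so every other station gets a full turn before the token returns.

```python
def max_token_wait(num_stations, max_hold_time, token_pass_time):
    """Upper bound on how long a station waits before it may send again."""
    other_stations = num_stations - 1
    # Every other station holds the token once; the token is passed
    # num_stations times before it is back at the waiting station.
    return other_stations * max_hold_time + num_stations * token_pass_time


# Times in microseconds: 10 stations, 10 ms maximum hold, 1 ms per hand-over.
print(max_token_wait(10, 10_000, 1_000))
```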

For CSMA/CD networks, switches segment the network into multiple collision domains. This reduces the number of
devices per network segment that must contend for the media. By creating smaller collision domains, the performance of a
network can be increased significantly without requiring addressing changes.

Normally CSMA/CD networks are half-duplex, meaning that while a device sends information, it cannot receive at the same
time. While that device is talking, it is incapable of also listening for other traffic. This is much like a walkie-talkie. When
one person wants to talk, he presses the transmit button and begins speaking. While he is talking, no one else on the same
frequency can talk. When the sending person is finished, he releases the transmit button and the frequency is available to
others.

When switches are introduced, full-duplex operation is possible. Full-duplex works much like a telephone—you can listen
as well as talk at the same time. When a network device is attached directly to the port of a network switch, the two
devices may be capable of operating in full-duplex mode. In full-duplex mode, performance can be increased, but
not quite as much as some like to claim. A 100-Mbps Ethernet segment is capable of transmitting 200 Mbps of data, but
only 100 Mbps can travel in one direction at a time. Because most data connections are asymmetric (with more data
traveling in one direction than the other), the gain is not as great as many claim. However, full-duplex operation does
increase the throughput of most applications because the network media is no longer shared. Two devices on a full-duplex
connection can send data as soon as it is ready.
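
A rough calculation shows why the gain depends on how symmetric the traffic is. This sketch assumes an idealised link with no collisions or protocol overhead; the function and parameter names are made up for illustration.

```python
def transfer_seconds(megabits_out, megabits_in, rate_mbps, full_duplex):
    """Idealised time to move the given traffic over one link."""
    if full_duplex:
        # Both directions run at the same time; the busier one dominates.
        return max(megabits_out, megabits_in) / rate_mbps
    # Half-duplex: the two directions have to take turns on the medium.
    return (megabits_out + megabits_in) / rate_mbps


# Asymmetric traffic on a 100-Mbps link: only a small gain.
print(transfer_seconds(900, 100, 100, full_duplex=False))
print(transfer_seconds(900, 100, 100, full_duplex=True))
# Perfectly symmetric traffic: the time is halved.
print(transfer_seconds(500, 500, 100, full_duplex=True))
```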

Token-passing networks such as Token Ring can also benefit from network switches. In large networks, the delay between
turns to transmit may be significant because the token is passed around the network.

LAN Transmission Methods

LAN data transmissions fall into three classifications: unicast, multicast, and broadcast.
In each type of transmission, a single packet is sent to one or more nodes.

In a unicast transmission, a single packet is sent from the source to a destination on a network. First, the source node
addresses the packet by using the address of the destination node. The packet is then sent onto the network, and finally,
the network passes the packet to its destination.

A multicast transmission consists of a single data packet that is copied and sent to a specific subset of nodes on the
network. First, the source node addresses the packet by using a multicast address. The packet is then sent into the network,
which makes copies of the packet and sends a copy to each node that is part of the multicast address.

A broadcast transmission consists of a single data packet that is copied and sent to all nodes on the network. In these
types of transmissions, the source node addresses the packet by using the broadcast address. The packet is then sent on to
the network, which makes copies of the packet and sends a copy to every node on the network.
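
The three transmission types differ only in which nodes end up with a copy of the packet. A minimal sketch, assuming made-up node names, a made-up multicast group table and "*" as a stand-in for the broadcast address:

```python
BROADCAST = "*"  # stand-in for the broadcast address (an assumption)


def recipients(destination, nodes, groups):
    """Return the set of nodes that receive a packet sent to destination."""
    if destination == BROADCAST:
        return set(nodes)                         # broadcast: every node
    if destination in groups:
        return set(groups[destination]) & set(nodes)  # multicast: group members
    if destination in nodes:
        return {destination}                      # unicast: exactly one node
    return set()                                  # unknown address: nobody


nodes = {"a", "b", "c", "d"}
groups = {"video": {"b", "d"}}
print(recipients("c", nodes, groups))
print(sorted(recipients("video", nodes, groups)))
print(sorted(recipients(BROADCAST, nodes, groups)))
```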

"LAN Protocols," address specific protocols in more detail. Figure 2-1 illustrates the basic layout of
these three implementations. Figure 2-1 Three LAN Implementations Are Used Most Commonly
Types of Flip-Flop Circuits
"Flip-flop" is the common name given to two-state devices which offer basic memory for
sequential logic operations. Flip-flops are heavily used for digital data storage and transfer and
are commonly used in banks called "registers" for the storage of binary numerical data.

The set/reset type flip-flop is triggered to a high
state at Q by the "set" signal and holds that value
until reset to low by a signal at the Reset input. This
can be implemented as a NAND gate latch or a
NOR gate latch and as a clocked version.
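One disadvantage of the S/R flip-flop is that one input combination gives ambiguous results and must be
avoided: S=R=0 for the NAND-gate latch (whose inputs are active-low) and S=R=1 for the NOR-gate latch.
The J-K flip-flop gets around that problem.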

NAND-gate Latch
NOR-gate Latch

The time sequence at right shows the conditions under which the set and reset inputs cause a state change,
and when they don't.
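
The set, hold and reset behaviour can be reproduced with a small gate-level model. This is a sketch of the NOR-gate version of the latch, for which the input combination to avoid is S=R=1 (for the NAND-gate version, with active-low inputs, it is S=R=0). The loop that lets the feedback settle is a modelling shortcut, not a description of real circuit timing.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q):
    """Settle a cross-coupled NOR latch; return the outputs (Q, not-Q)."""
    q_bar = nor(s, q)
    for _ in range(4):  # a few passes are enough for the feedback to settle
        q = nor(r, q_bar)
        q_bar = nor(s, q)
    return q, q_bar


q = 0
for s, r in [(1, 0), (0, 0), (0, 1), (0, 0)]:  # set, hold, reset, hold
    q, q_bar = sr_latch(s, r, q)
    print(s, r, "->", q, q_bar)

# The forbidden input: both outputs go low, so they are no longer complementary.
print(sr_latch(1, 1, 0))
```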

J-K Flip-Flop

The J-K flip-flop is the most versatile of the basic flip-flops. It has the input-following
character of the clocked D flip-flop but has two inputs, traditionally labeled J and K. If J
and K are different then the output Q takes the value of J at the next clock edge.

If J and K are both low then no change occurs. If J and K are both high at the clock edge then the output will
toggle from one state to the other. It can perform the functions of the set/reset flip-flop and has the
advantage that there are no ambiguous states. It can also act as a T flip-flop to accomplish toggling action if
J and K are tied together. This toggle application finds extensive use in binary counters.
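
The behaviour described above amounts to a small next-state function, evaluated once per active clock edge. A minimal sketch (the function name is an assumption):

```python
def jk_next(j, k, q):
    """Next value of Q at the clock edge, given J, K and the current Q."""
    if j and k:
        return 1 - q  # both high: toggle
    if j:
        return 1      # J=1, K=0: set
    if k:
        return 0      # J=0, K=1: reset
    return q          # both low: no change


# Tying J and K together high gives the toggle behaviour of a T flip-flop.
q = 0
states = []
for _ in range(4):
    q = jk_next(1, 1, q)
    states.append(q)
print(states)
```

The same function agrees with the usual characteristic equation Q(next) = J·(not Q) + (not K)·Q for all eight input combinations.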
Switching Example: J-K Flip-Flop

The positive going transition (PGT) of the clock enables the switching of the
output Q. The "enable" condition does not persist through the entire positive
phase of the clock. The J & K inputs alone cannot cause a transition, but their
values at the time of the PGT determine the output according to the truth table.

The D Flip-Flop

The D flip-flop tracks the input, making transitions which match those of the input D. The
D stands for "data"; this flip-flop stores the value that is on the data line. It can be
thought of as a basic memory cell. A D flip-flop can be made from a set/reset flip-flop
by tying the set to the reset through an inverter. The result may be clocked.
D Flip-Flop from NAND Latch
The output Q will track the input D so long as the flip-flop remains enabled.

Clocked D Flip-Flop
A D flip-flop constructed from a NAND-latch.

Output Example

The T Flip-Flop

The T or "toggle" flip-flop changes its output on each clock edge, giving an output which is half the
frequency of the signal to the T input.

It is useful for constructing binary counters, frequency dividers, and general binary addition devices. It can
be made from a J-K flip-flop by tying both of its inputs high.
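
The frequency-halving effect is easy to see on a sampled waveform. The sketch below assumes a flip-flop that toggles on each rising edge of its input and represents signals as lists of 0/1 samples; both choices are illustrative assumptions.

```python
def divide_by_two(clock):
    """Toggle Q on every rising edge of clock; return Q sampled alongside it."""
    q = 0
    previous = 0
    output = []
    for level in clock:
        if previous == 0 and level == 1:  # rising edge detected
            q ^= 1
        output.append(q)
        previous = level
    return output


clock = [0, 1] * 4
half = divide_by_two(clock)
print(clock)  # one full cycle every 2 samples
print(half)   # one full cycle every 4 samples
```

Feeding the output of one stage into another such stage halves the frequency again, which is the idea behind a chain of flip-flops acting as a binary counter or frequency divider.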


The basic building blocks of a computer are called logical gates or just gates.
Gates are basic circuits that have at least one (and usually more) input and exactly one output. Input and output values are
the logical values true and false. In computer architecture it is common to use 0 for false and 1 for true. Gates
have no memory. The value of the output depends only on the current value of the inputs. This fact makes it
possible to use a truth table to fully describe the behavior of a gate.

We usually consider three basic kinds of gates, and-gates, or-gates, and not-gates (or inverters).

Basic gates
The and-gate
An and-gate has an arbitrary number of inputs. The output value is 1 if and only if all of the inputs are 1.
Otherwise the output value is 0. The name has been chosen because the output is 1 if and only if the first input
and the second input, and, ..., and the nth input are all 1.

It is often useful to draw diagrams of gates and their interconnections. In such diagrams, the and-gate is drawn like this:

The truth table for an and-gate with two inputs looks like this:
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

The or-gate
Like the and-gate, the or-gate can have an arbitrary number of inputs. The output value is 1 if and only if at least
one of the input values is 1. Otherwise the output is 0. In other words, the output value is 0 only if all inputs
are 0. The name has been chosen because the output is 1 if and only if the first input or the second input, or, ...,
or the nth input is 1.

In circuit diagrams, we draw the or-gate like this:

The truth table for an or-gate with two inputs looks like this:
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1

The inverter
An inverter has exactly one input and one output. The value of the output is 1 if and only if the input is 0.
Otherwise, the output is 0. In other words, the value of the output is the exact opposite of the value of the input.
In circuit diagrams, we draw the inverter like this:

The truth table for an inverter looks like this:
0 | 1
1 | 0


Combined gates
Sometimes, it is practical to combine functions of the basic gates into more complex gates, for instance in order to save space in circuit
diagrams. In this section, we show some such combined gates together with their truth tables.

The nand-gate
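The nand-gate is an and-gate with an inverter on the output. So instead of drawing several gates like this:

We draw a single and-gate with a little ring on the output like this:
The nand-gate, like the and-gate, can take an arbitrary number of inputs. The
truth table for the nand-gate is like the one for the and-gate, except that all output values have been inverted:
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 0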


The nor-gate
The nor-gate is an or-gate with an inverter on the output. So instead of drawing several gates like this:

We draw a single or-gate with a little ring on the output like this:
The nor-gate, like the or-gate, can take an arbitrary number of inputs. The
truth table for the nor-gate is like the one for the or-gate, except that all output values have been inverted:
0 0 | 1
0 1 | 0
1 0 | 0
1 1 | 0

The exclusive-or-gate
The exclusive-or-gate is similar to an or-gate. It can have an arbitrary number of inputs, and its output value is 1
if and only if exactly one input is 1 (and thus the others 0). Otherwise, the output is 0.

We draw an exclusive-or-gate like this:

The truth table for an exclusive-or-gate with two inputs looks like this:
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0
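
As a cross-check, the truth tables of the two-input gates described in this section can be generated from their definitions. The sketch below uses Python's integer bit operators on the values 0 and 1; the dictionary and function names are assumptions for illustration.

```python
from itertools import product

# Two-input versions of the gates, defined on the values 0 and 1.
GATES = {
    "and": lambda x, y: x & y,
    "or": lambda x, y: x | y,
    "nand": lambda x, y: 1 - (x & y),
    "nor": lambda x, y: 1 - (x | y),
    "xor": lambda x, y: x ^ y,
}


def truth_table(gate):
    """Rows (x, y, output) in the order 00, 01, 10, 11."""
    return [(x, y, gate(x, y)) for x, y in product((0, 1), repeat=2)]


for name, gate in GATES.items():
    print(name)
    for x, y, z in truth_table(gate):
        print(f"{x} {y} | {z}")
```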

So, how many different types of gates are there?

A valid question at this point is how many different kinds of gates there are, and what they are called.

Let us limit ourselves to gates with n inputs. The truth tables for such gates have 2^n lines. Such a gate is completely
defined by the output column in the truth table. The output column can be viewed as a string of 2^n binary digits.
How many different strings of binary digits of length 2^n are there? The answer is 2^(2^n), since there are 2^k
different strings of k binary digits, and if k = 2^n, then there are 2^(2^n) such strings. In particular, if n = 2, we can see
that there are 16 different types of gates with 2 inputs.

Most of these gates do not have any names, and indeed, most of them are quite useless. For completeness, let us look at all 16 and
study the functions they compute. Each entry in the following table is specified by the output string:

0000 A gate that ignores both its inputs and always gives 0 on the output. This gate does not require any
circuits. Just let the inputs hang and connect the output to a 0.
0001 This is the and-gate described above.
0010 This is like an and-gate with an inverter on the second input.
0011 This gate ignores its second input, and gives as output the value of its first input. It does not require any
circuits. Just connect the output to the first input and let the second input hang.
0100 This is like an and-gate with an inverter on the first input.
0101 This gate ignores its first input, and gives as output the value of its second input. It does not require any
circuits. Just connect the output to the second input and let the first input hang.
0110 This is the exclusive-or-gate described above.
0111 This is the or-gate described above.
1000 This is the nor-gate described above.
1001 This is like an exclusive-or-gate with an inverter on its output.
1010 This gate can be built with an inverter on the second input, and with the first input hanging.
1011 This is like an or-gate with an inverter on its second input.
1100 This gate can be built with an inverter on the first input, and with the second input hanging.
1101 This is like an or-gate with an inverter on its first input.
1110 This is the nand-gate described above.
1111 This is a gate that ignores both its inputs and always gives a 1 as output. This gate does not require any
circuits. Just let the inputs hang and connect the output to a 1.
As you can see, most of the possible gates are quite useless.
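
The counting argument and the table of 16 gates can be reproduced mechanically. In this sketch each gate is identified by its output string, read in the input order 00, 01, 10, 11 as in the table above; the function names are assumptions.

```python
from itertools import product


def count_gates(n):
    """Number of distinct gates with n inputs: 2 to the power (2 to the power n)."""
    return 2 ** (2 ** n)


def all_two_input_gates():
    """Map each 4-character output string to its truth table."""
    rows = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)
    gates = {}
    for outputs in product((0, 1), repeat=4):
        name = "".join(str(bit) for bit in outputs)
        gates[name] = dict(zip(rows, outputs))
    return gates


gates = all_two_input_gates()
print(count_gates(2), len(gates))
print(gates["0110"])  # the exclusive-or-gate
print(count_gates(3))  # number of different three-input gates
```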

Doing it all with only one kind of gate

As it turns out, it is possible to build any kind of gate using only nand-gates. To see this, first observe that an inverter is
just a nand-gate with only one input. Second, an and-gate can be built as a nand with an inverter on its output.
Last, an or-gate can be built with a nand-gate with an inverter on each input.

In some circuit technology, it is actually easier to build a nand-gate than any of the other gates. In that case, the nand-gate
is considered the most primitive building block, and everything else is built from it.

Similarly, all gates can be realized from only nor-gates. Again, an inverter is just a nor-gate with only one input. An
or-gate is a nor-gate with an inverter on its output, and an and-gate is just a nor-gate with an inverter on each
input.
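The constructions described above can be checked mechanically. The following Python sketch is only an illustration: it models the "one-input" nand-gate or nor-gate by feeding the same signal to both inputs, which is an assumption about how such a gate would be wired, and all function names are chosen here for the example.

```python
def nand(a, b):
    """The primitive building block: output is 0 only when both inputs are 1."""
    return 1 - (a & b)

def inverter(a):
    # A nand-gate with "only one input": the same signal on both inputs.
    return nand(a, a)

def and_gate(a, b):
    # A nand-gate with an inverter on its output.
    return inverter(nand(a, b))

def or_gate(a, b):
    # A nand-gate with an inverter on each input.
    return nand(inverter(a), inverter(b))

def nor(a, b):
    """The alternative primitive: output is 1 only when both inputs are 0."""
    return 1 - (a | b)

def inverter_from_nor(a):
    return nor(a, a)

def or_from_nor(a, b):
    # A nor-gate with an inverter on its output.
    return inverter_from_nor(nor(a, b))

def and_from_nor(a, b):
    # A nor-gate with an inverter on each input.
    return nor(inverter_from_nor(a), inverter_from_nor(b))

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and_gate(a, b), or_gate(a, b),
                  and_from_nor(a, b), or_from_nor(a, b))
```

Comparing each constructed gate against Python's built-in `&` and `|` operators over all four input combinations confirms that the nand-only and nor-only versions behave identically to ordinary and-gates and or-gates.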
Client-Server Technology

Client/Server technology is a means for separating the functions of an application into two or more distinct parts. The
client presents and manipulates data on the desktop computer. The server acts like a mainframe to store and retrieve
protected data. Each machine can then perform the duties it is best at.

Overview of the Client-Server Technology Complex

Here's a little map of Client-Server technologies. In this map, hardware is represented in blue. Software is represented in red. The
client computer (the hardware sitting perhaps on your desktop) communicates across the Internet (by modem or ethernet) with a
server computer (usually at a remote site).

The client software on the client computer is a web browser or a telnet program. It contacts software called servers on the server
computer. These servers are usually daemons (a name for software that is always running). The client software and the server
software communicate by means of a protocol (an established pattern of behaviors and responses).

So for example, a web browser may contact a web server using the HTTP protocol. The browser sends a request for a particular
web page. The web server (an http daemon) logs the request and then finds the correct file and sends it back across the Internet to
the browser. The browser receives the file and interprets it for the user by supplying images, format, color, and so on.
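The request and response exchange described above can be sketched with Python's standard library. This is a minimal illustration, not a production server: the handler name, the page content, the `/index.html` path and the use of a local address with an automatically chosen port are all assumptions made for the example.

```python
import threading
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer

request_log = []  # the server's record of the requests it has handled

class PageHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: logs each request and sends back a page."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the log in memory instead of printing it to stderr.
        request_log.append(format % args)

def fetch_page(path="/index.html"):
    """Start a server, act as the client for one request, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", path)          # the client initiates the request
        reply = client.getresponse()         # ...and waits for the reply
        result = (reply.status, reply.read().decode())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    status, page = fetch_page()
    print(status, page)
    print(request_log)
```

Here the client and server run in the same program only for convenience; in practice they would be separate machines talking across the network with the same protocol.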

To spice things up, Helper Applications run on the client's side to help show files that require special handling, such as movies,
compressed files, or Acrobat (.pdf) files. Plug-Ins are software that actually runs inside the web browser to display special files.

On the server side, Common Gateway Interface (CGI) programs can access various files and other programs on the server
computer and send dynamically created web pages back across the Internet. For example, this is the way search engines work:
each search creates a new page that doesn't already exist as a file.

Client-server is a computing architecture which separates a client from a server, and is almost always
implemented over a computer network. Each client or server connected to a network can also be referred to
as a node. The most basic type of client-server architecture employs only two types of nodes: clients and
servers. This type of architecture is sometimes referred to as two-tier. It allows devices to share files and resources.
Each instance of the client software can send data requests to one or more connected servers. In turn, the servers can
accept these requests, process them, and return the requested information to the client. Although this concept
can be applied for a variety of reasons to many different kinds of applications, the architecture remains
fundamentally the same.
These days, clients are most often web browsers, although that has not always been the case. Servers typically include web
servers, database servers and mail servers. Online gaming is usually client-server too. In the specific case of MMORPGs, the
servers are typically operated by the company selling the game; for other games, one of the players acts as the host by setting
their game to server mode.
The interaction between client and server is often described using sequence diagrams. Sequence diagrams are standardized in the
Unified Modeling Language.

Characteristics of a client
Initiates requests
Waits for and receives replies
Usually connects to a small number of servers at one time
Typically interacts directly with end-users using a graphical user interface

Characteristics of a server
Passive (slave)
Waits for requests from clients
Upon receipt of requests, processes them and then serves replies
Usually accepts connections from a large number of clients
Typically does not interact directly with end-users

Multi-tiered architecture
Some designs are more sophisticated and consist of three different kinds of nodes: clients, application servers which process data
for the clients, and database servers which store data for the application servers. This configuration is called a three-tier
architecture, and is the most commonly used type of client-server architecture. Designs that contain more than two tiers are
referred to as multi-tiered or n-tiered.

The advantage of n-tiered architectures is that they are far more scalable, since they balance and distribute the
processing load among multiple, often redundant, specialized server nodes. This in turn improves overall
system performance and reliability, since more of the processing load can be accommodated simultaneously.

The disadvantages of n-tiered architectures include:

1. More load on the network itself, due to a greater amount of network traffic.
2. More difficult to program and test than in two-tier architectures because more devices have to communicate in order to
complete a client's request.

Comparison to Peer-to-Peer Architecture

Another type of network architecture is known as peer-to-peer, because each node or instance of the program can simultaneously
act as both a client and a server, and because each has equivalent responsibilities and status. Peer-to-peer architectures are often
abbreviated using the acronym P2P.

Both client-server and P2P architectures are in wide usage today.

Comparison to Client-Queue-Client Architecture

While classic client-server architecture requires one of the communication endpoints to act as a server, which is much harder to
implement, Client-Queue-Client allows all endpoints to be simple clients, while the server consists of some external software
that also acts as a passive queue (one software instance passes its query to another instance via a queue, e.g. a database, and then this
other instance pulls it from the database, makes a response, passes it back to the database, and so on). This architecture allows a greatly simplified
software implementation. Peer-to-peer architecture was originally based on the Client-Queue-Client concept.

In most cases, a client-server architecture enables the roles and responsibilities of a computing system to be distributed among
several independent computers that are known to each other only through a network. This creates an additional advantage to this
architecture: greater ease of maintenance. For example, it is possible to replace, repair, upgrade, or even relocate a server while its
clients remain both unaware and unaffected by that change. This independence from change is also referred to as encapsulation.

All the data are stored on the servers, which generally have far greater security controls than most clients. Servers can better
control access and resources, to guarantee that only those clients with the appropriate permissions may access and change data.
Since data storage is centralized, updates to those data are far easier to administer than would be possible under a P2P paradigm.
Under a P2P architecture, data updates may need to be distributed and applied to each "peer" in the network, which is both time-
consuming and error-prone, as there can be thousands or even millions of peers.
Many mature client-server technologies are already available which were designed to ensure security, 'friendliness' of the user
interface, and ease of use.
It functions with multiple different clients of different capabilities.

Traffic congestion on the network has been an issue since the inception of the client-server paradigm. As the number of
simultaneous client requests to a given server increases, the server can become severely overloaded. Contrast that to a P2P
network, where its bandwidth actually increases as more nodes are added, since the P2P network's overall bandwidth can be
roughly computed as the sum of the bandwidths of every node in that network.
The client-server paradigm lacks the robustness of a good P2P network. Under client-server, should a critical server fail, clients’
requests cannot be fulfilled. In P2P networks, resources are usually distributed among many nodes. Even if one or more nodes
depart and abandon a downloading file, for example, the remaining nodes should still have the data needed to complete the

Imagine you are visiting an eCommerce web site. In this case, your computer and web browser would be considered the client,
while the computers, databases, and applications that make up the online store would be considered the
server. When your web browser requests specific information from the online store, the server finds all of
the data in the database needed to satisfy the browser's request, assembles that data into a web page, and
transmits that page back to your web browser for you to view.
Specific types of clients include web browsers, email clients, and online chat clients.

Specific types of servers include web servers, application servers, database servers, mail servers, file servers, print servers, and
terminal servers. Most web services are also types of servers.

What is Spooling?

Acronym for simultaneous peripheral operations on-line, spooling refers to putting jobs in a buffer, a
special area in memory or on a disk where a device can access them when it is ready. Spooling is useful
because devices access data at different rates. The buffer provides a waiting station where data can rest
while the slower device catches up.
The most common spooling application is print spooling. In print spooling, documents are loaded into a
buffer (usually an area on a disk), and then the printer pulls them off the buffer at its own rate. Because the
documents are in a buffer where they can be accessed by the printer, you can perform other operations on
the computer while the printing takes place in the background.

Spooling also lets you place a number of print jobs on a queue instead of waiting for each one to finish
before specifying the next one.
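A print spooler of the kind described above can be sketched as a simple first-in, first-out queue. The class and method names below are hypothetical, and a real spooler would keep its buffer on disk and run the printer side in the background; this sketch keeps everything in memory to stay short.

```python
from collections import deque

class PrintSpooler:
    """Hypothetical spooler: jobs wait in a buffer until the printer is ready."""

    def __init__(self):
        self._buffer = deque()  # stands in for the spool area on disk

    def submit(self, owner, document):
        """Queue a job and return at once, so the caller can keep working."""
        self._buffer.append((owner, document))
        return len(self._buffer)  # position of the job in the queue

    def print_next(self):
        """Called by the printer, at its own rate, whenever it is ready."""
        if not self._buffer:
            return None
        owner, document = self._buffer.popleft()
        return f"[{owner}] {document}"

    def drain(self):
        """Print every waiting job in the order it was submitted."""
        printed = []
        while self._buffer:
            printed.append(self.print_next())
        return printed

if __name__ == "__main__":
    spooler = PrintSpooler()
    spooler.submit("alice", "report.txt")
    spooler.submit("bob", "invoice.txt")
    print(spooler.drain())
```

Because each document is queued whole, jobs from different submitters come out one after another rather than interleaved.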


Simultaneous peripheral operations on-line (spooling) is a device management technique used by
operating systems in multiuser and other time-sharing systems.

In spooling, all the work a device is supposed to do is saved in a special memory area. The device then
performs the job requests sequentially at high speed.

One example of spooling is the use of printers in shared systems. All data to be printed by different
users is saved in memory (on disk), and the printer then prints all of it sequentially at high speed.

Electronic Data Interchange (EDI)
Electronic Data Interchange (EDI) is 'the exchange of documents in standardised electronic form,
between organisations, in an automated manner, directly from a computer application in one
organisation to an application in another'

Benefits of EDI
EDI saves unnecessary re-capture of data. This leads to faster transfer of data, far fewer errors,
less time wasted on exception-handling, and hence a more streamlined business process.

Benefits can be achieved in such areas as inventory management, transport and distribution,
administration and cash management. EDI offers the prospect of easy and cheap communication of
structured information throughout the government community, and between government agencies and
their suppliers and clients.

EDI can be used to automate existing processes. In addition, the opportunity can be taken to rationalise
procedures, and thereby reduce costs, and improve the speed and quality of services.

Because EDI necessarily involves business partners, it can be used as a catalyst for gaining efficiencies
across organisational boundaries. This strategic potential inherent in EDI is expected to be, in the
medium term, even more significant than the short-term cost, speed and quality benefits.


EDI (Electronic Data Interchange) is a standard format for exchanging business data. The
standard is ANSI X12 and it was developed by the Data Interchange Standards Association. ANSI X12 is
either closely coordinated with or is being merged with an international standard, EDIFACT.

An EDI message contains a string of data elements, each of which represents a singular fact, such as a price,
product model number, and so forth, separated by a delimiter. The entire string is called a data segment. One
or more data segments framed by a header and trailer form a transaction set, which is the EDI unit of
transmission (equivalent to a message). A transaction set often consists of what would usually be contained
in a typical business document or form. The parties who exchange EDI transmissions are referred to as
trading partners.
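The structure described above (data elements joined by a delimiter into segments, and segments framed by a header and trailer into a transaction set) can be illustrated with a small parser. This is a simplified sketch, not an X12 implementation: the `*` element separator and `~` segment terminator are assumed here, ST and SE are used as the header and trailer segment names, and the sample contents (price, model number, control numbers) are invented for illustration.

```python
ELEMENT_SEPARATOR = "*"   # assumed delimiter between data elements
SEGMENT_TERMINATOR = "~"  # assumed marker that ends each data segment

def parse_transaction_set(raw):
    """Split a raw transmission into segments, each a list of data elements."""
    segments = [
        chunk.strip().split(ELEMENT_SEPARATOR)
        for chunk in raw.split(SEGMENT_TERMINATOR)
        if chunk.strip()
    ]
    # A transaction set must be framed by a header and a trailer segment.
    if not segments or segments[0][0] != "ST" or segments[-1][0] != "SE":
        raise ValueError("transaction set must start with ST and end with SE")
    return {
        "header": segments[0],
        "body": segments[1:-1],
        "trailer": segments[-1],
    }

# Invented sample: one line item with a quantity, unit, price and model number.
SAMPLE = "ST*850*0001~PO1*1*10*EA*9.95**VP*MODEL-42~SE*3*0001~"

if __name__ == "__main__":
    parsed = parse_transaction_set(SAMPLE)
    for segment in parsed["body"]:
        print(segment[0], segment[1:])
```

Each element's meaning comes from its position within the segment, which is why empty positions (the two adjacent separators in the sample) are kept rather than dropped.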

EDI messages can be encrypted. EDI is one form of e-commerce, which also includes e-mail and fax.
Operating system

An operating system is software, or a set of programs, that mediates access between physical devices
and application programs. Some of its characteristics are multi-tasking, multi-processing, multi-user,
protected mode, support for graphics, and built-in support for networks.

An operating system is composed of different functions, namely:

1.) Processor Management
2.) Memory Management
3.) Housekeeping
4.) User Interface
5.) Storage Management
6.) Device Management
7.) Job Sequencing
8.) Job Control
9.) Error Handling
10.) I/O Handling
11.) Interrupt Handling
12.) Scheduling
13.) Resource Control
14.) Protection
Some examples of operating systems are UNIX, Mach, MS-DOS, MS-Windows, Windows/NT, Chicago, OS/2, MacOS, VMS,
MVS, and VM.


A Computer System is made up of Hardware, Operating System and user interface. Computer Software can
be divided into System programs which manage the operation of the computer itself and the application
programs, which solve problems for their users.

The operating system is the most fundamental of all the system programs. It controls all of the
computer's resources and provides the base upon which the application programs can be written.
The operating system has been defined in different ways by different people. One definition is that an
operating system is a program that makes the computing power available to users by controlling the hardware.
Another is that an operating system is a program that controls the execution of application programs
and masks the details of the hardware from them.

An operating system can be summarily defined as a set of processes, permanently or transiently resident within
the computer, that makes the resources of the computer system available to the user in a consistent, reliable,
friendly way. In essence it should be a Resource optimiser and operation

An operating system can be divided into the kernel and the shell. The kernel is the essential centre
of a computer operating system, the core that provides basic services for all other parts of the operating
system. A synonym is nucleus. A kernel can be contrasted with a shell, the outermost part of an operating
system that interacts with user commands. Kernel and shell are terms used more frequently in UNIX and
some other operating systems than in IBM mainframe systems.


Single User
A single-user system allows only one user to log in at a time. There is no user account database, which makes the level
of security low, so users cannot protect their files from being viewed, copied or deleted. Examples of this
type are DOS and Windows 98.


A multi-user system has a user account database which states the rights that users have on certain resources. Such systems
are more secure than single-user systems, since access is limited. An example is UNIX.

Networked/Stand-alone
Stand-alone systems are usually not connected to a network and thus cannot access networked resources. They are
usually more secure than networked systems, as remote users cannot log into the computer. A network operating system uses a
standard communication protocol: for UNIX networks and over the Internet this is TCP/IP; for Novell
NetWare it is Internetwork Packet Exchange/Sequenced Packet Exchange (IPX/SPX). Networked systems are less secure than
stand-alone ones and should be protected (most times by creating user accounts). More examples are Windows NT
5.0, Windows 98,


Multitasking allows more than one program to run at a time. Each process is given a prioritised amount of time
on the processor. Single-tasking allows only one program to run at a time. It is usually faster than a multitasking
system, as some time is taken in switching between tasks. But multitasking is more efficient, as it allows
other tasks to run when a task is not performing any operation. Single-tasking systems are more robust, as multitasking
programs are required to communicate with each other, which can cause synchronization problems (such as deadlock).
Multi Processor/Single Processor: Some operating systems allow more than one processor to be used on the
system. This allows more than one task to be run at a time, on different processors. Windows NT/2000
supports multiple processors (up to 4).

With a single processor, multitasking involves running each of the processes for a given time slice on that
processor, whereas with multiple processors they can all run at the same time.
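Time slicing on a single processor can be sketched as a round-robin loop: each process runs for at most one time slice, then goes to the back of the queue if it still has work left. The function name, process names and running times below are hypothetical, and the sketch ignores priorities and the cost of switching between tasks.

```python
from collections import deque

def round_robin(burst_times, time_slice):
    """Simulate one processor sharing its time among several processes.

    burst_times maps a process name to the total time it needs.
    Returns the order of execution as (process, time_run) pairs.
    """
    ready = deque(burst_times.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)      # run for one slice at most
        timeline.append((name, ran))
        if remaining > ran:                   # unfinished: back of the queue
            ready.append((name, remaining - ran))
    return timeline

if __name__ == "__main__":
    for name, ran in round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2):
        print(f"{name} runs for {ran}")
```

With several processors, the same ready queue could feed more than one processor at once, so several entries of the timeline would run side by side instead of one after another.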


Usually the operating system is there to manage all the pieces of a complex system.
Imagine what will happen if three programs running on a computer all try to print their output simultaneously
on the same printer. The first few lines might be from program one, the next few lines from
program two, and so on, resulting in chaos. The operating system can bring order to this potential
chaos by buffering all the output destined for the printer on the disk.

The primary task of the operating system is to keep track of who is using which resource, to grant resource
requests, to account for usage, and to mediate conflicting requests from different programs and users.
When a computer has multiple users, the need for managing and protecting the memory, input/output devices
and other resources is even more apparent. This need arises because it is usually necessary to share
expensive resources such as tape drives and phototypesetters.

Operating systems perform basic tasks, such as recognizing input from the keyboard, sending output to the
display screen, keeping track of files and directories on the disk, and controlling peripheral devices such as
disk drives and printers.

For large systems, the operating system has even greater responsibilities and powers. It is like a traffic cop --
it makes sure that different programs and users running at the same time do not interfere with each other. The
operating system is also responsible for security, ensuring that unauthorized users do not access the system.

The set of software products that jointly controls the system resources and the processes using these resources on a computer
system is known as the operating system. Examples are Unix and Windows.


The operating system should try to hide the complexity of interfacing to devices from user programs and the user.
Typically an operating system should also try to configure devices at start-up rather than getting the user to
set them up.

File System: An operating system can create and maintain a file system, where users can create, delete and
move files around a structured file system.
Many systems organize the files in directories (or folders). In a multi-user system, these folders can have
associated user ownership and associated access rights.


This allows one or more users to log into a system. Thus the operating system must contain a user account
database, which contains user names, default home directories, user passwords and user rights.


This allows two or more processors to be used at a time. Here the operating system must decide if it can run the
different processes on individual processors. It must also manage the common memory between processors.

This involves allocating memory, and often creating a virtual memory for programs. Paging means organizing
data so that the program data is loaded into pages of memory. Another method of managing memory is
swapping. This involves swapping the content of memory to disk storage.
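One way to picture paging is to split each program address into a page number and an offset within that page. The sketch below is illustrative only: the 4096-byte page size, the dictionary used as a page table, and the function names are assumptions made for the example, not details taken from any particular operating system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def split_address(virtual_address, page_size=PAGE_SIZE):
    """Return (page number, offset inside the page) for an address."""
    return divmod(virtual_address, page_size)

def translate(virtual_address, page_table, page_size=PAGE_SIZE):
    """Map a program address to a physical address using a page table.

    page_table is a hypothetical dict from page number to the number of
    the memory frame currently holding that page.
    """
    page, offset = split_address(virtual_address, page_size)
    if page not in page_table:
        # The page is not in memory; a real system would fetch it from disk.
        raise LookupError(f"page {page} is not loaded")
    return page_table[page] * page_size + offset

if __name__ == "__main__":
    table = {0: 7, 2: 5}           # pages 0 and 2 are loaded; page 1 is not
    print(split_address(8200))
    print(translate(8200, table))
```

The case where the page is missing from the table is where disk storage comes in: the content would first have to be brought back into memory before the access can proceed.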

Processes are often split into smaller tasks, named threads. Threads allow smoother operation.

In computers, a printer is a device that accepts text and graphic output from a computer and
transfers the information to paper, usually to standard size sheets of paper.

Types of Printers:
Impact Printers, Non-impact Printers, IRIS Printers, LED (Light Emitting Diode) Printers, Dye
Sublimation Printers, Desktop Printers, Solid Ink Printers, Host-based Printers, and Network Printers

Printers vary in size, speed, sophistication, and cost. In general, more expensive printers are used for higher-
resolution color printing.

Impact and Non-Impact Printers:

Impact printers are a class of printers that work by banging a head or needle against an ink ribbon to make a
mark on the paper. This includes dot-matrix printers, daisy-wheel printers, and line printers.
In contrast, laser and ink-jet printers are non-impact printers. The distinction is important because
impact printers tend to be considerably noisier than non-impact printers but are useful for multipart
forms such as invoices, since they can be used to produce carbon copies.

7.2. Impact Printers

Impact printers are the oldest print technologies still in active production. Some of the largest printer
vendors continue to manufacture, market, and support impact printers, parts, and supplies. Impact printers
are most functional in specialized environments where low-cost printing is essential. The three most
common forms of impact printers are dot-matrix, daisy-wheel, and line printers.
7.2.1. Dot-Matrix Printers
The technology behind dot-matrix printing is quite simple. The paper is pressed against a drum (a rubber-
coated cylinder) and is intermittently pulled forward as printing progresses. The electromagnetically-driven
printhead moves across the paper and strikes the printer ribbon situated between the paper and printhead
pin. The impact of the printhead against the printer ribbon imprints ink dots on the paper which form
human-readable characters.
Dot-matrix printers vary in print resolution and overall quality with either 9 or 24-pin printheads. The more
pins per inch, the higher the print resolution. Most dot-matrix printers have a maximum resolution around
240 dpi (dots per inch). While this resolution is not as high as those possible in laser or inkjet printers, there
is one distinct advantage to dot-matrix (or any form of impact) printing. Because the printhead must strike
the surface of the paper with enough force to transfer ink from a ribbon onto the page, it is ideal for
environments that must produce carbon copies through the use of special multi-part documents. These
documents have carbon (or other pressure-sensitive material) on the underside and create a mark on the
sheet underneath when pressure is applied. Retailers and small businesses often use carbon copies as receipts
or bills of sale.
7.2.2. Daisy-wheel Printers
If you have ever seen or worked with a manual typewriter before, then you understand the technological
concept behind daisy-wheel printers. These printers have printheads composed of metallic or plastic wheels
cut into petals. Each petal has the form of a letter (in capital and lower-case), number, or punctuation mark
on it. When the petal is struck against the printer ribbon, the resulting shape forces ink onto the paper.
Daisy-wheel printers are loud and slow. They cannot print graphics, and cannot change fonts unless the print
wheel is physically replaced. With the advent of laser printers, daisy-wheel printers are generally not used in
modern computing environments.
7.2.3. Line Printers
Another type of impact printer somewhat similar to the daisy-wheel is the line printer. However, instead of a
print wheel, line printers have a mechanism that allows multiple characters to be simultaneously printed on
the same line. The mechanism may use a large spinning print drum or a looped print chain. As the drum or
chain is rotated over the paper's surface, electromechanical hammers behind the paper push the paper
(along with a ribbon) onto the surface of the drum or chain, marking the paper with the shape of the
character on the drum or chain.
Because of the nature of the print mechanism, line printers are much faster than dot-matrix or daisy-wheel
printers; however, they tend to be quite loud, have limited multi-font capability, and often produce lower
print quality than more recent printing technologies.
Because line printers are used for their speed, they use special tractor-fed paper with pre-punched holes
along each side. This arrangement makes continuous unattended high-speed printing possible, with stops
only required when a box of paper runs out.
7.2.4. Impact Printer Consumables
Of all the printer types, impact printers have relatively low consumable costs. Ink ribbons and
paper are the primary recurring costs for impact printers. Some impact printers (usually line and dot-matrix
printers) require tractor-fed paper, which can increase the costs of operation somewhat.

7.3. Non-Impact Printers

7.3.1. Inkjet Printers

An inkjet printer uses one of the most popular printing technologies today. The relatively low cost of the
printers and their multi-purpose printing abilities make them a good choice for small businesses and home offices.
Inkjets use quick-drying, water-based inks and a printhead with a series of small nozzles that spray ink on
the surface of the paper. The printhead assembly is driven by a belt-fed motor that moves the printhead
across the paper.
Inkjets were originally manufactured to print in monochrome (black and white) only. However, the
printhead has since been expanded and the nozzles increased to accommodate cyan, magenta, yellow, and
black. This combination of colors (called CMYK) allows for printing images with nearly the same quality as
a photo development lab using certain types of coated paper. When coupled with crisp and highly readable
text print quality, inkjet printers are a sound all-in-one choice for monochrome or color printing needs.
7.3.2. Inkjet Consumables
Inkjet printers tend to be low cost and scale slightly upward based on print quality, extra features, and the
ability to print on larger formats than the standard legal or letter paper sizes. While the one-time cost of
purchasing an inkjet is lower than other printer types, there is the factor of inkjet consumables that must be
considered. Because demand for inkjets is large and spans the computing spectrum from home to enterprise,
the procurement of consumables can be costly.

7.4. Laser Printers

An older technology than inkjet, laser printers are another popular alternative to legacy impact printing.
Laser printers are known for their high volume output and low cost-per-page. Laser printers are often
deployed in enterprises as a workgroup or departmental print center, where performance, durability, and
output requirements are a constant. Because laser printers service these needs so readily (and at a reasonable
cost-per-page), the technology is widely regarded as the workhorse of enterprise printing.
Laser printers share much of the same technologies as photocopiers. Rollers pull a sheet of paper from a
paper tray and through a charge roller, which gives the paper an electrostatic charge. At the same time, a
printing drum is given the opposite charge. The surface of the drum is then scanned by a laser, discharging
the drum's surface and leaving only those points corresponding to the desired text and image with a charge.
This charge is then used to force toner to adhere to the drum's surface.
The paper and drum are then brought into contact; their differing charges cause the toner to then adhere to
the paper. Finally, the paper travels between fusing rollers, which heat the paper and melt the toner, fusing it
onto the paper's surface.
7.4.1. Color Laser Printers
Color laser printers are an emerging technology created by printer manufacturers whose aim is to combine
the features of laser and inkjet technology into a multi-purpose printer package. The technology is based on
traditional monochrome laser printing, but uses additional technologies to create color images and
documents. Instead of using black toner only, color laser printers use a CMYK toner combination. The print
drum either rotates each color and lays the toner down one color at a time, or lays all four colors down onto
a plate and then passes the paper through the drum, transferring the complete image onto the paper. Color
laser printers also employ fuser oil along with the heated fusing rolls, which further bonds the color toner to
the paper and can give varying degrees of gloss to the finished image.
Because of their increased features, color laser printers are typically twice (or several times) as expensive as
monochrome laser printers. In calculating the total cost of ownership with respect to printing resources,
some administrators may wish to separate monochrome (text) and color (image) functionality to a dedicated
monochrome laser printer and a dedicated inkjet printer, respectively.
7.4.2. Laser Consumables
Depending on the type of laser printer deployed, consumable costs usually are fixed and scale evenly with
increased usage or print job volume over time. Toner comes in cartridges that are usually replaced outright;
however, some models come with refillable cartridges. Color laser printers require one toner cartridge for
each of the four colors. Additionally, color laser printers require fuser oils to bond toner onto paper and
waste toner bottles to capture toner spillover. These added supplies raise the consumables cost of color laser
printers; however, it is worth noting that such consumables, on average, last about 6000 pages, which is
much greater than comparable inkjet or impact consumable lifespans. Paper type is less of an issue in laser
printers, which means bulk purchases of regular xerographic or photocopy paper are acceptable for most
print jobs. However, if you plan to print high-quality images, you should opt for glossy paper for a
professional finish.

7.5. Other Printer Types

There are other types of printers available, mostly special-purpose printers for professional graphics or
publishing organizations. These printers are not for general purpose use, however. Because they are
relegated to niche uses, their prices (both one-time and recurring consumables costs) tend to be higher
relative to more mainstream units.

Thermal Wax Printers

These printers are used mostly for business presentation transparencies and for color proofing (creating test
documents and images for close quality inspection before sending off master documents to be pressed on
industrial four-color offset printers). Thermal wax printers use sheet-sized, belt driven CMYK ribbons and
specially-coated paper or transparencies. The printhead contains heating contacts that melt each colored wax
onto the paper as it is rolled through the printer.

Dye-Sublimation Printers
Used in organizations such as service bureaus — where professional quality documents, pamphlets, and
presentations are more important than consumables costs — dye-sublimation (or dye-sub) printers are the
workhorses of quality CMYK printing. The concepts behind dye-sub printers are similar to thermal wax
printers except for the use of diffusive plastic dye film instead of colored wax as the ink element. The
printhead heats the colored film and vaporizes the image onto specially coated paper.
Dye-sub is quite popular in the design and publishing world as well as the scientific research field, where
preciseness and detail are required. Such detail and print quality comes at a price, as dye-sub printers are
also known for their high costs-per-page.

Solid Ink Printers

Used mostly in the packaging and industrial design industries, solid ink printers are prized for their ability to
print on a wide variety of paper types. Solid ink printers, as the name implies, use hardened ink sticks that
are melted and sprayed through small nozzles on the printhead. The paper is then sent through a fuser
roller which further forces the ink onto the paper.

The solid ink printer is ideal for prototyping and proofing new designs for product packages; as such, most
service-oriented businesses would not have a need for this type of printer.

7.6. Printer Languages and Technologies

Before the advent of laser and inkjet technology, impact printers could only print standard, justified text
with no variation in letter size or font style. Today, printers are able to process complex documents with
embedded images, charts, and tables in multiple frames and in several languages, all in one print job. Such
complexity must adhere to some format conventions. This is what spurred the development of the page
description language (or PDL) — a specialized document formatting language specially made for computer
communication with printers.

TCP/IP protocol suite

What is TCP/IP?

TCP/IP is the communication protocol for communication between computers connected to the Internet.

TCP/IP stands for Transmission Control Protocol / Internet Protocol.

The standard defines how electronic devices (like computers) should be connected to the Internet, and
how data should be transmitted between them.

Inside TCP/IP

Hiding inside the TCP/IP standard there are a number of protocols for handling data communication:

TCP (Transmission Control Protocol) communication between applications

UDP (User Datagram Protocol) simple communication between applications
IP (Internet Protocol) communication between computers
ICMP (Internet Control Message Protocol) for errors and statistics
DHCP (Dynamic Host Configuration Protocol) for dynamic addressing

You will learn more about these standards later in this tutorial.

TCP Uses a Fixed Connection

TCP is for communication between applications.

When an application wants to communicate with another application via TCP, it sends a communication
request. This request must be sent to an exact address. After a "handshake" between the two
applications, TCP will set up a "full-duplex" communication between the two applications.

The "full-duplex" communication will occupy the communication line between the two computers until
it is closed by one of the two applications.
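
The connection setup described above can be sketched with Python's standard socket module. This is only an illustration on the loopback address 127.0.0.1, not part of the original tutorial; the function names run_echo_server and echo_once are made up for this example. The client must name an exact address (host and port), the handshake happens inside the connect call, and the same connection then carries data in both directions until one side closes it.

```python
import socket
import threading

def run_echo_server(server_sock):
    """Accept one connection and echo back whatever arrives until the peer closes."""
    conn, _addr = server_sock.accept()   # returns once the handshake for one client is done
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:                 # empty read: the client closed its side
                break
            conn.sendall(data)           # the same connection carries data back (full duplex)

def echo_once(message: bytes) -> bytes:
    """Open a TCP connection to a local echo server, send a message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the operating system to pick any free port.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        host, port = server_sock.getsockname()
        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()
        # The request goes to an exact address; the handshake happens here.
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(message)
            reply = b""
            while len(reply) < len(message):
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk
        # Leaving the inner "with" closes the connection, which ends the server loop.
        worker.join()
    return reply

print(echo_once(b"hello over TCP"))
```

Closing the client socket is what ends the conversation here, matching the statement that the connection stays open until one of the two applications closes it.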

UDP is very similar to TCP, but is simpler and less reliable.

IP is Connection-Less

IP is for communication between computers.

IP is a "connection-less" communication protocol. It does not occupy the communication line between
two communicating computers. This way IP reduces the need for network lines. Each line can be used
for communication between many different computers at the same time.

With IP, messages (or other data) are broken up into small independent "packets" and sent between
computers via the Internet.

IP is responsible for "routing" each packet to its destination.

IP Routers

When an IP packet is sent from a computer, it arrives at an IP router.

The IP router is responsible for "routing" the packet to its destination, directly or via another router.

The path the packet will follow might be different from other packets of the same communication. The
router is responsible for the right addressing depending on traffic volume, errors in the network, or
other parameters.

Connection-Less Analogy

Communicating via IP is like sending a long letter as a large number of small postcards, each finding
its own (often different) way to the receiver.


TCP/IP is TCP and IP working together.

TCP takes care of the communication between your application software (e.g. your browser) and your
network software.

IP takes care of the communication with other computers.

TCP is responsible for breaking data down into IP packets before they are sent, and for assembling the
packets when they arrive.

IP is responsible for sending the packets to the receiver.
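
The division of labour described above (data broken into packets, packets possibly arriving out of order, and reassembly at the receiver) can be sketched in a few lines of Python. This is a simplified model, not real TCP: the helper names split_into_packets and reassemble are invented for illustration, and real TCP numbers bytes rather than whole packets.

```python
import random

def split_into_packets(message: bytes, size: int):
    """Cut the message into numbered chunks: (sequence_number, payload)."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Put the packets back in sequence order and join their payloads."""
    return b"".join(payload for _seq, payload in sorted(packets))

message = b"A long letter sent as many small postcards."
packets = split_into_packets(message, size=8)
random.shuffle(packets)                    # packets may arrive in any order
assert reassemble(packets) == message      # the receiver still recovers the letter
```

The sequence number is what lets the receiver restore the original order, whichever route each packet took.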

TCP/IP uses 32 bits, or 4 numbers between 0 and 255 to address a computer.
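
To see that "32 bits" and "4 numbers between 0 and 255" are two views of the same value, the following sketch uses Python's standard ipaddress module. The address 192.0.2.1 is a documentation-only example chosen for this illustration; it does not come from the original text.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.0.2.1")   # documentation-only example address
as_int = int(addr)                          # the same address as one 32-bit number
octets = list(addr.packed)                  # the four numbers, each between 0 and 255

print(octets)                               # [192, 0, 2, 1]
print(as_int)                               # 3221225985
print(f"{as_int:032b}")                     # the 32 individual bits
print(ipaddress.IPv4Address(as_int))        # 192.0.2.1
```

Each of the four numbers occupies 8 bits, which is why none of them can exceed 255.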

IP Addresses

Each computer must have an IP address before it can connect to the Internet.

Each IP packet must have an address before it can be sent to another computer.

This is an IP address:

This might be the same IP address: www.w3schools.com

You will learn more about IP addresses and IP names in the next chapter of this tutorial.

An IP Address Contains 4 Numbers.

This is your IP address:

TCP/IP uses 4 numbers to address a computer. Each computer must have a unique 4 number address.

The numbers are always between 0 and 255. Addresses are normally written as four numbers
separated by a period like this:

TCP/IP Protocols
TCP/IP is a large collection of different communication protocols.

A Family of Protocols

TCP/IP is a large collection of different communication protocols based upon the two original protocols
TCP and IP.

TCP - Transmission Control Protocol

TCP is used for transmission of data from an application to the network.

TCP is responsible for breaking data down into IP packets before they are sent, and for assembling the
packets when they arrive.

IP - Internet Protocol

IP takes care of the communication with other computers.

IP is responsible for sending and receiving data packets over the Internet.

HTTP - Hyper Text Transfer Protocol

HTTP takes care of the communication between a web server and a web browser.

HTTP is used for sending requests from a web client (a browser) to a web server, returning web
content (web pages) from the server back to the client.


HTTPS - Secure HTTP

HTTPS takes care of secure communication between a web server and a web browser.

HTTPS typically handles credit card transactions and other sensitive data.

SSL - Secure Sockets Layer

The SSL protocol is used for encryption of data for secure data transmission.

SMTP - Simple Mail Transfer Protocol

SMTP is used for transmission of e-mails.

MIME - Multi-purpose Internet Mail Extensions

The MIME protocol lets SMTP transmit multimedia files including voice, audio, and binary data across
TCP/IP networks.

IMAP - Internet Message Access Protocol

IMAP is used for storing and retrieving e-mails.

POP - Post Office Protocol

POP is used for downloading e-mails from an e-mail server to a personal computer.

FTP - File Transfer Protocol

FTP takes care of transmission of files between computers.

NTP - Network Time Protocol

NTP is used to synchronize the time (the clock) between computers.

DHCP - Dynamic Host Configuration Protocol

DHCP is used for allocation of dynamic IP addresses to computers in a network.

SNMP - Simple Network Management Protocol

SNMP is used for administration of computer networks.

LDAP - Lightweight Directory Access Protocol

LDAP is used for collecting information about users and e-mail addresses from the internet.

ICMP - Internet Control Message Protocol

ICMP takes care of error handling in the network.

ARP - Address Resolution Protocol

ARP is used by IP to find the hardware address of a computer network card based on the IP address.

RARP - Reverse Address Resolution Protocol

RARP is used by IP to find the IP address based on the hardware address of a computer network card.

BOOTP - Boot Protocol

BOOTP is used for booting (starting) computers from the network.

PPTP - Point to Point Tunneling Protocol

PPTP is used for setting up a connection (tunnel) between private networks.

TCP/IP Email
Email is one of the most important uses of TCP/IP.

When you write an email, you don't use TCP/IP.

When you write an email, you use an email program like Lotus Notes, Microsoft Outlook or Netscape

Your Email Program Does

Your email program uses different TCP/IP protocols:

It sends your emails using SMTP

It can download your emails from an email server using POP
It can connect to an email server using IMAP

SMTP - Simple Mail Transfer Protocol

The SMTP protocol is used for the transmission of e-mails. SMTP takes care of sending your email to
another computer.

Normally your email is sent to an email server (SMTP server), and then to another server or servers,
and finally to its destination.

SMTP can only transmit pure text. It cannot transmit binary data like pictures, sounds or movies.

SMTP uses the MIME protocol to send binary data across TCP/IP networks. The MIME protocol converts
binary data to pure text.
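
How MIME turns binary data into text can be sketched with Python's standard email package. The addresses, the file name picture.bin and the five sample bytes are placeholders invented for this illustration; the sketch only builds the message and does not send it.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"             # hypothetical addresses
msg["To"] = "bob@example.com"
msg["Subject"] = "Picture attached"
msg.set_content("See the attached file.")

binary_data = bytes([0, 255, 16, 128, 7])     # stand-in for the bytes of a picture
msg.add_attachment(binary_data, maintype="application",
                   subtype="octet-stream", filename="picture.bin")

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])    # base64
wire_text = msg.as_string()                       # the whole message as plain text
assert attachment.get_content() == binary_data    # decoding restores the original bytes
```

Printing wire_text shows the attachment as a short run of letters and digits: that text form is what travels over SMTP, and the receiving program decodes it back into the original bytes.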

POP - Post Office Protocol

The POP protocol is used by email programs (like Microsoft Outlook) to retrieve emails from an email server.

If your email program uses POP, all your emails are downloaded to your email program (also called
email client), each time it connects to your email server.

IMAP - Internet Message Access Protocol

The IMAP protocol is used by email programs (like Microsoft Outlook) just like the POP protocol.

The main difference between the IMAP protocol and the POP protocol is that the IMAP protocol will not
automatically download all your emails each time your email program connects to your email server.

The IMAP protocol allows you to look through your email messages at the email server before you
download them. With IMAP you can choose to download your messages or just delete them. This way
IMAP is perfect if you need to connect to your email server from different locations, but only want to
download your messages when you are back in your office.


TCP/IP (Transmission Control Protocol/Internet Protocol) is the basic communication language or protocol
of the Internet. It can also be used as a communications protocol in a private network (either an intranet or
an extranet). When you are set up with direct access to the Internet, your computer is provided with a copy
of the TCP/IP program just as every other computer that you may send messages to or get information from
also has a copy of TCP/IP.

TCP/IP is a two-layer program. The higher layer, Transmission Control Protocol, manages the assembling
of a message or file into smaller packets that are transmitted over the Internet and received by a TCP layer
that reassembles the packets into the original message. The lower layer, Internet Protocol, handles the
address part of each packet so that it gets to the right destination. Each gateway computer on the network
checks this address to see where to forward the message. Even though some packets from the same message
are routed differently than others, they'll be reassembled at the destination.

TCP/IP uses the client/server model of communication in which a computer user (a client) requests and is
provided a service (such as sending a Web page) by another computer (a server) in the network. TCP/IP
communication is primarily point-to-point, meaning each communication is from one point (or host
computer) in the network to another point or host computer. TCP/IP and the higher-level applications that
use it are collectively said to be "stateless" because each client request is considered a new request unrelated
to any previous one (unlike ordinary phone conversations that require a dedicated connection for the call
duration). Being stateless frees network paths so that everyone can use them continuously. (Note that the
TCP layer itself is not stateless as far as any one message is concerned. Its connection remains in place until
all packets in a message have been received.)

Many Internet users are familiar with the even higher layer application protocols that use TCP/IP to get to
the Internet. These include the World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer
Protocol (FTP), Telnet, which lets you log on to remote computers, and the Simple Mail Transfer
Protocol (SMTP). These and other protocols are often packaged together with TCP/IP as a "suite."

Personal computer users with an analog phone modem connection to the Internet usually get to the Internet
through the Serial Line Internet Protocol (SLIP) or the Point-to-Point Protocol (PPP). These protocols
encapsulate the IP packets so that they can be sent over the dial-up phone connection to an access provider's

Protocols related to TCP/IP include the User Datagram Protocol (UDP), which is used instead of TCP for
special purposes. Other protocols are used by network host computers for exchanging router information.
These include the Internet Control Message Protocol (ICMP), the Interior Gateway Protocol (IGP), the
Exterior Gateway Protocol (EGP), and the Border Gateway Protocol (BGP).

What is e-mail?
E-mail (short for electronic mail; also written email or simply mail) is a store-and-forward
method of composing, sending, storing, and receiving messages over electronic communication
systems. The term "e-mail" (as a noun or verb) applies both to the Internet e-mail system based on the
Simple Mail Transfer Protocol (SMTP) and to X.400 systems, and to intranet systems allowing users
within one organization to e-mail each other. Often these workgroup collaboration organizations may use
the Internet protocols or X.400 protocols for internal e-mail service. E-mail is often used to deliver bulk
unsolicited messages, or "spam", but filter programs exist which can automatically delete some or most of
these, depending on the situation.

In its simplest form, e-mail is an electronic message sent from one device to another. While most
messages go from computer to computer, e-mail can also be sent and received by mobile phones,
PDAs and other devices. With e-mail, you can send or receive personal and business-related
messages with attachments, such as photos or formatted documents. You can also send music, video
clips and software programs.

It can take days to send a letter across the country and weeks to go around the world. To save time
and money, more and more people are relying on electronic mail. It's fast, easy and much cheaper
than using the post office.
E-mail is the way to go. It's no wonder e-mail has become the most popular service on the Internet.

Follow the Trail

Just as a letter makes stops at different postal stations along the way to its final destination, e-mail passes from one
computer, known as a mail server, to another as it travels over the Internet. Once it arrives at the destination mail server,
it's stored in an electronic mailbox until the recipient retrieves it. This whole process can take seconds, allowing you to
quickly communicate with people around the world at any time of the day or night.

Sending and Receiving Messages

To receive e-mail, you need an account on a mail server. This is similar to having a street address where you receive
letters. One advantage over regular mail is that you can retrieve your e-mail from any location on earth, provided that you
have Internet access. Once you connect to your mail server, you just download your messages to your computer or
wireless device.

To send e-mail, you need a connection to the Internet and access to a mail server that forwards your mail. The standard protocol
used for sending Internet e-mail is called SMTP, short for Simple Mail Transfer Protocol. It works in conjunction with POP--
Post Office Protocol--servers. Almost all Internet service providers and all major online services offer at least one e-mail
address with every account.

When you send an e-mail message, your computer routes it to an SMTP server. The server looks at the e-mail address (similar to
the address on an envelope), then forwards it to the recipient's mail server, where it's stored until the addressee retrieves it. You
can send e-mail anywhere in the world to anyone who has an e-mail address.

At one time, Internet e-mail was good only for text messages. You couldn't send attachments, such as formatted documents.
With the advent of MIME, which stands for Multipurpose Internet Mail Extensions, and other types of encoding schemes, such as
UUencode, not only can you send messages electronically, but you can also send formatted documents, photos, audio and video
files. Just make sure that the person to whom you send the attachment has the software capable of opening the file.


Mail transfer agent:

A mail transfer agent or MTA (also called a mail transport agent, message transfer agent, mail server,
SMTPD (short for SMTP daemon), or a mail exchanger (MX) in the context of the Domain Name System)
is a computer program or software agent that transfers electronic mail messages from one computer to
another.

It receives messages from another MTA (relaying), a mail submission agent (MSA) that itself got the mail
from a mail user agent (MUA), or directly from an MUA, thus acting as an MSA itself. The MTA works
behind the scenes, while the user usually interacts with the MUA.

The delivery of e-mail to a user's mailbox typically takes place via a mail delivery agent (MDA); many
MTAs have basic MDA functionality built in, but a dedicated MDA like procmail can provide more

Mail user agent:
A mail user agent (MUA) functions by connecting to a mailbox into which e-mail has been fetched and
stored in a particular format. It typically presents a simple user interface to perform tasks with the mail. An
MUA by itself is incapable of sending or retrieving mail.

Mail delivery agent

A Mail Delivery Agent (MDA) is software that accepts incoming e-mail messages and distributes them to
recipients' individual mailboxes (if the destination account is on the local machine), or forwards to another
SMTP server (if the destination is on a remote server).
A mail delivery agent is not necessarily a mail transfer agent (MTA), although on many systems the two
functions are implemented by the same program.
On Unix systems, procmail and maildrop are the most popular MDAs. LMTP is a protocol that is frequently
implemented by network-aware MDAs.
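
The decision an MDA makes (deliver to a local mailbox, or hand the message on to another SMTP server) can be sketched as a toy model. Everything here is hypothetical: the domain example.org, the function name deliver and the in-memory mailboxes are invented for illustration and do not reflect how procmail or maildrop are implemented.

```python
LOCAL_DOMAIN = "example.org"     # hypothetical domain whose mailboxes live on this machine

def deliver(recipient: str, message: str, mailboxes: dict, outbound: list) -> str:
    """Place the message in a local mailbox, or queue it for another SMTP server."""
    user, _, domain = recipient.partition("@")
    if domain == LOCAL_DOMAIN:
        mailboxes.setdefault(user, []).append(message)   # destination account is local
        return "local"
    outbound.append((recipient, message))                # destination is on a remote server
    return "forwarded"

mailboxes, outbound = {}, []
print(deliver("ann@example.org", "Hello Ann", mailboxes, outbound))   # local
print(deliver("bob@example.net", "Hello Bob", mailboxes, outbound))   # forwarded
```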

Information System
What is meant by Information Systems? How are they different from a file system?

An Information System is a set of elements, each of which is capable of carrying out certain
information processing. All the elements that together perform the task of information processing to achieve
a desired objective constitute an information system. A personal computer is an example of an information
system. An information system in general consists of the following units:

1. Logical organisational units
2. Computer hardware and software
3. Information-related personnel

File systems
A file system is concerned with the logical organisation of information. It deals with collections of
unstructured and uninterpreted information. Each separately identified collection of information is called
a file. A database management system is a file system that performs structuring of information. We can
functionally divide the file system into seven logical phases.
They are:
1. Access Methods
2. Logical File System
3. Basic File System
4. File Organisation Strategy
5. Allocation Strategy
6. Device Strategy
7. I/O Control System

A programmer makes a request to read a file, using a symbolic name. The logical file system accepts the
symbolic name and finds the corresponding numeric file identifier. The basic file system takes the numeric
file identifier and obtains a file descriptor. The file organisation strategy module uses the descriptor to
determine the physical I/O commands needed to access the information (from where it is stored). The I/O control
system schedules the execution of the physical I/O commands.
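
The chain of steps in a read request can be sketched as a small Python model. The tables, the file name report.txt, the identifier 17 and the block numbers are all hypothetical, and the sketch assumes the file occupies contiguous blocks; a real file system keeps these structures on disk and is far more involved.

```python
# Hypothetical in-memory tables standing in for on-disk structures.
DIRECTORY = {"report.txt": 17}                               # symbolic name -> file identifier
DESCRIPTORS = {17: {"start_block": 120, "block_count": 3}}   # identifier -> file descriptor

def logical_file_system(symbolic_name):
    """Translate the symbolic name into a numeric file identifier."""
    return DIRECTORY[symbolic_name]

def basic_file_system(file_id):
    """Fetch the file descriptor for a numeric file identifier."""
    return DESCRIPTORS[file_id]

def file_organisation_strategy(descriptor):
    """Turn the descriptor into physical I/O commands (contiguous blocks assumed)."""
    first = descriptor["start_block"]
    last = first + descriptor["block_count"]
    return [("READ_BLOCK", block) for block in range(first, last)]

def io_control_system(commands):
    """Schedule the commands; this sketch simply issues them in ascending block order."""
    return sorted(commands, key=lambda command: command[1])

def read_file(symbolic_name):
    """Run a read request through each layer in turn."""
    file_id = logical_file_system(symbolic_name)
    descriptor = basic_file_system(file_id)
    commands = file_organisation_strategy(descriptor)
    return io_control_system(commands)

print(read_file("report.txt"))
# [('READ_BLOCK', 120), ('READ_BLOCK', 121), ('READ_BLOCK', 122)]
```

Each function only knows about the layer directly below it, which is the point of dividing the file system into phases.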

How is strategic Information System useful for decision making?
A high-level information system is one that supports decision-making by providing past and future data to a
client program. The client program aids the decision maker by incorporating analysis and planning
algorithms that assess the value of alternate decisions to be made now or at points in the near future.

During that process effective access is needed to past and current information, and to forecasts about the
future, given information about current state and decisions that may be made, typically about resource
allocations: money, people, and supplies.

Data covering past information has to be selected, aggregated and transformed to be effective in decision-
making. Because of the volume of resources, mediating modules are often required. A mediator is a software
module that exploits encoded knowledge about certain sets or subsets of data to create information for a
higher layer of applications. A mediator module should be small and simple, so that it can be maintained by
one expert or, at most, a small and coherent group of experts. The results are most effectively represented by
a timeline [DasSTM:94]. The common assumption is that there is only one version of the past, and that it
can be objectively determined.

Recent data may be reported by messages, especially where databases cannot be updated instantaneously.
The database paradigm is strong on consistency, but that may mean that recent information, that is not
complete or yet verified, may not be retrievable from a formal database. To the decision-maker, however,
such information is valuable, since any information can lower the uncertainty when one has to project into
the future.

Cache memory

Cache (pronounced cash) memory is extremely fast memory that is built into a computer’s central
processing unit (CPU), or located next to it on a separate chip. The CPU uses cache memory to store
instructions that are repeatedly required to run programs, improving overall system speed. The advantage of
cache memory is that the CPU does not have to use the motherboard’s system bus for data transfer.
Whenever data must be passed through the system bus, the data transfer speed slows to the motherboard’s
capability. The CPU can process data much faster by avoiding the bottleneck created by the system bus.

As it happens, once most programs are open and running, they use very few resources. When these resources
are kept in cache, programs can operate more quickly and efficiently. All else being equal, cache is so
effective in system performance that a computer running a fast CPU with little cache can have lower
benchmarks than a system running a somewhat slower CPU with more cache. Cache built into the CPU
itself is referred to as Level 1 (L1) cache. Cache that resides on a separate chip next to the CPU is called
Level 2 (L2) cache. Some CPUs have both L1 and L2 cache built-in and designate the separate cache chip as
Level 3 (L3) cache.

Cache that is built into the CPU is faster than separate cache, running at the speed of the microprocessor
itself. However, separate cache is still roughly twice as fast as Random Access Memory (RAM). Cache is
more expensive than RAM, but it is well worth getting a CPU and motherboard with built-in cache in order
to maximize system performance.

Disk caching applies the same principle to the hard disk that memory caching applies to the CPU.
Frequently accessed hard disk data is stored in a separate segment of RAM in order to avoid having to
retrieve it from the hard disk over and over. In this case, RAM is faster than the platter technology used in
conventional hard disks. This situation will change, however, as hybrid hard disks become ubiquitous. These
disks have built-in flash memory caches. Eventually, hard drives may be 100% flash drives, reducing the
need for RAM disk caching, as flash memory is much faster than platter storage (although still slower than RAM).


What is a Cache?
Computer caches are memory circuits that serve to speed up a much larger memory drive. In a typical computer it takes the
microprocessor 60 nanoseconds to access the RAM. To cut the time it takes for the microprocessor to access data from the RAM,
a special memory bank or memory circuit is installed into the motherboard itself. This is called an L2 cache and it can deliver the
needed data in 30 nanoseconds, which is half the access time of the main memory.
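
The benefit of such a cache depends on how often the needed data is actually found in it. A minimal sketch, using the 30 ns and 60 ns figures from the text and assuming a simple model in which every access pays the cache lookup and only misses also pay the RAM access (the hit ratios are made-up inputs, and the function name is chosen for this illustration):

```python
def average_access_time(hit_ratio, cache_ns=30.0, ram_ns=60.0):
    """Every access pays the cache lookup; only misses also pay the RAM access."""
    miss_ratio = 1.0 - hit_ratio
    return cache_ns + miss_ratio * ram_ns

for ratio in (0.5, 0.9, 0.99):
    print(f"hit ratio {ratio:.2f}: about {average_access_time(ratio):.1f} ns")
```

Under these assumptions a 90% hit ratio gives roughly 36 ns on average, while a 50% hit ratio gives 60 ns, no better than going to RAM directly. This is why caches only pay off when frequently used data stays in them.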

Caches are small memory banks that serve to speed up a larger memory bank by being nearer and faster to the processor. To
further speed up the operations in the microprocessor, an L1 cache is installed right on top of the microprocessor. This makes the
operation dependent only on the speed of the microprocessor and not on the speed of any memory bus. An L1 cache, therefore, is
3.5 times faster than an L2 bus.

The internet is the slowest and biggest "memory drive" that a personal computer can access. To speed up its operations, the
computer stores temporary files of the pages previously viewed on the internet. This operation uses the hard drive as a caching
subsystem and results in faster access to data the next time a page is viewed.
Modern hard drives have their own caching subsystems that are accessed before the physical drive is. This results in much faster
retrieval of data. A very concrete example of this is when the hard drive is used as a caching subsystem to the files in a floppy disk
drive. A 3-megabyte file in a floppy drive typically takes 20 seconds to display the first time it is accessed. It displays a lot faster
the second time around, though, because the operating system checks with the hard drive first if it contains a copy of the same file
before accessing the floppy disk. Since the hard drive is much faster to access than the floppy drive, the floppy drive will not be
searched the second time the same data is retrieved.
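
The "check the fast store first" behaviour described above can be sketched as a small read-through cache. The class name CachedStore and the dictionary standing in for the slow drive are assumptions made for illustration; a real operating system cache also has to handle eviction and changed files.

```python
class CachedStore:
    """Read-through cache: look in the fast copy first, fall back to the slow store."""

    def __init__(self, slow_store):
        self.slow_store = slow_store   # stands in for the floppy drive
        self.fast_copy = {}            # stands in for the copy kept on the hard drive
        self.slow_reads = 0            # how many times the slow store was actually used

    def read(self, name):
        if name not in self.fast_copy:          # first access: a cache miss
            self.slow_reads += 1
            self.fast_copy[name] = self.slow_store[name]
        return self.fast_copy[name]             # later accesses are served from the copy

floppy = {"letter.doc": b"file contents"}
store = CachedStore(floppy)
store.read("letter.doc")     # slow: goes to the floppy
store.read("letter.doc")     # fast: served from the cached copy
print(store.slow_reads)      # 1
```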

A summary of the typical caching subsystems of a computer, from the slowest to the fastest, is as follows: the internet,
which could take from 1 second to several hours to download data; the mechanical hard drive, which takes 12 milliseconds to
access the same data; the main memory or RAM, which takes around 60 nanoseconds to access 32 MB to 512 MB-sized data; the
L2 cache or SRAM-type memory, which can access 128 KB to 512 KB-sized data in about 20 to 30 nanoseconds; and the L1 cache,
which can access 4 KB to 16 KB-sized data in 10 nanoseconds or less, depending on the speed of the microprocessor.

Small memory banks are a lot faster, but slower and bigger memory banks are a lot cheaper and are thus more practical to use.
Caching offsets this disadvantage and maximizes speed of the system by making frequently used data readily accessible to the
