Com 312 Lecture Notes 2010

Chapter 1
INTRODUCTION TO DATABASE
Learning Objectives: After reading and studying this chapter you should be able to:
Define database
Define database management system (DBMS)
State various database system applications
Explain the purpose of a database system
Explain the factors driving the development of database systems
Describe briefly the history of database systems
Explain modern organizations' need for information
State the benefits of a DBMS

1.1 What is a Database?
A database can be described as a repository (or store) of data. The collection of actual data that constitutes the information about an organisation is stored in a database (see Figure 1.1). For example, if a college has 1000 students and must store their personal details, marks and so on, these details are recorded in a database.

Database: A Formal Definition
Several definitions could be given for the term database, but let us examine the following one: A database is an ordered collection of related data elements intended to meet the information needs of an organization and designed to be shared by multiple users. Note the key terms in the definition:

Ordered collection. A database is a collection of data elements. It is not a random assembly of data structures, but a collection of data elements put together deliberately and in proper order. The various data elements are linked together in the most logical manner.

Related data elements. The data elements in a database are not disjointed structures without any relationships among them. They are related to one another and are pertinent to the particular organization.

Information needs. The collection of data elements in a database is there for a specific purpose: to meet the information needs of the organization. In a database for a bank, you will find data elements that are pertinent to the bank's business, such as customers' bank balances and ATM transactions. You will not find data elements relating to a student's major and examination grades; those belong in a database for a university. Nor will you find a patient's medical history, which belongs in a database for a medical center.

Shared. All authorized users in an organization can share the information stored in its database. Integrated information is kept in the database for the purpose of sharing, so that all user groups can collaborate and accomplish the organization's objectives.
Figure 1.1: A database on a disk (actual data storage)

1.2 What is a Database Management System (DBMS)?
We need to differentiate between a database and a database management system (DBMS). A DBMS is a collection of programs that enables you to store, modify, and extract information from a database. There are many different types of DBMS, ranging from small systems that run on personal computers to huge systems that run on mainframes. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient. Database systems are designed to manage large bodies of information. Management of data involves both defining structures for the storage of information and providing mechanisms for the manipulation of data. In addition, the database system must ensure the safety of the data it stores. As Figure 1.2 shows, good data management is an essential prerequisite to corporate success. This is one reason organizations today invest heavily in database products and database professionals.

Figure 1.2: From Data to Success (Data -> Information -> Knowledge -> Judgment -> Decision -> Success, provided that data is complete, accurate, timely, and easily available)
1.3 Database System Applications

Databases are used in a wide range of applications. The following are some examples:
Banking: For customer information, accounts, loans and other banking transactions.
Airlines: For reservation and schedule information.
Universities: For student information, course registration, grades, etc.
Credit card transactions: For purchases made on credit cards and generation of monthly statements.
Telecommunication: For keeping records of calls made, generating monthly bills, etc.
Finance: For storing information about holdings, and sales and purchases of financial instruments.
Sales: For customer, product and purchase information.
Manufacturing: For management of the supply chain.
Human Resources: For recording information about employees, salaries, tax, benefits, etc.
In short, almost every computerised information system needs a database system.

1.4 Purpose of Database System
Before the evolution of DBMSs, organisations stored information in file-oriented systems. A file-oriented system is one in which information is kept in files. A typical file-processing system is supported by a conventional operating system. The system stores permanent records in various files and needs application programs to extract records, or to add or delete records. We will compare the two approaches with the help of an example. Assume a savings bank keeps information about all its customers and savings accounts. The system needs the following application programs:
A program to debit or credit an account.
A program to add a new account.
A program to find the balance of an account.
A program to generate monthly statements.
As the need arises, new application programs are added. For example, a program may later be added for customers who want to lodge cheques into their savings accounts.
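To make the file-oriented approach concrete, the following is a minimal sketch in Python of two of the application programs listed above (find the balance, credit an account). The record layout (account number, name, balance separated by commas), the sample records and the function names are assumptions made for this illustration; they do not come from any real banking system. Note that each program must know the exact layout of the file.

```python
import io

# Hypothetical savings file for this sketch: account_no,name,balance
SAVINGS_FILE = "1001,Amina Yusuf,5000\n1002,John Bello,12000\n"

def find_balance(file_text, account_no):
    """Scan the file record by record; the layout is hard-coded here."""
    for line in io.StringIO(file_text):
        number, name, balance = line.strip().split(",")
        if number == account_no:
            return int(balance)
    return None  # no such account

def credit_account(file_text, account_no, amount):
    """Rewrite every record, changing the balance of one account."""
    records = []
    for line in io.StringIO(file_text):
        number, name, balance = line.strip().split(",")
        if number == account_no:
            balance = str(int(balance) + amount)
        records.append(",".join([number, name, balance]))
    return "\n".join(records) + "\n"

updated = credit_account(SAVINGS_FILE, "1001", 500)
print(find_balance(updated, "1001"))  # prints 5500
```

If the layout of the file changes, for instance by adding an address field, both functions have to be edited, because the knowledge of the layout lives inside each program.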
Using a file system to store data has the following disadvantages:
1. Data Redundancy and Inconsistency
Different programmers work on a project over a period of time, so the various files are created by different programmers at different times. These files may have different formats, and the programs may be written in different programming languages. The same information is also repeated in different files. For example, a customer's name and address may appear in the savings account file as well as in the current account file. This redundancy results in higher storage and access costs. It also leads to data inconsistency: a change made to a record in one place is not reflected in all the other places. For example, a changed customer address may be reflected in the savings record but nowhere else.

2. Difficulty in Accessing Data
Suppose we want to see the records of all customers who have a balance of less than N10,000. We can either check the list and find the names manually or write an application program. If we write an application program and later need to see the records of customers who have a balance of less than N20,000, a new program has to be written again. In other words, file-processing systems do not allow data to be accessed in a convenient manner.

3. Data Isolation
Because the data is scattered across various files, and the files may be stored in different formats, writing application programs to retrieve the data is difficult.

4. Integrity Problems
The data stored must often satisfy certain constraints; for example, a bank may require a minimum deposit of N1000. Developers enforce these constraints by writing appropriate code in the application programs, but when a new constraint has to be added later, it is difficult to change all the programs to enforce it.

5. Atomicity Problems
Any mechanical or electrical device is subject to failure, and so is a computer system. When a failure occurs, we have to ensure that the data is restored to a consistent state.
For example, suppose an amount of N50 has to be transferred from Account A to Account B. Assume the amount has been debited from Account A but has not yet been credited to Account B when a failure occurs. The data is now in an inconsistent state. We therefore need a mechanism which ensures that either the whole transaction is executed or none of it is, i.e. the fund transfer must be atomic.

6. Concurrent Access Problems
Many systems allow multiple users to update the data simultaneously. This can also leave the data in an inconsistent state. Suppose a bank account contains a balance of N500 and two customers withdraw N100 and N50 from that account at the same time. If both transactions read the old balance (N500) and each subtracts its own withdrawal from it, they will write back N400 and N450 respectively. Whichever value is written last becomes the stored balance, and both are incorrect: the correct balance is N350.

7. Security Problems
Not every user of the database should be able to access all the data. For example, payroll personnel need to access only the part of the data that holds information about employees; they do not need access to information about customer accounts.

1.5 The Driving Forces for Database System Development
Among others, four major forces drove organizations to adopt database systems (see Figure 1.3).

Information as a Corporate Asset. Today, companies realize that information is a corporate asset similar to other assets such as cash, plant and equipment, or inventory. Proper management of key assets is essential for success, so companies understand the need to find improved methods for storing, retrieving, and using information.

Explosive Growth of Computer Technology. Computer technology, especially data storage and retrieval systems, has grown in a phenomenal manner. Without growth in this sector, it is unlikely that we could have progressed to database systems, which need sophisticated ways of storing and retrieving data.

Escalating Demand for Information. We have noted the increase in demand for information by organizations, not only in volume but in the types of information as well.
If companies did not need more and newer types of information, there would have been no impetus for the development of database systems. The earlier data systems might have been satisfactory.

Inadequacy of Earlier Data Systems. Suppose the earlier data systems had been able to meet the escalating demand for information. Then why bother to find better methods? The fact is that these earlier systems were grossly inadequate to meet the information demands. Storage and management of large volumes of data were not adequate.
Finding and retrieving information were extremely difficult. Protecting the information asset of a company was nearly impossible with the earlier data systems. Why was this so? How were the earlier systems inadequate? In what ways could they not meet the information demands? Understanding the limitations will give you a better appreciation for database systems.
Figure 1.3: Forces behind development of database systems

1.6 History of Database Systems
Before the advent of computers, business organizations kept their records in manual files. With the arrival of the computer, these manual files were converted to computer files using file-oriented systems. However, the problems with file-processing systems, the growth in computer technology, and the increased demand for information encouraged the development of database systems. Though the initial movement toward database systems began in the 1960s, software sophistication and widespread use of database systems began in the mid-1970s. More and more organizations began to adopt database technology to manage their corporate data. Figure 1.4 provides a historical summary of the database industry. The figure highlights the major events and developments in the various decades.

Generalized Update Access Method (GUAM) contained the first trace of the forerunner of the hierarchical database systems. In the early 1960s, Rockwell developed this software to manage the data usually associated with manufacturing operations. IBM picked this up and introduced Information Management System (IMS) as a hierarchical database management system.

Integrated Data Store (IDS), developed at General Electric, formed the basis for database systems built on the network data model. The Database Task Group (DBTG) of the Conference on Data Systems Languages (CODASYL) began to produce standards for the network model. CODASYL, a consortium of vendors and leading businesses, served as a group to establish standards. Among the many standards established, a major contribution by CODASYL is the set of standards for COBOL, a leading programming language. In the late 1960s, vendors started to release the first generation of commercial database systems supporting the network data model. Cincom's TOTAL database management system is a primary example.

The 1970s ushered in the era of relational database technology. Dr. Codd's foundational paper on the relational model revolutionized the thinking on data systems. The industry quickly realized the superiority of the relational model, and more and more vendors began to adapt their products to that model.

During the 1980s, the use of database systems gained a lot of ground, and a large percentage of businesses made the transition from file-oriented data systems to database systems. All three leading data models (hierarchical, network, and relational) were quite popular, although the relational model was steadily gaining ground.

The 1990s may be considered a period of maturity of the relational model and of its emergence as the leading data model. Companies considered moving their data from hierarchical and network databases to relational databases. Vendors also started to incorporate the features of both relational and object technologies in their products, and object-relational database management systems (ORDBMSs) came onto the market.

In the new millennium, the use of database technology is spreading into newer areas. Properly designed databases serve as a chief component in data warehousing (DW), enterprise resource planning (ERP), data mining (DM), on-line analytical processing (OLAP), and customer relationship management (CRM) applications.
Figure 1.4: History of database systems
1.7 Organizations' Demand for Information
In today's globalised economy and competitive market, organizations are faced with an increasing demand for information. This need for information has several dimensions.

Consider how billing requirements and sales analysis have changed. In the early years of computing, organizations were happy if they could bill their customers once a month and review total sales by product quarterly. Now it is completely different. Organizations must bill every sale right away to keep up the cash flow. They need up-to-date customer balances and daily and cumulative sales totals by product.

What about inventory reconciliation? Earlier systems provided reports to reconcile inventory or to determine profitability only at the end of each month. Now organizations need daily inventory reconciliation to manage inventory better, daily profitability analysis to plan sales campaigns, and daily customer information to improve customer service.

In the earlier period of computing, organizations were satisfied with information showing only current activity. They could use the information to manage day-to-day business and make operational decisions. In the changed business climate of globalization and fierce competition, this type of information alone is no longer adequate. Companies need information to plan and shape their future. They need information not just to run day-to-day operations, but to make strategic decisions as well.

What about the delivery of information now compared to the early days of computing? Earlier computer systems just provided reports, mostly once a month, a few once a week, and a small number once a day. Today, online information is the norm for most companies. Fast response times and access to large volumes of data have become essential. Organizations have come to realize that information is a key asset to be carefully managed and used for greater profitability.

In summary, the demand for information by today's enterprises has the following attributes:
More information
Information for newer purposes
Different information types
Integrated information
Information to be shared
Faster access to information
1.8 BENEFITS OF DATABASE SYSTEMS
The database approach overcame the limitations of the earlier data systems and produced enormous benefits. Let us review the specific benefits and understand in what ways the database approach is superior to the earlier data systems.

Minimal Data Redundancy
Unlike file-oriented data systems, where data are duplicated among various applications, database systems integrate all the data into one logical structure. Duplication of data is minimized, and the storage space it would have wasted is saved. Going back to the bank example, with a database, customer data is not duplicated in the current account, savings account, and loan account applications. Customer data is entered and maintained in only one place in the database. In some instances data duplication is permitted in a database for the sake of access efficiency and performance, but such duplication is kept to a minimum.

Data Integrity
Data integrity in a database means reduction of data inconsistency. Because data redundancy is eliminated or controlled, a database is less prone to errors creeping in through data duplication. Field sizes and field formats are the same for all applications. Each application uses the same data from one place in the database. In a bank, names and addresses will be the same for the current account, savings account, and loan applications.

Data Integration
In a database, data objects are organized into single logical data structures. By contrast, in file-oriented data systems, data about employees are scattered among the various applications. The payroll application contains employee name and address, social security number, salary rate, deductions, and so on. The pension plan application contains pension data about each employee, whereas the human resources application contains employee qualifications, skills, training, and education. In a database, all data about each employee are integrated and stored together.
So, in a database, data about each business object are integrated and stored separately as customer, order, product, invoice, manufacturer, sale, and so on. Data integration enables users to understand the data and the relationships among data structures easily. Programmers needing data about a business object can go to one place to get the details. For example, data about orders are consolidated in one place as order data.
Data Sharing
This benefit of database systems follows from data integration. The various departments in any enterprise need to share the company's data for proper functioning. The sales department needs to share the data generated by the accounting department through the billing application. Consider the customer service department. It needs to share the data generated by several applications. The customer service application needs information about customers, their orders, billings, payments, and credit ratings. With data integration in a database, the application can get data from distinct and consolidated data structures relating to customers, orders, invoices, payments, and credit status. Data sharing is a major benefit of database systems. Each department shares the data in the database that are most pertinent to it. Departments may be interested in data structures as follows:
Sales department: Customer/Order
Accounting department: Customer/Order/Invoice/Payment
Order processing department: Customer/Product/Order
Inventory control department: Product/Order/Stock Quantity/Back Order Quantity
Database technology lets each application use the portion of the database that is needed for that application. User views of the database are defined and controlled. We will have more to say about user views in later chapters.

Uniform Standards
We have seen that, because of the spread of duplicate data across applications in file-oriented data systems, standards cannot be enforced easily and completely. Database systems remove this difficulty. As data duplication is controlled in database systems and as data is consolidated and integrated, standards can be implemented more easily. Restrictions and business rules for a single data element need to be applied in only one place. In database systems, it is possible to eliminate problems arising from homonyms and synonyms.

Security Controls
Information is a corporate asset and, therefore, must be protected through proper security controls.
In file-oriented systems, security controls cannot be established easily. Imagine the data administrator wanting to restrict and control the use of data relating to employees. In file-oriented systems, control has to be exercised in all applications having separate employee files. In a database system, however, all data about employees are consolidated, integrated, and kept in one place. Security controls on employee data need to be applied in only one place in the database. Database systems make centralized security controls possible. It is also easy to apply data access authorizations at various levels of data.

Data Independence
Remember the lack of data independence in file-oriented systems, where computer programs have data structure definitions embedded within the programs themselves. In database systems, file or data definitions are separated out of the programs and kept within the database itself. Program logic and data structure definitions are not intricately bound together. In a client/server environment, data and descriptions of data structures reside on the database server, whereas the code for application logic executes on the client machine or on a separate application server.

Reduced Program Maintenance
This benefit of database systems results primarily from data independence in applications. If the customer data structure changes by the addition of a field for cellular phone numbers, this change is made in only one place, within the database itself. Only those programs that need the new field have to be modified and recompiled to make use of the added piece of data. Within limits, you can change programs or data independently.

Simpler Backup and Recovery
In a database system, generally all data are in one place. Therefore, it becomes easy to establish procedures to back up data. All the relationships among the data structures are also in one place. The arrangement of data in database systems makes it easier not only to back up the data but also to initiate procedures for recovering data lost because of malfunctions.
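The cellular phone example under Reduced Program Maintenance can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the table name, the column names and the function customer_names are assumptions chosen for the sketch. The point it tries to show is that a program which does not use the new field keeps working unchanged after the field is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute(
    "INSERT INTO customer (name, address) VALUES (?, ?)",
    ("John Bello", "1234 Lagos Street"),
)

def customer_names(connection):
    """An 'existing program' (hypothetical) that only needs customer names."""
    rows = connection.execute("SELECT name FROM customer ORDER BY id")
    return [row[0] for row in rows]

before = customer_names(conn)

# The structure changes in one place only: inside the database.
conn.execute("ALTER TABLE customer ADD COLUMN cell_phone TEXT")

# The existing program is not modified and still returns the same result.
after = customer_names(conn)

# Only a program that needs the new field has to be written or changed.
conn.execute(
    "UPDATE customer SET cell_phone = ? WHERE name = ?",
    ("08056342345", "John Bello"),
)
phones = conn.execute("SELECT name, cell_phone FROM customer").fetchall()
```

Compare this with a file-oriented program, where the record layout is embedded in every program that reads the file, so each of them would have to be revisited.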
POINTS TO PONDER
A DBMS consists of a collection of inter-related data and a collection of programs to access that data.
The primary goal of a DBMS is to provide an environment that is both convenient and efficient for people to use in retrieving and storing information.
Database systems are ubiquitous today, and most people interact with a database, directly or indirectly, many times every day.
Database systems are designed to store large bodies of information.
A major purpose of a DBMS is to provide users with an abstract view of data, i.e. the system hides how the data is stored and maintained.
REVIEW TERMS
Database
DBMS
Database System Application
File System
Data Inconsistency
Consistency constraints
Atomicity
Redundancy
Data isolation
Data Security
STUDENTS ACTIVITY
1) What is a database? Explain with an example.
2) What is a DBMS? Explain with an example.
3) List four significant differences between a file system and a DBMS.
4) What are the advantages of a DBMS?
5) Explain various applications of databases.
6) Explain data inconsistency with an example.
7) Explain data security. Why is it needed? Explain with an example.
8) Explain the isolation and atomicity properties of a database.
9) Explain why redundancy should be avoided in a database.
10) Explain consistency constraints in a database.
Chapter 2
DATA ABSTRACTION AND DATABASE LANGUAGES
Learning objectives: After reading and studying this chapter you should be able to:
Define data abstraction
Explain the physical level of data abstraction
Explain the logical level of data abstraction
Explain the view level of data abstraction
Define query language
Describe Data Definition Language
Describe Data Manipulation Language
2.1 VIEW OF DATA
A database system consists of a number of files and a set of programs to access and modify these files. The user does not see how the data is actually held: the system hides the details of how data is stored and maintained.

2.2 DATA ABSTRACTION
Data abstraction is the process of distilling data down to its essentials. Data must be retrieved efficiently when needed, which often requires complex storage structures. Since not all of these details are of use to all users, the actual (complex) details are hidden from them. The following levels of abstraction are provided:

Physical level: This is the lowest level of abstraction. It specifies how the data is actually stored and describes the complex low-level data structures in detail.

Logical level: This is the next level of abstraction. It describes what data are stored in the database and what relationships exist among the data. It is less complex than the physical level and is expressed in terms of relatively simple structures. Although implementing these simple structures may involve complex physical-level structures, users of the logical level need not know about that complexity.

View level: This is the highest level of abstraction. It describes only the part of the database that is shown to a particular group of users, and users of this level need not know the actual details (complexity) of data storage.
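The three levels can be illustrated with a small sketch, written here in Python with the built-in sqlite3 module. The student table, its columns and the view name class_list are assumptions invented for this example. The table definition plays the role of the logical level, the view plays the role of the view level, and the physical level (how pages and records are laid out in storage) is left entirely to the database software.

```python
import sqlite3

# Physical level: SQLite decides how records are laid out in storage.
# Nothing in this program depends on that layout.
conn = sqlite3.connect(":memory:")

# Logical level: what data are stored and how they are structured.
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY, name TEXT, department TEXT, marks INTEGER)"
)
conn.execute(
    "INSERT INTO student (name, department, marks) VALUES (?, ?, ?)",
    ("Amina Yusuf", "Computer Science", 72),
)

# View level: one group of users sees only names and departments,
# not the marks.
conn.execute("CREATE VIEW class_list AS SELECT name, department FROM student")

cursor = conn.execute("SELECT * FROM class_list")
view_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```

A user who works only with class_list never learns that a marks column exists, which is the hiding of detail that abstraction refers to.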
2.3 Database Language

Just as we need a language to communicate, we need a language to create and manipulate a database. A database language has two main parts:
1) DDL (Data Definition Language)
2) DML (Data Manipulation Language)

Data Definition Language (DDL)
A DDL is used to specify a database schema as a set of definitions.
1. DDL statements are compiled, resulting in a set of tables stored in a special file called a data dictionary or data directory.
2. The data directory contains metadata (data about data).
3. The storage structure and access methods used by the database system are specified by a set of definitions in a special type of DDL called a data storage and definition language.
4. Basic idea: hide the implementation details of the database schemas from the users.

Data Manipulation Language (DML)
1. Data manipulation includes:
retrieval of information from the database
insertion of new information into the database
deletion of information from the database
modification of information in the database
2. A DML is a language which enables users to access and manipulate data. The goal is to provide efficient human interaction with the system.
3. There are two types of DML:
procedural: the user specifies what data is needed and how to get it
nonprocedural: the user specifies only what data is needed. This is easier for the user, but may not generate code as efficient as that produced by procedural languages.
4. A query language is the portion of a DML that involves information retrieval only. The terms DML and query language are often used synonymously.
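The distinction between DDL and DML, and between the procedural and nonprocedural styles, can be sketched as follows. The example uses Python's built-in sqlite3 module; the account table and its sample rows are assumptions made up for the illustration. SQLite keeps its catalog in a table called sqlite_master, which serves here as a stand-in for the data dictionary. The final loop is not a real procedural DML; it only imitates the procedural style, in which the program spells out how to find the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, including an integrity constraint.
conn.execute(
    "CREATE TABLE account ("
    " account_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)

# The definition is recorded in the catalog (metadata: data about data).
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# DML: insert data.
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-102", 15000), ("A-103", 8000)],
)

# Nonprocedural style: state WHAT is wanted; the DBMS decides how.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_no FROM account WHERE balance < ? ORDER BY account_no",
        (10000,),
    )
]

# Procedural style (imitated): state HOW, by visiting and testing each row.
procedural = []
for account_no, balance in conn.execute(
    "SELECT account_no, balance FROM account ORDER BY account_no"
):
    if balance < 10000:
        procedural.append(account_no)
```

Changing the limit from 10000 to 20000 in the nonprocedural query only requires a different parameter value, not a new program.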
POINTS TO PONDER
Database systems are ubiquitous today, and most people interact with a database, directly or indirectly, many times every day.
Database systems are designed to store large bodies of information.
A major purpose of a DBMS is to provide users with an abstract view of data, i.e. the system hides how the data is stored and maintained.
The structure of a database is defined through a DDL and its data is manipulated through a DML.
DDL statements are compiled, resulting in a set of tables stored in a special file called a data dictionary or data directory.
A query language is the portion of a DML that involves information retrieval only. The terms DML and query language are often used synonymously.
REVIEW TERMS
Data Security
Data Views
Data Abstraction
Physical level
Logical level
View level
Database language
DDL
DML
Query language
STUDENTS ACTIVITY
1) Define data abstraction.
2) How many levels of data abstraction are there? Explain each in detail.
3) Explain database languages. Differentiate between DDL and DML.
Chapter 3
DATABASE CONCEPTS
Learning objectives: After reading and studying this chapter you should be able to:
Define data dictionary
Define metadata
Explain database schema
Distinguish different types of schema: physical schema, logical schema, subschema
Define database instance
Explain the term metadata
Explain data independence
Differentiate logical data independence and physical data independence
Data Repository
All data in the database reside in a data repository. This is the data storage unit where the physical data files are kept. The data repository contains the physical data. It is usually a central place of storage for the data content.

Data Dictionary
The data repository contains the actual data. Let us say that you want to keep data about the customers of your company in your database. The structure of a customer's data could include fields such as customer name, customer address, city, state, phone number, and so on. Data about a particular customer could be as follows in the respective fields: John Bello/1234 Lagos Street/Minna/Niger/08056342345. There are two aspects of the data about customers. One aspect is the structure of the data, consisting of the field names, field sizes, data types, and so on. The other aspect is the actual data for each customer, consisting of the actual data values in the various fields. The first part, relating to the structure, resides separately in storage and is called the data dictionary or data catalog. A data dictionary contains the structures of the various data elements in the database. It also contains the relationships among data elements. The other part, relating to the actual data about individual customers, resides in the data repository. The data dictionary and the data repository work together to provide information to users.

Database Software
Are Oracle and Informix databases? Oracle and Informix are really the software that manages data. They are database software, or database management systems (DBMS). Database software supports the storing, retrieving, and updating of data in a database. Database software is not the database itself. The software helps you store, manage, and protect the data in a database.

Data Abstraction
Consider the example of customer data again. Data about each customer consist of several fields such as customer name, street address, city, state, phone number, credit status, and so on. We can look at customer data at three levels. The customer service representative looks at the customer from his or her own point of view, as consisting of only the fields that are of interest to the representative. This may be just customer name, phone number, and credit status. This is one level. The next level is the structure of the complete set of fields in customer data. This level is of interest to the database designer and application programmer. Another level is of interest to the database administrator, who is responsible for designing the physical layout for storing the data in files on disk storage.

Now go through the three levels. The customer service representative is interested only in what he or she needs from customer data, not in the entire set of fields or in how the data is physically stored on disk. The complexities of the other two levels may be hidden from the customer service representative. Similarly, the physical level of how the data is stored on disk may be hidden from the application programmer. Only the database administrator is interested in all three levels. This concept is the abstraction of data: the ability to hide the complexities of data design at the levels where they are not required. The database approach provides for data abstraction.

Data Access
The database approach includes the fundamental operations that can be applied to data.
Every database management system provides for the following basic operations:
READ data contained in the database
ADD data to the database
UPDATE individual parts of the data in the database
DELETE portions of the data in the database
Database practitioners refer to these operations by the acronym CRUD:
C - Create or add data
R - Read data
U - Update data
D - Delete data
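The four CRUD operations can be tried out with a minimal sketch such as the one below. It assumes Python's built-in sqlite3 module and an in-memory database; the table name `customer` and its columns are illustrative choices, not part of these notes.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, phone TEXT)")

# C - Create: add a row
conn.execute("INSERT INTO customer VALUES (?, ?, ?)",
             ("John Bello", "Minna", "08056342345"))

# R - Read: fetch the stored data
rows_after_create = conn.execute("SELECT name, city FROM customer").fetchall()

# U - Update: change one field of an existing row
conn.execute("UPDATE customer SET city = ? WHERE name = ?",
             ("Lagos", "John Bello"))
city_after_update = conn.execute(
    "SELECT city FROM customer WHERE name = ?", ("John Bello",)
).fetchone()[0]

# D - Delete: remove the row
conn.execute("DELETE FROM customer WHERE name = ?", ("John Bello",))
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print(rows_after_create, city_after_update, remaining)
```

Each statement maps onto one letter of the acronym; a full DBMS such as Oracle offers the same four operations through its own client interface.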
Transaction Support
Imagine the business function of entering an order from a customer into the computer system. The order entry clerk types in the customer number, the product code, and the quantity ordered. The order entry program reads the customer data and allows the clerk to sight verify the customer data, reads product data and displays the product description, reads inventory data, and finally updates inventory or creates a back order if inventory is insufficient. All these tasks performed by the order entry program to enter a single order comprise a single order entry transaction. When a transaction is initiated it should complete all the tasks and leave the data in the database in a consistent state. That is, if the initial stock is 1000 units and the order is for 25 units, the stock value stored in the database after the transaction is completed must be 975 units. How can this be a problem? See what can happen in the execution of the transaction. First, the transaction may not be able to perform all its tasks because of some malfunction preventing its completion. Second, numerous transactions from different order entry clerks may be simultaneously looking for inventory of the same product. Database technology enables a transaction to complete a task in its entirety or back out intermediary data updates in case of malfunctions preventing completion.
Database schema
The overall structure of a database is called a database schema. A database schema is usually presented graphically, as a diagram of the whole database. Tables are connected through foreign keys (also called external keys) and key columns. When accessing data from several tables, the database schema is needed in order to find the joining data elements and, in complex cases, the proper intermediate tables. Some database products use the schema to join the tables automatically. A database system has several schemas according to the level of abstraction. The physical schema describes the database design at the physical level. The logical schema describes the database design at the logical level. A database can also have sub-schemas (view level) that describe different views of the database.
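The all-or-nothing behaviour described under Transaction Support can be sketched as follows. This is an illustration only, assuming Python's built-in sqlite3 module; the `inventory` table, the product code `P1`, and the `enter_order` helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (product_code TEXT PRIMARY KEY, stock INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES ('P1', 1000)")
conn.commit()


def enter_order(conn, product_code, quantity, fail_midway=False):
    """Reduce stock for one order; back out the update if any step fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE product_code = ?",
                (quantity, product_code),
            )
            if fail_midway:
                # Stand-in for a malfunction before the transaction completes.
                raise RuntimeError("simulated malfunction")
        return True
    except RuntimeError:
        return False


def stock(conn, product_code):
    return conn.execute(
        "SELECT stock FROM inventory WHERE product_code = ?", (product_code,)
    ).fetchone()[0]


completed = enter_order(conn, "P1", 25)
stock_after_success = stock(conn, "P1")

interrupted = enter_order(conn, "P1", 25, fail_midway=True)
stock_after_failure = stock(conn, "P1")

print(stock_after_success, stock_after_failure)
```

The first order leaves the stock at 975, as in the example in the text. The second order fails part-way, so its update is backed out and the stock is unchanged.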
Database Instance
1. Databases change over time.
2. The information in a database at a particular point in time is called an instance of the database.
3. Analogy with programming languages:
Data type definition - scheme
Value of a variable - instance
Meta-Data
Meta-data is definitional data that provides information about or documentation of other data managed within an application or environment. For example, meta-data would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about data (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data.
Data Independence
1. The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.
2. There are two kinds:
Physical data independence: the ability to modify the physical scheme without causing application programs to be rewritten. Modifications at this level are usually made to improve performance.
Logical data independence: the ability to modify the conceptual scheme without causing application programs to be rewritten. This is usually done when the logical structure of the database is altered.
3. Logical data independence is harder to achieve, as the application programs are usually heavily dependent on the logical structure of the data. An analogy is made to abstract data types in programming languages.
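The difference between scheme (meta-data) and instance, and the idea of physical data independence, can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module; `PRAGMA table_info` is SQLite's own way of reading its catalog, and the `customer` table and index name are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, phone TEXT)")


def describe(conn):
    # Each catalog row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]


# The scheme (meta-data) stays the same while the instance changes.
schema_before = describe(conn)
conn.execute("INSERT INTO customer VALUES ('John Bello', 'Minna', '08056342345')")
schema_after = describe(conn)
instance = conn.execute("SELECT * FROM customer").fetchall()

# Physical data independence: adding an index changes how the data is
# stored and accessed, but the application's query is not rewritten.
query = "SELECT name FROM customer WHERE city = 'Minna'"
rows_before_index = conn.execute(query).fetchall()
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
rows_after_index = conn.execute(query).fetchall()

print(schema_after, instance, rows_after_index)
```

The scheme read from the catalog is identical before and after the insert, while the instance grows by one row. The same query returns the same answer before and after the index is created.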
POINTS TO PONDER
A data dictionary holds the definitions (structures) of the data elements in the database and the relationships among them.
Database schema is the overall structure of a database.
A database instance is the information held in a database at a particular point in time.
Meta-data is data about data.
The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.
REVIEW TERMS
Database Instance Schema Database Schema Physical schema Logical schema Physical data independence Database Language DDL DML Query Language
Data dictionary Metadata
STUDENT ACTIVITY
1) What is the difference between a database schema and a database instance?
2) What do you understand by the structure of a database?
3) Define physical schema and logical schema.
4) Define data independence. Explain the types of data independence.
5) Define data dictionary and meta-data.
6) Define the various elements of a data dictionary.
STUDENTS NOTES
Chapter 4
DATABASE ARCHITECTURE
Learning objectives: After reading and studying this chapter you should be able to:
Differentiate between database manager and database administrator
State the different types of Database user
Explain Role of Database administrator
Explain Roles of Database users
Describe the Database architecture
4.1 DATABASE MANAGER
The database manager is a program module which provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
1. Databases typically require lots of storage space (gigabytes). This must be stored on disks. Data is moved between disk and main memory (MM) as needed.
2. The goal of the database system is to simplify and facilitate access to data. Performance is important. Views provide simplification.
3. So the database manager module is responsible for:
Interaction with the file manager: Storing raw data on disk using the file system usually provided by a conventional operating system. The database manager must translate DML statements into low-level file system commands (for storing, retrieving and updating data in the database).
Integrity enforcement: Checking that updates in the database do not violate consistency constraints (e.g. no bank account balance below $25).
Security enforcement: Ensuring that users only have access to information they are permitted to see.
Backup and recovery: Detecting failures due to power failure, disk crash, software errors, etc., and restoring the database to its state before the failure.
Concurrency control: Preserving data consistency when there are concurrent users.
4. Some small database systems may miss some of these features, resulting in simpler database managers. (For example, no concurrency is required on a PC running MS-DOS.) These features are necessary on larger systems.
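The integrity-enforcement duty of the database manager can be sketched with the bank-balance rule quoted above. This is a minimal illustration assuming Python's built-in sqlite3 module; the `account` table, the account number `A-1`, and the `withdraw` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The consistency constraint is declared once, alongside the table definition.
conn.execute(
    "CREATE TABLE account ("
    " account_no TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES ('A-1', 100)")
conn.commit()


def withdraw(conn, account_no, amount):
    """Attempt a withdrawal; the database rejects it if the rule is violated."""
    try:
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_no = ?",
                (amount, account_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = withdraw(conn, "A-1", 50)   # leaves 50, which satisfies the rule
rejected = withdraw(conn, "A-1", 40)   # would leave 10, below the limit
balance = conn.execute(
    "SELECT balance FROM account WHERE account_no = 'A-1'"
).fetchone()[0]

print(accepted, rejected, balance)
```

The check is made by the database software on every update, so no application program has to repeat the rule in its own code.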
4.2 DATABASE ADMINISTRATOR
The database administrator is a person having central control over data and programs accessing that data. Duties of the database administrator include:
Scheme definition: the creation of the original database scheme. This involves writing a set of definitions in a DDL (data definition language), compiled by the DDL compiler into a set of tables stored in the data dictionary.
Storage structure and access method definition: writing a set of definitions translated by the data storage and definition language compiler.
Scheme and physical organization modification: writing a set of definitions used by the DDL compiler to generate modifications to appropriate internal system tables (e.g. data dictionary). This is done rarely, but sometimes the database scheme or physical organization must be modified.
Granting of authorization for data access: granting different types of authorization for data access to various users.
Integrity constraint specification: generating integrity constraints. These are consulted by the database manager module whenever updates occur.
4.3 DATABASE USERS
The database users fall into several categories:
Application programmers are computer professionals interacting with the system through DML calls embedded in a program written in a host language (e.g. C, PL/1, Pascal). These programs are called application programs. The DML precompiler converts DML calls (prefaced by a special character like $, #, etc.) to normal procedure calls in a host language. The host language compiler then generates the object code. Some special types of programming languages combine Pascal-like control structures with control structures for the manipulation of a database. These are sometimes called fourth-generation languages. They often include features to help generate forms and display data.
Sophisticated users interact with the system without writing programs. They form requests by writing queries in a database query language.
These requests are submitted to a query processor that breaks a DML statement down into instructions for the database manager module.
Specialized users are sophisticated users writing special database application programs. These may be CADD systems, knowledge-based and expert systems, complex data systems (audio/video), etc.
Naive users are unsophisticated users who interact with the system by using permanent application programs (e.g. automated teller machine).
4.4 DATABASE SYSTEM ARCHITECTURE
Database systems are partitioned into modules for different functions, as shown in Figure 4.1. Some functions (e.g. file systems) may be provided by the operating system. Components include:
File manager: manages allocation of disk space and data structures used to represent information on disk.
Database manager: the interface between low-level data and application programs and queries.
Query processor: translates statements in a query language into low-level instructions the database manager understands. (May also attempt to find an equivalent but more efficient form.)
DML precompiler: converts DML statements embedded in an application program to normal procedure calls in a host language. The precompiler interacts with the query processor.
DDL compiler: converts DDL statements to a set of tables containing metadata stored in a data dictionary.
In addition, several data structures are required for physical system implementation:
Data files: store the database itself.
Data dictionary: stores information about the structure of the database. It is used heavily. Great emphasis should be placed on developing a good design and efficient implementation of the dictionary.
Indices: provide fast access to data items holding particular values.
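The role of an index can be sketched in a few lines of Python. This is a toy illustration, not how any particular DBMS stores its indices: the "data file" is a plain list of hypothetical records, and the index is a dictionary from a value to the positions of the records holding that value.

```python
from collections import defaultdict

# Toy "data file": (account number, city) records. All values are made up.
data_file = [
    ("A-1", "Minna"),
    ("A-2", "Lagos"),
    ("A-3", "Minna"),
    ("A-4", "Enugu"),
]


def scan(records, city):
    """Without an index: examine every record in the data file."""
    return [pos for pos, (_, c) in enumerate(records) if c == city]


def build_index(records):
    """With an index: map each city to the positions of its records."""
    index = defaultdict(list)
    for pos, (_, c) in enumerate(records):
        index[c].append(pos)
    return index


city_index = build_index(data_file)
via_scan = scan(data_file, "Minna")
via_index = city_index.get("Minna", [])

print(via_scan, via_index)
```

Both approaches find the same records. The scan reads the whole file on every lookup, while the index is built once and then goes straight to the matching positions.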
Figure 4.1: The Database Architecture
POINTS TO PONDER
Database manager is a program module which provides the interface between the low-level data stored in the database and the application programs.
Database administrator is a person having central control over data.
Database user is a person who accesses the database at various levels.
Data files: store the database itself.
Data dictionary: stores information about the structure of the database.
DML precompiler converts DML statements embedded in an application program to normal procedure calls in a host language.
File manager manages allocation of disk space and data structures used to represent information on disk.
REVIEW TERMS
Database Instance Schema Database Schema Physical schema Logical schema Database Administrator Database User
STUDENT ACTIVITY
1) What are the various kinds of database users? 2) What do you understand by the structure of a database? 3)Define physical schema and logical schema? 4)Define file manager, DML precompiler, data files? STUDENTS NOTES
Chapter 5
DATA MODELS
Learning objectives: After reading and studying this chapter you should be able to:
Define data models
Explain Different types of data models: Hierarchical data model, Network data model, Relational model
State the properties of relational tables
5.1 Data models are a collection of conceptual tools for describing data, data relationships, data semantics and data constraints. A data model is a "description" of both a container for data and a methodology for storing and retrieving data from that container. Actually, there isn't really a data model "thing". Data models are abstractions, oftentimes mathematical algorithms and concepts. You cannot really touch a data model. But nevertheless, they are very useful. The analysis and design of data models has been the cornerstone of the evolution of databases. As models have advanced so has database efficiency. There are various kinds of data models, i.e. in a database records can be arranged in various ways. The various ways in which data can be represented are:
1) Hierarchical data model
2) Network data model
3) Relational Model
4) E-R Model
5.2 The Hierarchical Model
Organization of the records is as a collection of trees. As its name implies, the Hierarchical Database Model defines hierarchically-arranged data. Perhaps the most intuitive way to visualize this type of relationship is by visualizing an upside down tree of data. In this tree, a single table acts as the "root" of the database from which other tables "branch" out. You will be instantly familiar with this relationship because that is how all Windows-based directory management systems (like Windows Explorer) work these days.
Relationships in such a system are thought of in terms of children and parents such that a child may only have one parent but a parent can have multiple children. Parents and children are tied together by links called "pointers" (perhaps physical addresses inside the file system). A parent will have a list of pointers to each of their children. For example, we might create a structure in which each course has a number of students, and each student is given marks for assignments. However, as you can imagine, the hierarchical database model has some serious problems. For one, you cannot add a record to a child table until it has already been incorporated into the parent table. This might be troublesome if, for example, you wanted to add a student who had not yet signed up for any courses. Worse yet, the hierarchical database model still creates repetition of data within the database. You might imagine that in the course structure just described, there may be a higher level that includes multiple courses. In this case, there could be redundancy because students would be enrolled in several courses and thus each "course tree" would have redundant student information. Redundancy would occur because hierarchical databases handle one-to-many relationships well but do not handle many-to-many relationships well. This is because a child may only have one parent. However, in many cases you will want to have the child be related to more than one parent. For instance, the relationship between student and class is "many-to-many". Not only can a student take many subjects but a subject may also be taken by many students. How would you model this relationship simply and efficiently using a hierarchical database? The answer is that you wouldn't. Though this problem can be solved with multiple databases creating logical links between children, the fix is very kludgy and awkward.
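The redundancy problem can be seen in a small sketch of a hierarchy in which every record has at most one parent. The `Record` class and the course and student names below are illustrative assumptions only.

```python
class Record:
    """A record in a hierarchy: at most one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []  # the parent's list of "pointers" to its children
        if parent is not None:
            parent.children.append(self)


root = Record("college")
database_course = Record("Database Management", parent=root)
software_course = Record("Software Engineering", parent=root)

# A student taking two courses cannot be one shared child, because a child
# has only one parent. The student record is therefore stored twice.
Record("Mohammed Kabiru", parent=database_course)
Record("Mohammed Kabiru", parent=software_course)
Record("Esther Maidawa", parent=database_course)


def count_copies(node, name):
    """Count how many records in the tree carry the given name."""
    own = 1 if node.name == name else 0
    return own + sum(count_copies(child, name) for child in node.children)


copies = count_copies(root, "Mohammed Kabiru")
print(copies)
```

The same student appears once under each course tree, which is exactly the repetition of data that the many-to-many case forces on a hierarchical design.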
Faced with these serious problems, the computer brains of the world got together and came up with the network model. 5.3 Network Databases In many ways, the Network Database model was designed to solve some of the more serious problems with the Hierarchical Database Model. Specifically, the Network model solves the problem of data redundancy by representing relationships in terms of sets rather than hierarchy. The model had its origins in the Conference on Data Systems
Languages (CODASYL) which had created the Data Base Task Group to explore and design a method to replace the hierarchical model. The network model is actually very similar to the hierarchical model. In fact, the hierarchical model is a subset of the network model. However, instead of using a single-parent tree hierarchy, the network model uses set theory to provide a tree-like hierarchy with the exception that child tables were allowed to have more than one parent. This allowed the network model to support many-to-many relationships. Visually, a Network Database looks like a hierarchical Database in that you can see it as a type of tree. However, in the case of a Network Database, the look is more like several trees which share branches. Thus, children can have multiple parents and parents can have multiple children. Nevertheless, though it was a dramatic improvement, the network model was far from perfect. Most profoundly, the model was difficult to implement and maintain. Most implementations of the network model were used by computer programmers rather than real users. What was needed was a simple model which could be used by real end users to solve real problems.
5.4 Relational Model
The relational model was formally introduced by Dr. E. F. Codd in 1970 and has evolved since then, through a series of writings. The model provides a simple, yet rigorously defined, concept of how users perceive data. A relational database is a collection of two-dimensional tables. The organization of data into relational tables is known as the logical view of the database. That is, the form in which a relational database presents data to the user and the programmer. The way the database software physically stores the data on a computer disk system is called the internal view. The internal view differs from product to product and does not concern us here.
A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints. In such a database the data and relations between them are organised in tables. A table is a collection of records and each record in a table contains the same fields.
Properties of Relational Tables:
Values Are Atomic
Each Row is Unique
Column Values Are of the Same Kind
The Sequence of Columns is Insignificant
The Sequence of Rows is Insignificant
Each Column Has a Unique Name
Certain fields may be designated as keys, which means that searches for specific values of that field will use indexing to speed them up. Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields. Often, but not always, the fields will have the same name in both tables. For example, an "orders" table might contain (customer-ID, product-code) pairs and a "products" table might contain (product-code, price) pairs, so to calculate a given customer's bill you would sum the prices of all products ordered by that customer by joining on the product-code fields of the two tables. This can be extended to joining multiple tables on multiple fields. Because these relationships are only specified at retrieval time, relational databases are classed as dynamic database management systems. The RELATIONAL database model is based on the Relational Algebra. A basic understanding of the relational model is necessary to effectively use relational database software such as Oracle, Microsoft SQL Server, or even personal database systems such as Access or Fox, which are based on the relational model.
POINTS TO PONDER
Data models are a collection of conceptual tools for describing data, data relationships, data semantics and data constraints.
Types of data models are:
1. Hierarchical data model
2. Network data model
3. Relational Model
4. E-R Model
The Hierarchical Database Model defines hierarchically-arranged data.
Network model solves the problem of data redundancy by representing relationships in terms of sets.
The relational model is the most popular model in use REVIEW TERMS Data models Hierarchical data model Network data model Relational data model STUDENTS ACTIVITY 1)Define data models? 2)Define hierarchical data model? 3)Define network data model? 4)Define relational data model? 5) State the properties of relational table STUDENTS NOTES
Chapter 6
RELATIONAL DATABASE MANAGEMENT SYSTEM
Learning objectives: After reading and studying this chapter you should be able to:
Understand RDBMS Understand data structures Understand data manipulation Understand various relational algebra operations Understand data integrity
The relational model was proposed by E. F. Codd in 1970. It deals with database management from an abstract point of view. The model provides specifications of an abstract database management system. To use the database management systems based on the relational model however, users do not need to master the theoretical foundations. Codd defined the model as consisting of the following three components: 1. Data Structure - a collection of data structure types for building the database. 2. Data Manipulation - a collection of operators that may be used to retrieve, derive or modify data stored in the data structures. 3. Data Integrity - a collection of rules that implicitly or explicitly define a consistent database state or changes of states
Data Structure
Often the information that an organisation wishes to store in a computer and process is complex and unstructured. For example, we may know that a department in a university has 200 students, most are full-time with an average age of 22 years, and most are females. Since natural language is not a good language for machine processing, the information must be structured for efficient processing. In the relational model the information is structured in a very simple way. We consider the following database to illustrate the basic concepts of the relational data model.
The above database could be mapped into the following relational schema, which consists of three relation schemes. Each relation scheme presents the structure of a relation by specifying its name and the names of its attributes enclosed in parentheses. Often the primary key of a relation is marked by underlining:
student(student_id, student_name, address)
enrolment(student_id, subject_id)
subject(subject_id, subject_name, department)
An example of a database based on the above relational model is:

Student_id   Student_name       Address
8656789      Peter Bello        Kubwa, Abuja
8700074      Esther Maidawa     Tunga Magajiya, Niger
8900020      Mohammed Kabiru    Ungwar Makera, Minna
8801234      Ngozi Kanu         Sabo Yaba, Lagos
8654321      Panam Bitrus       Makurdi, Benue
8712374      Kolade Okeowo      Abeokuta, Ogun
8612345      Uche Agu           Night Mile, Enugu
The relation student

Student_id   Subject_id
8700074      CP302
8900020      CP302
8900020      CP304
8700074      MA111
8801234      CP302
8801234      CH001
The relation enrolment

Subject_id   Subject_name                Department
CP302        Database Management         Comp. Science
CP304        Software Engineering        Comp. Science
CH001        Introduction to Chemistry   Chemistry
PH101        Physics                     Physics
MA111        Pure Mathematics            Mathematics
The relation subject
We list a number of properties of relations:
1. Each relation contains only one record type.
2. Each relation has a fixed number of columns that are explicitly named. Each attribute name within a relation is unique.
3. No two rows in a relation are the same.
4. Each item or element in the relation is atomic, that is, in each row, every attribute has only one value that cannot be decomposed and therefore no repeating groups are allowed.
5. Rows have no ordering associated with them.
6. Columns have no ordering associated with them (although most commercially available systems do impose one).
The above properties are simple and based on practical considerations. The first property ensures that only one type of information is stored in each relation. The second property involves naming each column uniquely. This has several benefits. The names can be chosen to convey what each column is and the names enable one to distinguish between the column and its domain. Furthermore, the names are much easier to remember than the position of each column if the number of columns is large. The third property of not having duplicate rows appears obvious but is not always accepted by all users and designers of DBMS. The property is essential since no sensible context-free meaning can be assigned to a number of rows that are exactly the same.
The next property requires that each element in each relation be atomic, that is, that it cannot be decomposed into smaller pieces. In the relational model, the only composite or compound type (data that can be decomposed into smaller pieces) is a relation. This simplicity of structure leads to relatively simple query and manipulative languages. The relation is a set of tuples and is closely related to the concept of relation in mathematics. Each row in a relation may be viewed as an assertion. For example, the relation student asserts that a student by the name of Panam Bitrus has student_id 8654321 and lives at Makurdi, Benue. Similarly the relation subject asserts that one of the subjects offered by the Department of Computer Science is CP302 Database Management. In the relational model, a relation is the only compound data structure since relations do not allow repeating groups or pointers. We now define the relational terminology:
Relation - essentially a table
Tuple - a row in the relation
Attribute - a column in the relation
Degree of a relation - number of attributes in the relation
Cardinality of a relation - number of tuples in the relation
Domain - a set of values that an attribute is permitted to take. The same domain may be used by a number of different attributes.
Primary key - as discussed in the last chapter, each relation must have an attribute (or a set of attributes) that uniquely identifies each tuple. Each such attribute (or set of attributes) is called a candidate key of the relation if it satisfies the following properties:
(a) the attribute or the set of attributes uniquely identifies each tuple in the relation (called uniqueness), and
(b) if the key is a set of attributes then no subset of these attributes has property (a) (called minimality).
There may be several distinct sets of attributes that may serve as candidate keys. One of the candidate keys is arbitrarily chosen as the primary key of the relation. The three relations above, student, enrolment and subject, have degree 3, 2 and 3 respectively and cardinality 7, 6 and 5 respectively. The primary key of the relation student is student_id, of relation enrolment is (student_id, subject_id), and finally the primary key of relation subject is subject_id. The relation student probably has another candidate key. If we can assume the names to be unique then student_name is a candidate key. If the names are not unique but the names and addresses together are unique, then the pair of attributes (student_name, address) is a candidate key. Note that student_name and (student_name, address) cannot both be candidate keys: if student_name alone is unique, the pair fails the minimality property. Similarly, for the relation subject, subject_name would be a candidate key if the subject names are unique. The relational model is the most popular data model for commercial data processing applications. Its simplicity reduces the programmer's work.
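The uniqueness and minimality properties can be checked mechanically on the student relation, as in the sketch below. The helper names `is_unique` and `is_candidate_key` are hypothetical. Note the assumption involved: the check looks only at the rows currently stored, whereas a real key is a rule that must hold for every valid state of the relation.

```python
from itertools import combinations

ATTRS = ("student_id", "student_name", "address")
student = [
    (8656789, "Peter Bello", "Kubwa, Abuja"),
    (8700074, "Esther Maidawa", "Tunga Magajiya, Niger"),
    (8900020, "Mohammed Kabiru", "Ungwar Makera, Minna"),
    (8801234, "Ngozi Kanu", "Sabo Yaba, Lagos"),
    (8654321, "Panam Bitrus", "Makurdi, Benue"),
    (8712374, "Kolade Okeowo", "Abeokuta, Ogun"),
    (8612345, "Uche Agu", "Night Mile, Enugu"),
]


def is_unique(rows, attrs, key):
    """Property (a): no two rows agree on all attributes of the key."""
    positions = [attrs.index(a) for a in key]
    projected = [tuple(row[p] for p in positions) for row in rows]
    return len(set(projected)) == len(projected)


def is_candidate_key(rows, attrs, key):
    """Properties (a) and (b): unique, and no proper subset is unique."""
    if not is_unique(rows, attrs, key):
        return False
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if is_unique(rows, attrs, subset):
                return False
    return True


degree = len(ATTRS)
cardinality = len(student)
id_is_key = is_candidate_key(student, ATTRS, ("student_id",))
name_is_key = is_candidate_key(student, ATTRS, ("student_name",))
pair_is_key = is_candidate_key(student, ATTRS, ("student_name", "address"))

print(degree, cardinality, id_is_key, name_is_key, pair_is_key)
```

In this instance the names happen to be unique, so student_name passes the test and the pair (student_name, address) is rejected as not minimal, which is the situation described in the note above.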
Data Manipulation
The manipulative part of the relational model makes set processing (or relational processing) facilities available to the user. Since relational operators are able to manipulate whole relations, the user does not need to use loops in the application programs. Avoiding loops can result in a significant increase in the productivity of application programmers. The primary purpose of a database in an enterprise is to provide information to the various users in the enterprise. The process of querying a relational database is in essence a way of manipulating the relations that make up the database. For example, one may wish to know
1. names of all students enrolled in CP302, or
2. names of all subjects taken by Esther Maidawa.
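As a rough illustration of set processing, the first query above can be written as one expression over whole relations instead of a hand-written record-by-record program. This is only a sketch in Python: the relations are modelled as sets of tuples, and the student numbers and enrolments shown are hypothetical sample data.

```python
# Hypothetical sample data: each relation is a set of tuples.
student = {(8654321, "Panam Bitrus"), (8700074, "Esther Maidawa"), (8711374, "Musa Ali")}
enrolment = {(8654321, "CP302"), (8700074, "CP302"), (8700074, "CP304"), (8711374, "CP304")}

# Query 1: names of all students enrolled in CP302.
# The result is itself a set, so each name appears at most once.
names_in_cp302 = {
    name
    for (student_id, name) in student
    for (enrolled_id, subject_id) in enrolment
    if student_id == enrolled_id and subject_id == "CP302"
}

print(sorted(names_in_cp302))
```

The expression says what is wanted (matching student numbers, subject CP302) and leaves the order of evaluation to the system, which is the idea behind relational processing.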
The Relational Algebra
The relational algebra is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation as their result. The fundamental operations in the relational algebra are select, project, union, set difference, Cartesian product, and rename. In addition to the fundamental operations, there are several other operations, namely set intersection, natural join, division, and assignment. We will define these operations in terms of the fundamental operations.
Fundamental Operations
The select, project, and rename operations are called unary operations, because they operate on one relation. The other three operations operate on pairs of relations and are, therefore, called binary operations. The operations are described as follows:
The Select Operation
The select operation selects tuples that satisfy a given condition. Selection is denoted by the lowercase Greek letter sigma (σ). The condition (predicate) appears as a subscript to σ, and the argument relation is in parentheses after the σ. Thus, to select those tuples of the loan relation where the branch is Zungeru, we write
σ_{branch-name = "Zungeru"}(loan)
The result of this query is a new relation that contains only the Zungeru tuples of loan. We can find all tuples in which the amount lent is more than N1,200,000 by writing
σ_{amount > 1200000}(loan)
In general, we allow comparisons using =, ≠, <, ≤, >, ≥ in the selection predicate. Furthermore, we can combine several predicates into a larger predicate by using the connectives and (∧), or (∨), and not (¬). Thus, to find those tuples pertaining to loans of more than N1,200,000 made by the Zungeru branch (of, say, Unity Bank), we write
σ_{branch-name = "Zungeru" ∧ amount > 1200000}(loan)
loan-number   branch-name   amount
L-15          Zungeru       1500000
L-16          Zungeru       1300000
The selection predicate may include comparisons between two attributes. To illustrate, consider the relation loan-officer that consists of three attributes: customer-name, banker-name, and loan-number, which specifies that a particular banker is the loan officer for a loan that belongs to some customer. To find all customers who have the same name as their loan officer, we can write
σ_{customer-name = banker-name}(loan-officer)
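The select operation can be sketched in a few lines of Python. This is an illustration only, not part of the original notes: the function name select is an assumption, the loan numbers and amounts follow the loan relation used in these notes, and every branch name other than Zungeru is a hypothetical placeholder.

```python
# The loan relation as a list of dicts, one dict per tuple.
# Branch names other than Zungeru are hypothetical placeholders.
loan = [
    {"loan_number": "L-11", "branch_name": "Bida", "amount": 900000},
    {"loan_number": "L-14", "branch_name": "Minna", "amount": 1500000},
    {"loan_number": "L-15", "branch_name": "Zungeru", "amount": 1500000},
    {"loan_number": "L-16", "branch_name": "Zungeru", "amount": 1300000},
    {"loan_number": "L-17", "branch_name": "Minna", "amount": 1000000},
    {"loan_number": "L-23", "branch_name": "Suleja", "amount": 2000000},
    {"loan_number": "L-93", "branch_name": "Kontagora", "amount": 500000},
]


def select(relation, predicate):
    """Select: keep exactly the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


# Selection on the branch name
zungeru = select(loan, lambda t: t["branch_name"] == "Zungeru")

# Selection on the amount
large = select(loan, lambda t: t["amount"] > 1200000)

# Two predicates combined with the connective "and"
large_zungeru = select(
    loan, lambda t: t["branch_name"] == "Zungeru" and t["amount"] > 1200000
)

for row in large_zungeru:
    print(row["loan_number"], row["branch_name"], row["amount"])
```

The predicate is passed as a function, so any combination of comparisons and the connectives and, or, not can be used without changing select itself.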
Projection
Projection is the operation of selecting certain attributes from a relation R to form a new relation S. For example, one may only be interested in the list of names from a relation that has a number of other attributes. The projection operator, denoted by the uppercase Greek letter pi (Π), may then be used. Like selection, projection is a unary operator. The attributes to keep appear as a subscript to Π. For example, to list all loan numbers and amounts we write
Π_{loan-number, amount}(loan)
loan-number   amount
L-11          900000
L-14          1500000
L-15          1500000
L-16          1300000
L-17          1000000
L-23          2000000
L-93          500000
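A possible Python sketch of projection is given below; it is not taken from the original notes and the function name project is an assumption. Because a relation is a set of tuples, the sketch removes duplicate rows that appear once the other attributes are dropped.

```python
# Loan numbers and amounts as listed in the projection result in these notes.
rows = [
    ("L-11", 900000), ("L-14", 1500000), ("L-15", 1500000), ("L-16", 1300000),
    ("L-17", 1000000), ("L-23", 2000000), ("L-93", 500000),
]
loan = [{"loan_number": number, "amount": amount} for number, amount in rows]


def project(relation, attrs):
    """Project: keep only the attributes in attrs and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


numbers_and_amounts = project(loan, ["loan_number", "amount"])
amounts_only = project(loan, ["amount"])

print(len(numbers_and_amounts), len(amounts_only))
```

Projecting on amount alone gives fewer rows than the original relation, because the amount 1500000 occurs twice and is kept only once.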
Composition of Relational Operations
The fact that the result of a relational operation is itself a relation is important. Consider the more complicated query "Find the names of those customers who live in Minna." We write:
Π_{customer-name}(σ_{customer-city = "Minna"}(customer))
Notice that, instead of giving the name of a relation as the argument of the projection operation, we give an expression that evaluates to a relation. In general, since the result of a relational-algebra operation is of the same type (relation) as its inputs, relational-algebra operations can be composed together into a relational-algebra expression. Composing relational-algebra operations into relational-algebra expressions is just like composing arithmetic operations (such as +, -, *, and /) into arithmetic expressions.
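Composition can be illustrated by nesting two small Python functions, one for selection and one for projection. This is a sketch under stated assumptions: the function names are illustrative and the rows of the customer relation are hypothetical, since the notes do not list them.

```python
# Hypothetical rows for the customer relation.
customer = [
    {"customer_name": "Bello", "customer_city": "Minna"},
    {"customer_name": "Gana", "customer_city": "Kano"},
    {"customer_name": "Danmallam", "customer_city": "Minna"},
]


def select(relation, predicate):
    """Keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Keep only the listed attributes, dropping duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


# The output of select is a relation, so it can be fed directly to project.
minna_customers = project(
    select(customer, lambda t: t["customer_city"] == "Minna"),
    ["customer_name"],
)

print(minna_customers)
```

Both functions take a relation and return a relation, which is what makes the nesting possible.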
Cartesian product
The Cartesian product of two tables combines each row in one table with each row in the other table.
Example: The table E (for EMPLOYEE)
ENR   ENAME              DEPT
121   John Bello         A
111   Esther Danmallam   B
131   Tunde Gana         C
Example: The table D (for DEPARTMENT)
DNUMBER   DNAME
A         Sales
B         Marketing
C         Legal
Result: E × D
ENR   ENAME              DEPT   DNUMBER   DNAME
121   John Bello         A      A         Sales
121   John Bello         A      B         Marketing
121   John Bello         A      C         Legal
111   Esther Danmallam   B      A         Sales
111   Esther Danmallam   B      B         Marketing
111   Esther Danmallam   B      C         Legal
131   Tunde Gana         C      A         Sales
131   Tunde Gana         C      B         Marketing
131   Tunde Gana         C      C         Legal
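On its own, the Cartesian product is seldom useful in practice, and it can give a huge result: a table with m rows combined with a table with n rows yields m × n rows.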
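The E × D example can be reproduced with the standard Python function itertools.product. This is a minimal sketch: the tuples follow the tables E and D above, and the final selection step (keeping rows where DEPT equals DNUMBER) is an added illustration of why the product is normally combined with a selection rather than used alone.

```python
from itertools import product

# Rows of table E: (ENR, ENAME, DEPT)
E = [(121, "John Bello", "A"), (111, "Esther Danmallam", "B"), (131, "Tunde Gana", "C")]

# Rows of table D: (DNUMBER, DNAME)
D = [("A", "Sales"), ("B", "Marketing"), ("C", "Legal")]

# Cartesian product: every row of E paired with every row of D.
E_x_D = [e + d for e, d in product(E, D)]

# Keeping only rows where DEPT (index 2) equals DNUMBER (index 3)
# leaves each employee with his or her own department.
matched = [row for row in E_x_D if row[2] == row[3]]

print(len(E_x_D))
for row in matched:
    print(row)
```

With 3 rows in each table the product has 9 rows, of which only 3 survive the selection.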
The Union Operation
Consider a query to find the names of all bank customers who have either an account or a loan or both. Note that the customer relation does not contain the information, since a customer does not need to have either an account or a loan at the bank. To answer this query, we need the information in the depositor relation and in the borrower relation. We know how to find the names of all customers with a loan in the bank:
Π_{customer-name}(borrower)
We also know how to find the names of all customers with an account in the bank:
Π_{customer-name}(depositor)
To answer the query, we need the union of these two sets; that is, we need all customer names that appear in either or both of the two relations. We find these data by the binary operation union, denoted, as in set theory, by ∪. So the expression needed is
Π_{customer-name}(borrower) ∪ Π_{customer-name}(depositor)
For a union operation r ∪ s to be valid, we require that two conditions hold:
1. The relations r and s must have the same number of attributes.
2. The domains of the ith attribute of r and the ith attribute of s must be the same, for all i.
Note that r and s can be, in general, temporary relations that are the result of relational-algebra expressions.
The Set-Intersection Operation
The first additional relational-algebra operation that we shall define is set intersection (∩). Suppose that we wish to find all customers who have both a loan and an account. Using set intersection, we can write
Π_{customer-name}(borrower) ∩ Π_{customer-name}(depositor)
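Union and intersection map directly onto Python sets. The sketch below is illustrative only: the customer names are hypothetical, the function names are assumptions, and the compatibility check uses the Python type of each value as a simple stand-in for the domain of an attribute.

```python
# Hypothetical results of projecting customer-name from borrower and depositor.
borrower_names = {("Bello",), ("Gana",), ("Maidawa",)}
depositor_names = {("Gana",), ("Bitrus",), ("Maidawa",)}


def compatible(r, s):
    """True if all tuples of r and s have the same arity and value types.

    The Python type of a value stands in for the attribute's domain.
    """
    shapes = {tuple(type(v) for v in t) for t in r} | {
        tuple(type(v) for v in t) for t in s
    }
    return len(shapes) <= 1


def union(r, s):
    """r union s, refused when the two relations are not compatible."""
    if not compatible(r, s):
        raise ValueError("relations are not union-compatible")
    return r | s


def intersection(r, s):
    """r intersect s, with the same compatibility requirement."""
    if not compatible(r, s):
        raise ValueError("relations are not compatible")
    return r & s


either = union(borrower_names, depositor_names)
both = intersection(borrower_names, depositor_names)

print(sorted(either))
print(sorted(both))
```

Trying to take the union of a one-attribute relation and a two-attribute relation, for example union(borrower_names, {("L-11", 900000)}), raises ValueError, which mirrors condition 1 above.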
Note that we can rewrite any relational-algebra expression that uses set intersection by replacing the intersection operation with a pair of set-difference operations as:
r ∩ s = r - (r - s)
Thus, set intersection is not a fundamental operation and does not add any power to the relational algebra. It is simply more convenient to write r ∩ s than to write r - (r - s).
The Set Difference Operation
The set-difference operation, denoted by -, allows us to find tuples that are in one relation but are not in another. The expression r - s produces a relation containing those tuples in r but not in s. We can find all customers of the bank who have an account but not a loan by writing
Π_{customer-name}(depositor) - Π_{customer-name}(borrower)
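Both the set-difference query and the identity r ∩ s = r - (r - s) can be tried out with Python sets. The customer names below are hypothetical sample data, used only to make the sketch runnable.

```python
# Hypothetical results of projecting customer-name from depositor and borrower.
depositor_names = {("Gana",), ("Bitrus",), ("Maidawa",)}
borrower_names = {("Bello",), ("Gana",), ("Maidawa",)}

# Customers who have an account but not a loan.
account_only = depositor_names - borrower_names

# Intersection expressed with two set differences: r - (r - s)
r, s = depositor_names, borrower_names
intersection_via_difference = r - (r - s)

print(sorted(account_only))
print(intersection_via_difference == (r & s))
```

The inner difference r - s removes from r everything that is also in s; subtracting that from r leaves exactly the tuples the two relations share.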
As with the union operation, we must ensure that set differences are taken between compatible relations. Therefore, for a set-difference operation r - s to be valid, we require that the relations r and s be of the same arity, and that the domains of the ith attribute of r and the ith attribute of s be the same.
The Assignment Operation
It is convenient at times to write a relational-algebra expression by assigning parts of it to temporary relation variables. The assignment operation, denoted by ←, works like assignment in a programming language.
temp1 ← σ_{amount > 1200000}(loan)
temp2 ← Π_{loan-number, amount}(loan)
result = temp1 temp2
The evaluation of an assignment does not result in any relation being displayed to the user. Rather, the result of the expression to the right of the ←
is assigned to the relation variable on the left of the ←. This relation variable may be used in subsequent expressions. With the assignment operation, a query can be written as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as the result of the query. For relational-algebra queries, assignment must always be made to a temporary relation variable. Note that the assignment operation does not provide any additional power to the algebra. It is, however, a convenient way to express complex queries.
POINTS TO PONDER
The relational model was proposed by E. F. Codd in 1970. It provides specifications of an abstract database management system.
It consists of the following three components:
1. Data Structure: a collection of data structure types for building the database.
2. Data Manipulation: a collection of operators that may be used to retrieve, derive or modify data stored in the data structures.
3. Data Integrity: a collection of rules that implicitly or explicitly define a consistent database state or changes of state.
Relational algebra describes a set of algebraic operations that operate on tables and output a table as a result.
REVIEW TERMS
Table/Relation
Tuple
Domain
Database schema
Database instance
Keys
Primary key
Foreign key
Relational algebra
STUDENT ACTIVITY
1) Why do we use RDBMS?
2) Define relation, tuple, domain and keys.
3) What is the difference between Intersection, Union and Cartesian product?
RESEARCH THE FOLLOWING TERMS AND ANSWER THE QUESTIONS THAT FOLLOW:
Aggregate functions
Joins
Natural join
Outer join
Right outer join
Left outer join
Rename operation
1) Define aggregate functions, with an example.
2) Define joins. What is a natural join?
3) Differentiate between inner join and outer join.
4) Differentiate between left outer join and right outer join with the help of an example.
5) Define the rename operator.
Chapter 7
ENTITY RELATIONSHIP MODEL
Learning objectives: After reading and studying this chapter you should be able to:
Understand entity
Understand relationship
Understand attribute, domain, entity set
Understand simple and composite attributes
Understand derived attribute
Understand relationship set
Know the components of E-R diagrams
Design E-R diagrams
The E-R model considers the real world to consist of entities and relationships among them. An ENTITY is a 'thing' which can be distinctly identified, for example a person, a car, a subroutine, a wire, an event. A RELATIONSHIP is an association among entities, e.g. person OWNS car is an association between a person and a car, and person EATS dish IN place is an association among a person, a dish and a place.
Attribute, Value, Domain, Entity Set
The information about one entity is expressed by a set of (attribute, value) pairs, e.g. a car model could be:
Name = R1222
Power = 7.3
Nseats = 5
Values of attributes belong to different value sets or domains; for example, for a car, Nseats is an integer between 1 and 12.
Entities defined by the same set of attributes can be grouped into an ENTITY SET (abbreviated as ESet), as shown below:
ESET: CarModel
Name    Power   Nseats
R1222   7.3     5
HZ893   6.8     5
R1293   5.4     4
AN ENTITY SET
A given set of attributes may be referred to as an entity type. All entities in a given ESet are of the same type, but sometimes there can be more than one set of the same type. The set of all persons who are customers at a given bank can be defined as the entity set customer. The individual entities that constitute a set are said to be the extension of the entity set. So all the individual bank customers are the extension of the entity set customer. Each entity has a value for each of its attributes. For each attribute, there is a set of permitted values called its domain or value set.
SIMPLE & COMPOSITE ATTRIBUTES
A simple attribute cannot be divided into sub-parts, whereas a composite attribute can. For example, an attribute name can be divided into first name, middle name and last name.
SINGLE & MULTIVALUED ATTRIBUTES
An attribute which has only one value for a given entity is known as a single-valued attribute. For example, the loan_no attribute of a loan will have only one value. There may be cases when an attribute has a set of values for a specific entity. For example, an attribute phone_no may have zero, one or several values for one customer. This is known as a multivalued attribute.
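The attribute kinds described above can be pictured as fields of a record. The following Python sketch is only one possible way to model them; the class names, the field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    """Composite attribute: it can be divided into sub-parts."""
    first_name: str
    middle_name: str
    last_name: str


@dataclass
class Customer:
    """A hypothetical customer entity, used only to show the attribute kinds."""
    customer_no: str                 # simple, single-valued attribute
    name: Name                       # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued: zero, one or several


# Sample entities with placeholder values
first = Customer("C-1", Name("Esther", "A.", "Maidawa"), ["phone-1", "phone-2"])
second = Customer("C-2", Name("Panam", "B.", "Bitrus"))  # no phone number at all

print(first.name.last_name, len(first.phone_nos), len(second.phone_nos))
```

A list is used for phone_nos because a multivalued attribute may legitimately be empty, while the composite attribute is a nested record whose parts can be read individually.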
DERIVED ATTRIBUTE
The value of a derived attribute is derived from the values of other related attributes or entities. For example, an attribute age can be calculated from another attribute date_of_birth.
Relationship, Relationship Set
A relationship is an association among several entities. For example, a customer A is associated with loan_no L1. A relationship set is a subset of the Cartesian product of entity sets. For example, a relationship set (abbreviated as RSet) on the relationship 'Person HAS_EATEN Dish IN Place' could be as shown below:
RSet 'Person HAS_EATEN Dish IN Place'
Person   Dish      Place
Steven   Amala     Ibadan
Bello    Dokunu    Accra
Aishat   Noodles   Naples
Mike     Tuwo      Kano
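Two of the ideas above lend themselves to a short Python sketch: a derived attribute computed from a stored one, and a relationship set checked against the Cartesian product of its entity sets. The function name age and the dates used with it are hypothetical; the persons, dishes and places follow the RSet shown above.

```python
from datetime import date
from itertools import product


def age(date_of_birth, today):
    """Derived attribute: whole years elapsed between date_of_birth and today."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


# Entity sets
persons = {"Steven", "Bello", "Aishat", "Mike"}
dishes = {"Amala", "Dokunu", "Noodles", "Tuwo"}
places = {"Ibadan", "Accra", "Naples", "Kano"}

# Relationship set 'Person HAS_EATEN Dish IN Place'
has_eaten = {
    ("Steven", "Amala", "Ibadan"),
    ("Bello", "Dokunu", "Accra"),
    ("Aishat", "Noodles", "Naples"),
    ("Mike", "Tuwo", "Kano"),
}

# Every possible (person, dish, place) combination
all_combinations = set(product(persons, dishes, places))

print(age(date(1990, 3, 10), date(2010, 3, 9)))
print(len(all_combinations), has_eaten <= all_combinations)
```

The Cartesian product holds all 64 conceivable combinations, while the relationship set records only the 4 that actually occurred, which is what "subset of the Cartesian product" means here.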
Notice that an RSet is an ESet having ESets as attributes.
Components of ER-D
The overall logical structure of a database can be expressed graphically by an E-R diagram. The various components of an E-R diagram are as follows:
1. Rectangles, which represent entity sets.
2. Ellipses, which represent attributes.
3. Diamonds, which represent relationship sets.
4. Lines, which link attributes to entity sets and entity sets to relationship sets.
5. Double ellipses, which represent multivalued attributes.
6. Dashed ellipses, which represent derived attributes.
7. Double lines, which indicate total participation of an entity in a relationship set.
ENTITY RELATIONSHIP DIAGRAM NOTATIONS
Peter Chen developed ERDs in 1976. Since then Charles Bachman and James Martin have added some slight refinements to the basic ERD principles.
Entity - An entity is an object or concept about which you want to store information.
Weak entity - A weak entity is dependent on another entity to exist.
Attributes - Attributes are the properties or characteristics of an entity.
Key attribute - A key attribute is the unique, distinguishing characteristic of the entity. For example, an employee's social security number might be the employee's key attribute.
Multivalued attribute - A multivalued attribute can have more than one value. For example, an employee entity can have multiple skill values.
Derived attribute - A derived attribute is based on another attribute. For example, an employee's monthly salary is based on the employee's annual salary.
Relationships - Relationships illustrate how two entities share information in the database structure.
Weak relationship - To connect a weak entity with others, you should use a weak relationship notation.
Cardinality - Cardinality specifies how many instances of an entity relate to one instance of another entity. Ordinality is closely linked to cardinality. While cardinality specifies the occurrences of a relationship, ordinality describes the relationship as either mandatory or optional. In other words, cardinality specifies the maximum number of relationships and ordinality specifies the absolute minimum number of relationships.
Recursive relationship - In some cases, entities can be self-linked. For example, employees can supervise other employees.
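The difference between cardinality (maximum) and ordinality (minimum) can be made concrete by counting, for each entity, how many relationship instances it takes part in. The sketch below uses the recursive relationship "employee supervises employee"; the supervision pairs and the function name participation are hypothetical.

```python
from collections import Counter

employees = ["John Bello", "Esther Danmallam", "Tunde Gana"]

# Hypothetical instances of the recursive relationship, as (supervisor, subordinate).
supervises = {
    ("John Bello", "Esther Danmallam"),
    ("John Bello", "Tunde Gana"),
}


def participation(entities, pairs, position):
    """Return (minimum, maximum) number of pairs each entity appears in
    at the given position (0 = supervisor role, 1 = subordinate role)."""
    counts = Counter(pair[position] for pair in pairs)
    per_entity = [counts[e] for e in entities]
    return min(per_entity), max(per_entity)


# How many subordinates does one employee have?
print(participation(employees, supervises, 0))

# How many supervisors does one employee have?
print(participation(employees, supervises, 1))
```

A minimum of 0 corresponds to an optional relationship and a minimum of 1 or more to a mandatory one. Counts taken from sample data only describe what has been observed; in a real design the cardinality and ordinality are rules decided by the designer and then enforced on the data.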
Mapping Cardinalities
A mapping cardinality expresses the number of entities to which another entity can be associated via a relationship set. It can be of the following types:
1. One to one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.
2. One to many: An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A.
3. Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A.
4. Many to many: An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.
[Figures: One to many relationship; Many to one relationship; One to one relationship]
POINTS TO PONDER
An ENTITY is a 'thing' which can be distinctly identified, for example a person, a car, a subroutine, a wire, an event.
A RELATIONSHIP is an association among entities, e.g. person OWNS car.
A given set of attributes may be referred to as an entity type.
A simple attribute cannot be divided into sub-parts, whereas a composite attribute can.
An attribute whose value is derived from the values of other related attributes or entities is known as a derived attribute.
A relationship is an association among several entities.
A relationship set is a subset of the Cartesian product of entity sets.
REVIEW TERMS
Entity
Entity set
Attribute
Domain
Value
Relationship
Relationship set
Cardinality
Association
STUDENT ACTIVITY
1) Define entity, domain and value.
2) Define relationship and relationship set.
3) Differentiate between simple and composite attributes.
4) Define derived attribute.
5) Differentiate between single-valued and multivalued attributes.
6) Define cardinality. Explain the various kinds of cardinality.
7) Define the various components of an E-R diagram.