What are the Drawbacks of traditional file systems for large data storage?
A file processing system works well when there are only a limited number of files and very little data. As the data and files in
the system grow, handling them becomes difficult.
Data Mapping and Access: - Although related information is grouped and stored in different files, there
is no mapping between any two files; i.e., two dependent files are not linked. Even though the Student file and the
Student_Report file are related, they are two separate files and are not linked by any means. Hence, if we
need to display a student's details along with his report, we cannot pick them directly from those two files. We have to write
a lengthy program that first searches the Student file, gets all the details, and then goes to the Student_Report file and searches for his report.
When there is a very large amount of data, searching for particular information in the file system is always a
time-consuming and inefficient task.
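The "lengthy program" above can be sketched as follows. This is a minimal illustration, assuming two hypothetical comma-separated files (Student with roll number and name, Student_Report with roll number and grade); the file layout and field names are invented for the example.

```python
import csv
import os
import tempfile

# Hypothetical flat files: Student (roll_no, name) and Student_Report (roll_no, grade).
# With no links between the files, combining them needs a hand-written two-step search.
students = [("1", "Asha"), ("2", "Ravi")]
reports = [("1", "A"), ("2", "B")]

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "student.csv"), "w", newline="") as f:
    csv.writer(f).writerows(students)
with open(os.path.join(tmp, "student_report.csv"), "w", newline="") as f:
    csv.writer(f).writerows(reports)

def student_with_report(roll_no):
    # Step 1: scan the whole Student file for the roll number.
    with open(os.path.join(tmp, "student.csv"), newline="") as f:
        student = next((r for r in csv.reader(f) if r[0] == roll_no), None)
    # Step 2: scan the whole Student_Report file again for the same key.
    with open(os.path.join(tmp, "student_report.csv"), newline="") as f:
        report = next((r for r in csv.reader(f) if r[0] == roll_no), None)
    return (student[1], report[1]) if student and report else None

print(student_with_report("2"))  # ('Ravi', 'B')
```

Every lookup re-reads both files end to end; with large files this linear scan per request is exactly the inefficiency described above.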
Data Redundancy: - There is no method to prevent the insertion of duplicate data in a file system. Any user can
enter any data. The file system validates neither the kind of data being entered nor the prior existence of the
same data in the same file. Duplicate data in the system is undesirable: it wastes space and leads to confusion and
mishandling of data. When there are duplicates in the file and we need to update or delete a record, we might end up
updating or deleting only one of the copies, leaving the other in the file. Again, the file system does not check for
this, and the purpose of storing the data is lost.
Moreover, though the file name says Student file, there is nothing to stop staff information or report information
being entered into it. A file system allows any information to be entered into any file; it does not isolate the data
being entered from the group it belongs to.
Data Dependence: - In files, data is stored in a specific format, say tab-, comma- or semicolon-separated. If the format of
any file is changed, the programs that process that file must be changed too. But there may be many
programs depending on that file: we need to know in advance all the programs that use it and change every one of
them. Missing even one place will break the whole application. Similarly, changes in the storage structure, or in the
way the data is accessed, affect every place where the file is used. In other words, the smallest change in the file
affects all the programs and requires changes in all of them.
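The breakage described above can be demonstrated with a small sketch. The parser below is a hypothetical program hard-wired to a tab-separated Student file; the field names are invented for illustration.

```python
# Hypothetical report program hard-wired to a tab-separated Student file.
def parse_student_line(line, delimiter="\t"):
    roll_no, name, age = line.rstrip("\n").split(delimiter)
    return {"roll_no": roll_no, "name": name, "age": int(age)}

tab_line = "1\tAsha\t18"
print(parse_student_line(tab_line))  # works as long as the format stays tab-separated

# If the file's format changes to comma-separated, every program that
# hard-coded the tab delimiter breaks until it is found and edited.
csv_line = "1,Asha,18"
try:
    parse_student_line(csv_line)
except ValueError as e:
    print("parser broke:", e)
```

A DBMS removes this dependence by letting programs ask for logical columns instead of parsing a physical byte layout themselves.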
Data Inconsistency: - Imagine the Student and Student_Report files both contain a student's address, and there is a
change request for one particular student's address. The program searched only the Student file and updated the
address correctly there. Another program prints the student's report and mails it to the address mentioned in the
Student_Report file. What happens to the report of the student whose address was changed? There is a mismatch
between the actual address and the stored one, and his report is sent to his old address. This mismatch between
different copies of the same data is called data inconsistency. It occurred here because there is no proper record of
which files hold copies of the same data.
Data Isolation: - Imagine we have to generate a single report for a student studying in a particular class: his
study report, his library book details, and his hostel information. All this information is stored in different files.
How do we get all these details into one report? We have to write a program. But before writing it, the
programmer must find out which files hold the needed information, what the format of each file is, how to
search each file, and so on. Only after all this analysis can the program be written. If only 2-3 files are involved,
the programming is fairly simple; if many files are involved, it requires a great deal of effort from the programmer.
Since all the data is isolated in different files, programming becomes difficult.
Security: - Each file can be password protected. But what if we have to give access to only a few records in the file? For
example, a user should be able to view only his own bank account information in the file. This is very difficult in
a file system.
Integrity: - If we need to enforce certain insertion criteria while entering data into a file, it is not possible
directly; we can only do it by writing programs. Say we have to restrict students to those above age 18: that can be
done by a program alone. There is no built-in checking facility in the file system, so these kinds of integrity checks
are not easy in a file system.
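For contrast, a DBMS can state such a rule once, declaratively, instead of every file-handling program re-implementing it. The sketch below uses SQLite through Python's sqlite3 module as a stand-in DBMS; the table and column names are illustrative, and the rule is read as "age 18 or above".

```python
import sqlite3

# Declarative integrity: the age rule lives in the schema, not in application code.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "  roll_no INTEGER PRIMARY KEY,"
    "  name TEXT,"
    "  age INTEGER CHECK (age >= 18)"  # the insertion criterion, enforced by the DBMS
    ")"
)
conn.execute("INSERT INTO Student VALUES (1, 'Asha', 19)")  # accepted
try:
    conn.execute("INSERT INTO Student VALUES (2, 'Ravi', 16)")  # rejected by the CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Whichever program inserts the row, the constraint is applied; a file system offers no equivalent.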
Atomicity: - If an insert, update or delete fails partway in a file system, there is no mechanism to switch
back to the previous state. Imagine marks for one particular subject need to be entered into the Report file and then
the total needs to be recalculated. Suppose the totaling of marks is done, but the file is closed without saving the new
marks: the required transaction is only partly performed, the new marks are missing, and the total calculated is
wrong. Atomicity means a transaction either completes entirely or does not happen at all; partial completion of any
transaction leads to incorrect data in the system. A file system does not guarantee atomicity. It may be achievable
with complex programs, but writing such a program for each transaction costs money.
Concurrent Access: - Accessing the same data from the same file at the same time is called concurrent access. In a file
system, concurrent access can lead to incorrect data. For example, a student wants to borrow a book from the library. He
searches the library file and sees that only one copy is available. At the same time, another student wants to borrow
the same book and also sees that one copy is available. The first student opts to borrow and gets the book, but before
the count is updated to zero in the file, the second student also opts to borrow, even though no copies are actually
left. This is the problem of concurrent access in a file system.
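The library race above is a classic check-then-act lost update. The sketch below simulates the interleaving deterministically (no real threads), with an invented in-memory "file" holding the copy count.

```python
# Deterministic simulation of the library race: two students read the copy
# count before either writes the decremented value back.
copies = {"book": 1}  # the "library file": one copy available

def read_copies():
    return copies["book"]

def write_copies(value):
    copies["book"] = value

seen_by_s1 = read_copies()    # student 1 checks: sees 1 copy
seen_by_s2 = read_copies()    # student 2 checks at the same time: also sees 1 copy
write_copies(seen_by_s1 - 1)  # student 1 borrows, writes 0
write_copies(seen_by_s2 - 1)  # student 2 "borrows" the same last copy, also writes 0
print(copies["book"])  # 0, yet two books were handed out: one update was lost
```

A DBMS prevents this by isolating the two transactions, so the second borrower sees either the pre-borrow or post-borrow state, never a stale count it can act on.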
A database is full of data and records. What we see as rows and columns is quite different from what reaches the memory. What
we see is the actual data, but when it is stored in memory such as disks or tapes, it is stored in the form of bits, which
no user can interpret directly; he needs to see the actual data to understand it. At the same time, not all the details about
how the data is stored are relevant to a user: he needs only the little information he is interested in or wants to work
with. Hiding the unwanted details from users happens at different levels in the database; this masking of data is called
data abstraction. There are 4 levels of data abstraction.
External level - This is the highest level of data abstraction. At this level users see the data in the form of rows and
columns; it presents the data to the users in terms of tables and relations. Users view full or
partial data based on the business requirement, and different users have different views based on their
access rights. For example, a student will not have access to see lecturers' salary details, and one employee will not have
access to another employee's details unless he is a manager. At this level, one can access data from the database
and perform calculations on it: for example, calculate the tax from an employee's salary,
calculate the CGPA of a student, or calculate a person's age from his date of birth. These users can be real users or
programs.
Any changes or computations done at this level do not affect the other levels of data. That means, if we retrieve a few
columns of the STUDENT table, it does not change the whole table; if we calculate the CGPA of a student, it
does not change or update the table. This level is built on the levels below it, but it does not alter the data at
those levels.
Logical/Conceptual level - This is the next level of abstraction. It describes the actual data stored in the database
in the form of tables and relates them by means of mappings. This level has no information about what a
user views at the external level; it holds all the data in the database.
Changes done at this level do not affect the external or physical levels of data. That is, changes to a
table structure or a relation modify neither the data that the user is viewing in the external view nor the storage at
the physical level. For example, suppose we add a new column 'Skills': this does not modify an external
view in which the user was viewing the ages of the students. Similarly, space is allocated for 'Skills' in
physical memory, but the space or address of Date of Birth (from which Age is derived) is not modified
in memory. Hence independence between the external and physical levels is achieved.
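This logical data independence can be sketched concretely. The example below uses SQLite via Python's sqlite3 module as a stand-in DBMS, with invented table and column names: an external-level view keeps working unchanged after a new column is added at the logical level.

```python
import sqlite3

# Sketch of logical data independence: an external-level view survives
# a logical-level schema change untouched.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (roll_no INTEGER, name TEXT, dob TEXT)")
conn.execute("INSERT INTO Student VALUES (1, 'Asha', '2005-01-01')")

# External level: a view exposing only the columns this user may see.
conn.execute("CREATE VIEW StudentNames AS SELECT roll_no, name FROM Student")

# Logical-level change: add a 'skills' column; the view definition is untouched.
conn.execute("ALTER TABLE Student ADD COLUMN skills TEXT")

print(conn.execute("SELECT * FROM StudentNames").fetchall())  # [(1, 'Asha')]
```

The user of StudentNames never learns that the table grew a column, which is exactly the independence the text describes.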
Internal level - This is an intermediary level. In most cases this level is not mentioned separately, and it is usually
said that there are 3 levels of data abstraction. This level depends on the DBMS software: it is how the
database is seen from the DBMS. It can even be combined with the logical level.
Physical level - This is the lowest level of data abstraction. It describes how the data is actually stored in
physical memory such as magnetic tapes and hard disks. At this level, file organization methods like hashing,
sequential files and B+ trees come into the picture. Here the developer knows the requirements, sizes and access
frequencies of the records clearly, so designing this level is not very complex for him.
Instances in DBMS
In simple words, an instance is the snapshot of the database taken at a particular moment. It can also be described
more precisely as the collection of information stored in the database at that particular moment.
An instance can also be called the database state, or the current set of occurrences, since it is the information
present in the current state.
Every time we update the state, say we insert, delete or modify the value of a data item in a record, the database
changes from one state to another. At any given time, each schema has its own set of instances.
Let's take an example to understand this better.
An organization with an employees database may have three different instances: production, which is
used to monitor the data at that moment; pre-production, which is used to test new functionality prior to its
release to production; and development, which is used by database developers to create new functionality.
Schema in DBMS
A schema is the overall description, or overall design, of the database, specified during database design.
An important point to remember is that it should not be changed frequently. Basically, it shows the
record types (entities) and the names of data items (attributes), but not the relations among the files.
Interestingly, the values in a schema may change, but not the structure of the schema.
To understand it well, a schema can be thought of as a framework into which the values of data items are
fitted: these values can change, but not the frame/format of the schema.
Consider the below two examples of schemas, for stores and for discounts.
STORES
The former example shows the schema for stores, displaying the name of the store, the store id, the address, city and
state in which it is located, and the zip code of the respective location.
The latter example is the schema for discounts, which clearly shows the type, id and quality. Thus we can
see that a schema displays only the record types (entities) and the names of data items (attributes) but does not show
the relations among the files.
A schema can be partitioned into a logical schema and a physical schema.
Look at the below diagram.
Here, the former part shows the logical schema, which is concerned with the data structures offered to the
DBMS, so that the schema is easy for the computer to understand.
The latter part, the physical schema, is concerned with the manner in which the conceptual
database is represented in the computer as it is stored in the database. The physical schema is hidden behind the
logical schema and can thus be modified without affecting the application programs.
A database management system provides a data definition language (DDL) and a data storage definition
language (DSDL) to specify the logical and physical schemas respectively.
To summarize quickly: the information/data in the database at a particular moment is known as the
instance; the arrangement of the data as it appears in the database is defined by the schema; and the logical
view of the data as it appears to an application can be called the subschema.
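The schema/instance distinction above can be sketched in code. The example uses SQLite through Python's sqlite3 module, with an invented Employee table: inserting rows changes the instance from state to state, while the schema stays fixed.

```python
import sqlite3

# The schema (structure) stays fixed while instances (the data at a moment) change.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_id INTEGER, name TEXT)")

def schema():
    # Column names and types: the "framework" into which values are fitted.
    return [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(Employee)")]

before = schema()
conn.execute("INSERT INTO Employee VALUES (1, 'ABC')")  # instance changes to a new state
conn.execute("INSERT INTO Employee VALUES (2, 'PQR')")  # and changes again
assert schema() == before  # the schema is unchanged by the inserts
print(schema())  # [('emp_id', 'INTEGER'), ('name', 'TEXT')]
```

Each INSERT produces a new database state (a new instance), yet `schema()` returns the same framework throughout.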
Rules
Rule 0: The foundation rule:
For any system that is advertised as, or claimed to be, a relational data base management system, that system must be
able to manage data bases entirely through its relational capabilities.
Rule 1: The information rule:
All information in a relational data base is represented explicitly at the logical level and in exactly one way – by values
in tables.
Rule 2: The guaranteed access rule:
Each and every datum (atomic value) in a relational data base is guaranteed to be logically accessible by resorting to a
combination of table name, primary key value and column name.
Rule 3: Systematic treatment of null values:
Null values (distinct from the empty character string or a string of blank characters and distinct from zero or any other
number) are supported in fully relational DBMS for representing missing information and inapplicable information in a
systematic way, independent of data type.
Rule 4: Dynamic online catalog based on the relational model:
The data base description is represented at the logical level in the same way as ordinary data, so that authorized users
can apply the same relational language to its interrogation as they apply to the regular data.
Rule 5: The comprehensive data sublanguage rule:
A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-
blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined
syntax, as character strings and that is comprehensive in supporting all of the following items:
1. Data definition.
2. View definition.
3. Data manipulation (interactive and by program).
4. Integrity constraints.
5. Authorization.
6. Transaction boundaries (begin, commit and rollback).
Rule 6: The view updating rule:
All views that are theoretically updatable are also updatable by the system.
Rule 7: High-level insert, update, and delete:
The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of
data but also to the insertion, update and deletion of data.
Rule 8: Physical data independence:
Application programs and terminal activities remain logically unimpaired whenever any changes are made in either
storage representations or access methods.
Rule 9: Logical data independence:
Application programs and terminal activities remain logically unimpaired when information-preserving changes of any
kind that theoretically permit unimpairment are made to the base tables.
Rule 10: Integrity independence:
Integrity constraints specific to a particular relational data base must be definable in the relational data sublanguage and
storable in the catalog, not in the application programs.
Rule 11: Distribution independence:
The end-user must not be able to see that the data is distributed over various locations. Users should always get the
impression that the data is located at one site only.
Rule 12: The non subversion rule:
If a relational system has a low-level (single-record-at-a-time) language, that low level cannot be used to subvert or
bypass the integrity rules and constraints expressed in the higher level relational language (multiple-records-at-a-time).
Atomicity
By this, we mean that either the entire transaction takes place at once or it does not happen at all. There is no midway, i.e., transactions
do not occur partially. Each transaction is considered as one unit and either runs to completion or is not executed at all. It involves
the two operations read and write.
Consider the following transaction T consisting of T1 and T2: a transfer of 100 from account X to account Y.
If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount
has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be
executed in its entirety in order to ensure the correctness of the database state.
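The X to Y transfer above can be sketched with a real transaction. The example uses SQLite via Python's sqlite3 module; account names and balances are invented. A simulated failure between the two updates is followed by a rollback, so the half-done T1 is undone.

```python
import sqlite3

# Atomicity sketch: if the transfer fails between T1 and T2, rollback
# restores the state before write(X), so no money disappears.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

try:
    conn.execute("UPDATE Account SET balance = balance - 100 WHERE name = 'X'")  # T1
    raise RuntimeError("failure between T1 and T2")  # simulated crash
    conn.execute("UPDATE Account SET balance = balance + 100 WHERE name = 'Y'")  # T2
    conn.commit()  # only reached if the whole transaction succeeded
except RuntimeError:
    conn.rollback()  # undo T1: the transaction is all or nothing

print(dict(conn.execute("SELECT name, balance FROM Account")))  # {'X': 500, 'Y': 200}
```

Had we committed after T1 alone, 100 would have vanished from X; the rollback is what makes the pair of writes atomic.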
Consistency
This means that integrity constraints must be maintained, so that the database is consistent before and after the transaction. It refers
to the correctness of the database. Referring to the example above,
the total amount before and after the transaction must be the same.
The database is then consistent. Inconsistency occurs if T1 completes but T2 fails; as a result, T is incomplete.
Isolation
This property ensures that multiple transactions can occur concurrently without leading to an inconsistent database state.
Transactions occur independently, without interference. Changes made by a particular transaction are not visible to any
other transaction until that change has been written to memory or committed. This property ensures
that executing transactions concurrently results in a state equivalent to one obtained had they been executed
serially in some order.
Suppose T has executed up to Read(Y) and then T'' starts. As a result, interleaving of operations takes place, due to which
T'' reads the correct value of X but an incorrect value of Y, and the sum computed by
T'' is (X + Y = 50,000 + 500 = 50,500).
This results in database inconsistency, due to an apparent loss of 50 units. Hence, transactions must take place in isolation, and changes
should be visible only after they have been made to main memory.
Durability:
This property ensures that once a transaction has completed execution, its updates and modifications to the database are
written to disk and persist even if a system failure occurs. These updates become permanent, stored in
non-volatile memory, so the effects of the transaction are never lost.
The ACID properties, together, provide a mechanism to ensure the correctness and consistency of a database: each
transaction is a group of operations that acts as a single unit, produces consistent results, acts in isolation from other operations, and
makes updates that are durably stored.
This article is contributed by Avneet Kaur.
What are different types of Data Models available for expressing data in DBMS?
● CREATE
● ALTER
● DROP
A Pupil table with an ID and a name is created by a CREATE TABLE DDL statement.
Generally, the data types used while creating a table include strings and dates. Every system varies
in how the data type is specified.
An existing database object can be modified by the ALTER statement. Using this command, users can
add additional columns and drop existing columns. Additionally, the data type of a column in
a database table can be changed by the ALTER command.
The general syntax of the ALTER command is mentioned below:
ALTER TABLE table_name ADD column_name (for adding a new column)
ALTER TABLE table_name MODIFY column_name data type (for modifying a column)
For Example
Add column to the pupil table
ALTER TABLE PUPIL ADD PHONE_NUMBER VARCHAR(10)
By using the TRUNCATE command, users can remove the contents of a table while the structure of the table is
kept. In simple language, it removes all the records from the table but preserves the table structure. Data
cannot be removed partially through this command. In addition, all the space allocated for the data is released by the
TRUNCATE command.
The syntax of the Truncate command is mentioned below:
TRUNCATE TABLE table_name;
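The DDL commands above can be tried end to end in a small sketch. It uses SQLite through Python's sqlite3 module; note that SQLite has no TRUNCATE keyword, so an unqualified DELETE plays its role here, and the Pupil table and its columns are illustrative.

```python
import sqlite3

# CREATE, ALTER, and a TRUNCATE-equivalent, on the illustrative Pupil table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pupil (p_id INTEGER, name TEXT)")            # CREATE
conn.execute("ALTER TABLE Pupil ADD COLUMN phone_number VARCHAR(10)")   # ALTER: add a column
conn.executemany("INSERT INTO Pupil VALUES (?, ?, ?)",
                 [(97, "Albert", "555-0101"), (98, "Sameer", "555-0102")])

conn.execute("DELETE FROM Pupil")  # TRUNCATE-like: all rows go, structure stays

cols = [col[1] for col in conn.execute("PRAGMA table_info(Pupil)")]
print(cols)  # the three columns survive
print(conn.execute("SELECT COUNT(*) FROM Pupil").fetchone()[0])  # 0 rows left
```

After the unqualified DELETE, the table is empty but its structure (including the ALTER-added column) remains, which is exactly the TRUNCATE behaviour described above.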
1. Procedural Programming
In this type, the user specifies what data is required and how to get it.
2. Declarative Programming
Here, the user specifies only what data is required.
Commands
The DML section of SQL consists of the following set of commands:
● From
This clause takes a relation name as an argument, from which attributes are to be projected or selected.
● Where
This clause defines the predicates or conditions that attributes must satisfy in order to be
projected.
For Example
Select author_name
From book_set
Where age > 40
This command yields, from the relation book_set, the names of the authors whose age is greater
than 40.
For Example
Consider a table Pupil with the following fields.
Insert into Pupil Values (78, 'Nicole', 8)
The above command will insert a record into the Pupil table.
Delete from Pupil where P_id = 80
The above command will delete the record where P_id is 80 from the Pupil table.
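The INSERT and DELETE examples above can be run as a whole against an illustrative Pupil table. The sketch uses SQLite via Python's sqlite3 module; the column names and sample rows are invented.

```python
import sqlite3

# DML sketch: insert two pupils, delete one by key, select what remains.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pupil (p_id INTEGER, name TEXT, class INTEGER)")
conn.executemany("INSERT INTO Pupil VALUES (?, ?, ?)",
                 [(78, "Nicole", 8), (80, "Dev", 9)])

conn.execute("DELETE FROM Pupil WHERE p_id = 80")  # the Delete example above

print(conn.execute("SELECT name FROM Pupil").fetchall())  # [('Nicole',)]
```

Only the row with p_id 80 is removed; the Select then projects the surviving record, mirroring the From/Where clauses described earlier.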
Explain significance of Database Design. Discuss Logical and Physical types of Database Design.
Database Approach
In order to remove the limitations of the file-based approach, a new, more effective approach was required,
known as the database approach.
The database is a shared collection of logically related data, designed to meet the information needs of an
organization. A database is a computer-based record-keeping system whose overall purpose is to record and
maintain information. The database is a single, large repository of data, which can be used simultaneously
by many departments and users. Instead of disconnected files with redundant data, all data items are
integrated with a minimum amount of duplication.
The database is no longer owned by one department but is a shared corporate resource. The database holds
not only the organization's operational data but also a description of this data. For this reason, a database is
also defined as a self-describing collection of integrated records. The description of the data is known as the
Data Dictionary or Meta Data (the 'data about data'). It is the self-describing nature of a database that
provides program-data independence.
A database implies separation of physical storage from use of the data by an application program to achieve
program/data independence. Using a database system, the user or programmer or application specialist need
not know the details of how the data are stored and such details are "transparent to the user". Changes (or
updating) can be made to data without affecting other components of the system. These changes include, for
example, change of data format or file structure or relocation from one device to another.
In the DBMS approach, an application program written in a programming language such as Java, Visual
Basic.NET or Developer 2000 uses database connectivity to access the database stored on disk, with
the help of the operating system's file management system.
Comment on the Entity Relationship Model with the help of a suitable example. Explain Entity, Relationship, its
attributes, Roles, Cardinality constraints, Total and Partial Participation, and weak entity sets with good
examples.
An Entity may be an object with a physical existence – a particular person, car, house, or employee – or it may be an object with a
conceptual existence – a company, a job, or a university course.
An entity is an object of an entity type, and the set of all entities is called an entity set. E.g., E1 is an entity of entity type Student,
and the set of all students is called the entity set. In an ER diagram, an entity type is represented by a rectangle:
Attribute(s):
Attributes are the properties which define the entity type. For example, Roll_No, Name, DOB, Age, Address and Mobile_No are
the attributes which define the entity type Student. In an ER diagram, an attribute is represented by an oval.
Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key attribute. For example, Roll_No
will be unique for each student. In an ER diagram, a key attribute is represented by an oval with an underline.
Composite Attribute –
An attribute composed of several other attributes is called a composite attribute. For example, the Address attribute of
the Student entity type consists of Street, City, State and Country. In an ER diagram, a composite attribute is represented by an
oval comprising other ovals.
Multivalued Attribute –
An attribute that can hold more than one value for a given entity. For example, Phone_No (a student can have more than
one). In an ER diagram, a multivalued attribute is represented by a double oval.
Derived Attribute –
An attribute which can be derived from other attributes of the entity type is known as a derived attribute; e.g.,
Age (can be derived from DOB). In an ER diagram, a derived attribute is represented by a dashed oval.
The complete entity type Student with its attributes can be represented as:
Relationship Type and Relationship Set:
A relationship type represents an association between entity types. For example, 'Enrolled in' is a relationship type that exists
between the entity types Student and Course. In an ER diagram, a relationship type is represented by a diamond connected to the
entities with lines.
A set of relationships of the same type is known as a relationship set. The following relationship set depicts that S1 is enrolled in C2,
S2 is enrolled in C1 and S3 is enrolled in C3.
Binary Relationship –
When TWO entity sets participate in a relationship, it is called a binary relationship. For
example, Student is enrolled in Course.
n-ary Relationship –
When n entity sets participate in a relationship, it is called an n-ary relationship.
Cardinality:
The number of times an entity of an entity set participates in a relationship set is known as its cardinality. Cardinality can be of
different types:
One to one – When each entity in each entity set can take part only once in the relationship, the cardinality is one to one. Let us
assume that a male can marry only one female and a female can marry only one male; the relationship is then one to one.
Many to one – When entities in one entity set can take part only once in the relationship set while entities in the other entity set
can take part more than once, the cardinality is many to one. Let us assume that a student can take only one
course but one course can be taken by many students; the cardinality is then n to 1. It means that for one course there can be n
students, but for one student there is only one course.
Many to many – When entities in both entity sets can take part more than once in the relationship set, the cardinality is many to
many. In this example, student S1 is enrolled in C1 and C3, and course C3 is enrolled in by S1, S3 and S4, so it is a many-to-many
relationship.
Participation Constraint:
Total Participation – Each entity in the entity set must participate in the relationship. If every student must enroll in a
course, the participation of Student is total. Total participation is shown by a double line in the ER diagram.
Partial Participation – An entity in the entity set may or may not participate in the relationship. If some courses
are not enrolled in by any student, the participation of Course is partial.
The diagram depicts the 'Enrolled in' relationship set with the Student entity set having total participation and the Course
entity set having partial participation:
every student in the Student entity set participates in the relationship, but there exists a course C4 which does not take part
in it.
Weak Entity Type and Identifying Relationship:
As discussed before, an entity type has a key attribute which uniquely identifies each entity in the entity set. But there exist
entity types for which a key attribute cannot be defined; these are called weak entity types.
For example, a company may store information about the dependants (parents, children, spouse) of an employee. But the
dependants have no existence without the employee, so Dependant is a weak entity type and Employee is the identifying
entity type for Dependant.
A weak entity type is represented by a double rectangle. The participation of a weak entity type is always total. The relationship
between a weak entity type and its identifying strong entity type is called an identifying relationship, and it is represented by a
double diamond.
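When such a design is mapped to tables, the weak entity's primary key combines the identifying entity's key with its own partial key. The sketch below uses SQLite via Python's sqlite3 module; the Employee/Dependant tables and their columns follow the example above but the exact names are illustrative.

```python
import sqlite3

# Weak entity sketch: Dependant has no key of its own, so its primary key is
# (emp_id, dep_name): the identifying Employee's key plus the partial key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Dependant (
    emp_id   INTEGER REFERENCES Employee(emp_id),  -- identifying relationship
    dep_name TEXT,                                 -- partial key
    relation TEXT,
    PRIMARY KEY (emp_id, dep_name)
);
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO Dependant VALUES (1, 'Ravi', 'Spouse')")
try:
    # The same (emp_id, dep_name) pair is rejected: within one employee,
    # the partial key must be unique.
    conn.execute("INSERT INTO Dependant VALUES (1, 'Ravi', 'Child')")
except sqlite3.IntegrityError as e:
    print("duplicate weak entity rejected:", e)
```

The composite primary key is what makes Dependant identifiable only through its owning Employee, mirroring the double rectangle and double diamond in the diagram.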
Types of Normalization
Following are the types of Normalization:
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. Fourth Normal Form
5. Fifth Normal Form
6. BCNF (Boyce – Codd Normal Form)
7. DKNF (Domain Key Normal Form)
1. First Normal Form (1NF)
● A table is in 1NF if every attribute contains only atomic values; a single cell cannot hold multiple values.
Example: Employee Table
ECode Employee_Name Department
1 ABC Sales
1 ABC Production
3 XYZ Marketing
2. Second Normal Form (2NF)
● The main rule of 2NF is: 'No non-prime attribute is dependent on a proper subset of any candidate key
of the table.'
Example: Employee Table
ECode Employee_Name Employee_Age
1 ABC 38
1 ABC 38
2 PQR 38
3 XYZ 40
3 XYZ 40
● Candidate Key: ECode, Employee_Name
● The above table is in 1NF: each attribute has atomic values. However, it is not in 2NF, because the non-prime
attribute Employee_Age is dependent on ECode alone, which is a proper subset of the candidate key.
This violates the 2NF rule that 'no non-prime attribute is dependent on a proper subset
of any candidate key of the table'. The table is therefore split into two:
● Employee1 Table
ECode Employee_Age
1 38
2 38
3 40
● Employee2 Table
ECode Employee_Name
1 ABC
1 ABC
2 PQR
3 XYZ
3 XYZ
● Now, the above tables comply with the Second Normal Form (2NF).
3. Third Normal Form (3NF)
● A table in 3NF must be in 2NF and, additionally, must not have any transitive dependency of a non-prime
attribute on a candidate key.
● 3NF reduces the duplication of data and also achieves data integrity.
● In the above <Employee> table, EId is the primary key, but City and State depend upon Zip code.
● This dependency of City and State on Zip, which in turn depends on the key, is called a transitive dependency.
● Therefore we apply 3NF: we move City and State to a new <Employee_Table2> table,
with Zip as the primary key.
● <Employee_Table1> Table
EId Ename DOB Zip
● <Employee_Table2> Table
City State Zip
● The advantage of removing the transitive dependency is that it reduces the amount of data dependency and
achieves data integrity.
● In the above example, with 3NF there is no redundancy of data when inserting new records.
● The City, State and Zip code are stored in a separate table, and therefore updates become
easier because there is no data redundancy.
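The Zip-based decomposition above can be checked in code: after the split, a join reconstructs the full employee rows while each city/state pair is stored only once. The sketch uses SQLite via Python's sqlite3 module, with invented sample data.

```python
import sqlite3

# 3NF decomposition sketch: Employee_Table1 keys addresses by Zip;
# Employee_Table2 stores each (Zip, City, State) exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table1 (eid INTEGER PRIMARY KEY, ename TEXT, zip TEXT);
CREATE TABLE Employee_Table2 (zip TEXT PRIMARY KEY, city TEXT, state TEXT);
""")
conn.execute("INSERT INTO Employee_Table2 VALUES ('560001', 'Bengaluru', 'KA')")
conn.executemany("INSERT INTO Employee_Table1 VALUES (?, ?, ?)",
                 [(1, 'ABC', '560001'), (2, 'PQR', '560001')])

# A join on Zip rebuilds the original un-normalized view on demand.
rows = conn.execute("""
    SELECT e.eid, e.ename, t.city, t.state
    FROM Employee_Table1 e JOIN Employee_Table2 t ON e.zip = t.zip
    ORDER BY e.eid
""").fetchall()
print(rows)  # [(1, 'ABC', 'Bengaluru', 'KA'), (2, 'PQR', 'Bengaluru', 'KA')]
```

Two employees share one zip, yet the city and state appear once in storage; correcting a city name is now a single-row update.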
● BCNF, which stands for Boyce-Codd Normal Form, was developed by Raymond F. Boyce and E. F. Codd
in 1974.
● It deals with a certain type of anomaly which is not handled by 3NF.
● A table is in BCNF if it is in 3NF and, for every functional dependency A → B, the determinant A is a
superkey.
● A candidate key has the ability to become a primary key. It is a column (or set of columns) in a table.
● DeptName → DeptType
● Candidate Key:
● Empid
● DeptName
● The above table is not in BCNF, as neither Empid nor DeptName alone is a key.
● We can break the table into three tables to make it comply with BCNF.
● <Employee> Table
Empid EmpName
E001 ABC
E002 XYZ
● <Department> Table
DeptName DeptType
Production D001
Sales D002
● <Emp_Dept> Table
Empid DeptName
E001 Production
E002 Sales
● DeptName → DeptType
● Candidate Key:
● <Employee> Table : Empid
● Now the left-hand side of every functional dependency is a key, so the tables are in BCNF.
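The three-table BCNF decomposition above can be checked with a small SQLite sketch (same table names and rows as in the example):

```python
import sqlite3

# DeptName -> DeptType lives in its own Department table, so the left
# side of every remaining functional dependency is a key.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Employee   (Empid TEXT PRIMARY KEY, EmpName TEXT);
CREATE TABLE Department (DeptName TEXT PRIMARY KEY, DeptType TEXT);
CREATE TABLE Emp_Dept   (Empid TEXT, DeptName TEXT, PRIMARY KEY (Empid, DeptName));
INSERT INTO Employee VALUES ('E001', 'ABC'), ('E002', 'XYZ');
INSERT INTO Department VALUES ('Production', 'D001'), ('Sales', 'D002');
INSERT INTO Emp_Dept VALUES ('E001', 'Production'), ('E002', 'Sales');
""")
# Joining the three tables reconstructs the original rows losslessly.
rows = con.execute("""
    SELECT e.Empid, e.EmpName, d.DeptName, d.DeptType
    FROM Emp_Dept ed
    JOIN Employee e ON e.Empid = ed.Empid
    JOIN Department d ON d.DeptName = ed.DeptName
    ORDER BY e.Empid""").fetchall()
print(rows)
# [('E001', 'ABC', 'Production', 'D001'), ('E002', 'XYZ', 'Sales', 'D002')]
```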
● 4NF builds on the first three normal forms (1NF, 2NF and 3NF) and on BCNF.
● 4NF states that a table should not have more than one independent multi-valued dependency.
● In 5NF, if an attribute is multivalued, it must be taken out as a separate entity; a table in 5NF cannot be decomposed further without losing information.
DKNF stands for Domain-Key Normal Form; it requires that the database contain no constraints other than
domain constraints and key constraints.
It avoids general constraints in the database that are not clear domain or key constraints.
3NF, 4NF, 5NF and BCNF are special cases of DKNF.
It is achieved when every constraint on the relation is a logical consequence of the definition of its keys and domains.
Explain Specialization, Generalization and Aggregation Extended ER features with suitable examples.
DBMS | ER Model: Generalization, Specialization and Aggregation
Generalization –
Generalization is the process of extracting common properties from a set of entities and creating a
generalized entity from them. It is a bottom-up approach in which two or more entities can be
generalized to a higher-level entity if they have some attributes in common. For example,
STUDENT and FACULTY can be generalized to a higher-level entity called PERSON, as shown in
Figure 1. In this case, common attributes like P_NAME and P_ADD become part of the higher entity
(PERSON), and specialized attributes like S_FEE become part of the specialized entity (STUDENT).
Specialization –
In specialization, an entity is divided into sub-entities based on their characteristics. It is a top-down
approach in which a higher-level entity is specialized into two or more lower-level entities. For example,
the EMPLOYEE entity in an employee management system can be specialized into DEVELOPER,
TESTER, etc., as shown in Figure 2. In this case, common attributes like E_NAME and E_SAL
become part of the higher entity (EMPLOYEE), and specialized attributes like TES_TYPE become part
of the specialized entity (TESTER).
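One common way to map such a generalization/specialization hierarchy onto tables is to give the higher-level and the specialized table a shared key. The sketch below assumes a P_ID key and sample values that are not in the text:

```python
import sqlite3

# Shared attributes live in PERSON; specialized ones in STUDENT,
# linked back to PERSON through the same key.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE PERSON (P_ID INTEGER PRIMARY KEY, P_NAME TEXT, P_ADD TEXT)")
con.execute("""CREATE TABLE STUDENT (
    P_ID INTEGER PRIMARY KEY REFERENCES PERSON(P_ID),
    S_FEE REAL)""")
con.execute("INSERT INTO PERSON VALUES (1, 'Asha', 'Pune')")
con.execute("INSERT INTO STUDENT VALUES (1, 5000.0)")
# A student's full picture combines the general and the specialized rows.
row = con.execute("""SELECT p.P_NAME, s.S_FEE
                     FROM STUDENT s JOIN PERSON p ON p.P_ID = s.P_ID""").fetchone()
print(row)  # ('Asha', 5000.0)
```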
Aggregation –
An ER diagram is not capable of representing a relationship between an entity and another relationship,
which may be required in some scenarios. In such cases, the relationship and its participating
entities are aggregated into a higher-level entity. For example, an employee working on a project may
require some machinery, so a REQUIRE relationship is needed between the WORKS_FOR relationship
and the MACHINERY entity. Using aggregation, the WORKS_FOR relationship with its entities
EMPLOYEE and PROJECT is aggregated into a single entity, and the REQUIRE relationship is created
between the aggregated entity and MACHINERY.
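A possible relational mapping of this aggregation (table names follow the example; the keys and sample rows are assumed): WORKS_FOR becomes a table of (EmpId, ProjId) pairs, and REQUIRE references a whole WORKS_FOR row together with MACHINERY.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE EMPLOYEE  (EmpId  INTEGER PRIMARY KEY, EName TEXT);
CREATE TABLE PROJECT   (ProjId INTEGER PRIMARY KEY, PName TEXT);
CREATE TABLE MACHINERY (MachId INTEGER PRIMARY KEY, MName TEXT);
-- The aggregated WORKS_FOR relationship, treated as an entity in its own right.
CREATE TABLE WORKS_FOR (
    EmpId  INTEGER REFERENCES EMPLOYEE(EmpId),
    ProjId INTEGER REFERENCES PROJECT(ProjId),
    PRIMARY KEY (EmpId, ProjId));
-- REQUIRE relates that aggregated entity to MACHINERY.
CREATE TABLE REQUIRE (
    EmpId INTEGER, ProjId INTEGER,
    MachId INTEGER REFERENCES MACHINERY(MachId),
    FOREIGN KEY (EmpId, ProjId) REFERENCES WORKS_FOR(EmpId, ProjId));
INSERT INTO EMPLOYEE VALUES (1, 'Asha');
INSERT INTO PROJECT VALUES (10, 'Bridge');
INSERT INTO MACHINERY VALUES (100, 'Crane');
INSERT INTO WORKS_FOR VALUES (1, 10);
INSERT INTO REQUIRE VALUES (1, 10, 100);
""")
row = con.execute("""SELECT e.EName, p.PName, m.MName
                     FROM REQUIRE r
                     JOIN WORKS_FOR w ON w.EmpId = r.EmpId AND w.ProjId = r.ProjId
                     JOIN EMPLOYEE e ON e.EmpId = w.EmpId
                     JOIN PROJECT p ON p.ProjId = w.ProjId
                     JOIN MACHINERY m ON m.MachId = r.MachId""").fetchone()
print(row)  # ('Asha', 'Bridge', 'Crane')
```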
Explain significance of Super key, Candidate key, Primary Key and Foreign Key with example.
Various Keys in Database Management System (DBMS)
Various Keys in Database Management System (DBMS): Keys are an essential part of the structure of a
table. A key uniquely identifies each record of a table using one field or a combination of fields. A
database is made up of tables, tables are made up of records, and records are made up of fields. This
article describes the different keys in a database management system (DBMS).
Types of Keys in Database Management System: The keys that ensure uniqueness are as follows:
1. Super key
2. Candidate key
3. Primary key
4. Composite key
5. Secondary or Alternative key
6. Non-key attribute
7. Non-prime attribute
8. Foreign key
9. Simple key
10. Compound key
11. Artificial key
The detailed explanation of all the mentioned keys is as follows:
1. Super key
A super key is a set of attributes within a table that uniquely identifies each record in the table. A
candidate key is a special case of a super key.
● For example: the Roll No. of a student is unique in the relation, so a set of attributes such as {Roll No.,
Name, Class, Age, Sex} is a super key for the relation Student.
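The uniqueness test behind a super key can be sketched in a few lines of Python (the sample rows are illustrative, not from the text):

```python
# A set of attributes is a super key if no two records agree on all of them.
def is_super_key(rows, attrs):
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False  # two records share these attribute values
        seen.add(value)
    return True

students = [
    {"roll_no": 1, "name": "Asha", "age": 19},
    {"roll_no": 2, "name": "Ravi", "age": 19},
    {"roll_no": 3, "name": "Asha", "age": 20},
]
print(is_super_key(students, ["roll_no"]))          # True: roll_no alone is a key
print(is_super_key(students, ["roll_no", "name"]))  # True: any superset of a key works
print(is_super_key(students, ["age"]))              # False: ages repeat
```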
2. Candidate keys
Candidate keys are the set of fields from which the primary key can be selected; each is a set of
attributes that can act as the primary key for a table. Every table must have at least one candidate key
and may have several. A candidate key is a minimal subset of a super key.
3. Primary key
The candidate key that is most suitable to serve as the main key of the table is chosen as the primary key.
4. Composite key
A composite key combines two or more attributes that together uniquely identify the occurrence of an entity.
● Example: the customer identity and the order identity have to be combined to uniquely identify the
customer's order details.
5. Non-key attribute
The attributes other than the candidate keys are called non-key attributes.
6. Non-prime attribute
Attributes that do not belong to any candidate key of the table are called non-prime attributes.
7. Foreign key
Generally, a foreign key is the primary key of one table that appears in another table to establish a
relationship between the two tables.
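A short SQLite sketch of a foreign key at work (the table and column names here are illustrative): Orders.CustID must match a CustID that already exists in Customers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this pragma to enforce FKs
con.execute("CREATE TABLE Customers (CustID INTEGER PRIMARY KEY, Name TEXT)")
con.execute("""CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY,
    CustID  INTEGER NOT NULL REFERENCES Customers(CustID))""")
con.execute("INSERT INTO Customers VALUES (1, 'ABC')")
con.execute("INSERT INTO Orders VALUES (100, 1)")  # accepted: customer 1 exists
try:
    con.execute("INSERT INTO Orders VALUES (101, 99)")  # no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the foreign key constraint blocks the orphan row
print("orphan row rejected:", rejected)  # orphan row rejected: True
```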
8. Simple key
A simple key uses a single field to uniquely identify a record; the single field cannot be divided
into further fields. A primary key can also be a simple key.
● Example: a student id is a single field that no other student shares, so it is a simple key.
9. Compound key
A compound key uses several fields to uniquely identify a record. A compound key differs from a
composite key in that any part of a compound key can be a foreign key, whereas a part of a composite
key may or may not be a foreign key.
10. Surrogate/Artificial key
A surrogate key is an artificially generated key whose main purpose is to serve as the primary key of a
table. Artificial keys carry no meaning for the table. Surrogate or artificial keys have a few properties:
They are unique, and they are created only when there is no natural primary key.
They are integer values.
One cannot find any meaning in a surrogate key's value within the table.
End users cannot see or use the surrogate key.
Surrogate keys are allowed when