
AC IV CTI-EN

7th Sem.

Assoc.Prof.Dr.Eng. Dan Pescaru


About the Database Design
Course
Assoc.Prof.Dr.Eng. Dan Pescaru
Monday 14:00-16:00 A212
Web site: www.cs.upt.ro/~dan/curs/dbd/
Labs
As.Drd.Eng. Ovidiu Parvu
B623
Evaluation
Lab mark (1/3)
Exam mark (2/3) written exam
Why Databases?
Modern world: Data -> Information -> Knowledge
Efficient Manipulation of Large Data Sets
Integration of Multiple Data Sources
The answer to these:
Database (DB): an organized collection of logically related
data. It can hold massively distributed data.
Database Management System (DBMS): a system that
stores and retrieves information from a database. It
offers support for access control, security, concurrency,
recovery etc.
Modern Databases
Part of our lives
Most websites use databases (e.g. Google, Yahoo, MSN)
Most payments involve databases (ATM, POS, Online)
Hotel rooms are found by searching a database
(Booking, Trivago, Kayak etc.)
Online learning platforms are built on databases (e.g.
Moodle, Sakai, Blackboard etc.)
In general, most businesses involve databases (from
individual businesses to global corporations)
Good news: lots of well-paid jobs in the database industry
Contents
Introduction
Oracle DBMS
Oracle SQL
Oracle PL/SQL
DB Administration
DB App. Development
Object-Relational DBMS
Oracle Object-relational Model
NoSQL Databases
Introduction
Database Management Systems
DBMS Architecture
DBMS Services
DB Related Jobs
Review
The Relational Model
Relational Algebra
Relational Query Languages
Oracle DBMS
Short History
Oracle Database
Oracle Express Edition, Oracle Database Lite, MySQL
Oracle Apex
Oracle Database Architecture
Oracle Processes
Oracle Administrative Tools
Oracle Warehouse Builder and Oracle OLAP
Oracle SQL
DDL
DML
DCL
Sequences
Views
Transactions
Oracle PL/SQL
Instruction set
Blocks
Data types
Cursors
Packages
Functions
Exception model
Triggers
Intro on Database Administration
Introduction
DB Configuration
Users Management
Roles Management
DB Objects Security
DB Maintenance
DB Benchmarks
Oracle Administration Tools
Database App. Development
Stored Procedures
DB-Call Level Interface
Embedded SQL
3-Tier Architectures
Oracle APEX
JDBC
ODBC
Object-Relational DBMS
Object-Relational Data Model
Nested Relations
SQL99. Complex Types
Persistent Objects
Inheritance
Path Expressions
OQL
Oracle Object-Relational Model
Oracle Objects
Advantage of using Objects
Oracle O-R Model
Table, Row and Column Objects
Oracle Reference Typed Values
Methods in Oracle
Language Binding Features PL/SQL Extensions
Oracle Call Interface
NoSQL Databases
Next Generation Databases - Non-relational,
Distributed, Horizontally Scalable ?
Web-scale Databases
NoSQL Data Model
Query Methods
Wide Column Store Systems (e.g. Cassandra)
Document Store Systems (e.g. MongoDB)
Key Value/Tuples Store Systems (e.g. Oracle NoSQL)
Graph Databases (e.g. Neo4J)
Review: Relational model & Databases
Database Management System
DBMS Services
DBMS Architecture
The Relational Model
Relational Algebra
Constraints
Normalization
Relational Query Languages
DBMS
Database Management System
A collection of programs for managing Databases
Basic functions: store, retrieve, modify and delete data
Usually has a Client-Server architecture
Includes a query language (navigational or declarative)
Many additional functions: user management,
transaction management, recovery mechanisms etc.
Maintains data (e.g. tables) and metadata (e.g. DB
catalog)
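The four basic functions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite is an embedded engine rather than a client-server DBMS, and the students table and its values are assumptions chosen for the example.

```python
import sqlite3

# In-memory database; SQLite is used only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT, year INTEGER)")

# Store
conn.execute("INSERT INTO students VALUES (?, ?, ?)", ("AC2153", "Pop Angela", 2))
# Retrieve
row = conn.execute("SELECT name, year FROM students WHERE sid = ?", ("AC2153",)).fetchone()
# Modify
conn.execute("UPDATE students SET year = year + 1 WHERE sid = ?", ("AC2153",))
updated_year = conn.execute("SELECT year FROM students WHERE sid = ?", ("AC2153",)).fetchone()[0]
# Delete
conn.execute("DELETE FROM students WHERE sid = ?", ("AC2153",))
remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]

# Metadata: SQLite keeps its catalog in the sqlite_master table.
catalog = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
```

The last query reads metadata rather than data: the catalog describes which tables exist, independently of their rows.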
DBMS Services
Consistency of Data
Static consistency (e.g. age > 0 for a person)
Dynamic consistency (e.g. a payment debits one account and
credits another with the same amount)
Integration of Data
Combining data residing in different sources/formats
and providing users with a unified view of these data
Sharing of Data - Concurrency Control
Minimal Data Redundancy - Normalization,
Anomalies
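As a rough sketch of the two kinds of consistency named above, the example below uses Python's sqlite3 module. The person and account tables, the non-negative balance rule and the transfer helper are all hypothetical; a CHECK constraint stands in for static consistency and a transaction for dynamic consistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Static consistency: a CHECK constraint rejects rows that violate the rule.
conn.execute("CREATE TABLE person (name TEXT, age INTEGER CHECK (age > 0))")
try:
    conn.execute("INSERT INTO person VALUES ('Ana', -5)")
    static_violation_rejected = False
except sqlite3.IntegrityError:
    static_violation_rejected = True

# Dynamic consistency: the debit and the credit succeed or fail together.
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst inside one transaction (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)       # both updates are committed
failed = transfer(conn, "A", "B", 500)  # would overdraw A, so the credit to B is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The credit is deliberately executed first, so the failed transfer shows that the rollback also undoes the update that had already succeeded.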
DBMS Benefits
Ease of Application Development
Stored Procedures, Triggers, Specialized IDE (e.g. Apex)
Uniform Security, Privacy - Integrity Controls
Access Control: users, passwords, roles
Specific encryption protocols
Input and Data validation, Backup mechanisms
Data Accessibility and Responsiveness
E.g. query distribution over several servers (2014
estimate: ~1M servers in Google data centers).
Data Independence - Logical, Physical and View levels
DBMS Architecture
3 levels
Physical level (storage view): how the data is stored
on the disks
Conceptual level (community user view): how the data
is viewed by a community of users (e.g. different
departments of a company)
External level (individual user view): how the data is
viewed by a single user (specific end-user)
The database schema at the conceptual level is usually a
union of external schemas
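One common way to realize an external level is a view defined over the conceptual schema. The sketch below assumes a students table and a second_year view that are invented for illustration; the physical level is left entirely to the engine (SQLite here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the table shared by the whole community of users.
conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT, year INTEGER, grade REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [("AC2153", "Pop Angela", 2, 8.50), ("AC1078", "Avram Ioan", 1, 9.35)],
)

# External level: one user's view hides the grade column and the other years.
conn.execute("CREATE VIEW second_year AS SELECT sid, name FROM students WHERE year = 2")
external_rows = conn.execute("SELECT * FROM second_year").fetchall()
```

A user who queries only second_year is unaffected if columns are later added to students, which is the data independence the levels are meant to provide.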
DBMS Administrators
Define conceptual schema
Define internal (physical) schema
Create and maintain the database
Define and implement the security layer
Define integrity checks
Define backup and recovery procedures
Monitor performance and decide when to reorganize
the database
Solve specific end-user issues
DB Designers
Investigate and extract end-user requirements
Define external schemas
Define all necessary views
Implement embedded business logic (stored
procedures or libraries)
Implement external business logic (as part of
end-user application programs)
Implement the presentation layer (end user forms,
reports etc.)
Content Motivation
Why Oracle: nowadays the de facto industry
standard. The biggest player (www.idc.com) in:
the business analytics market (17.9% - 2014)
the RDBMS market (45% - 2013).
Why NoSQL: to overcome the limitations of RDBs.
Manage unstructured/semi-structured data: Log files,
Blogs, Tweets, Text-Audio-Video streams etc.
For: Big Data, Internet of Things, Cloud Computing etc.
IDC estimates that in 2013 the combined size of the
world's digital data was 4.4 zettabytes (1 zettabyte = 1 trillion gigabytes)
and that by 2020 it will grow tenfold to 44 zettabytes.
Review: The Relational Model
Relational database: a collection formed by relations and
links between them
Relation:
Schema: relation name, attributes (columns) and attributes
domains (types)
Ex: Students(sid:String, name:String, year:Integer, grade:Real)
Instance: physical table, having a fixed number of columns
and rows (tuples)
Number of columns = relation DEGREE (ARITY)
Number of rows = relation CARDINALITY
Note: Unordered rows (records)!
Ordered columns (ex. SQL INSERT)
Relation Instance
Students
sid     name            year  grade
AC2153  Pop Angela      2     8.50
AC1078  Avram Ioan      1     9.35
AC2056  Ionescu Mihai   2     7.80
AC3098  Georgescu Ana   3     9.00
AC3023  Mihu Andrei     3     6.30

The Students relation


Degree: 4
Cardinality: 5
Obs: Contains only distinct rows
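The Students instance can be modelled in a few lines of Python to make degree, cardinality and the "distinct, unordered rows" property concrete. The Student type is a hypothetical name introduced for this sketch; the rows are the ones from the table.

```python
from typing import NamedTuple

class Student(NamedTuple):
    sid: str
    name: str
    year: int
    grade: float

# A relation instance is a set of tuples: unordered and without duplicates.
students = {
    Student("AC2153", "Pop Angela", 2, 8.50),
    Student("AC1078", "Avram Ioan", 1, 9.35),
    Student("AC2056", "Ionescu Mihai", 2, 7.80),
    Student("AC3098", "Georgescu Ana", 3, 9.00),
    Student("AC3023", "Mihu Andrei", 3, 6.30),
}

# Adding an identical row changes nothing, because a set keeps distinct rows only.
students.add(Student("AC2153", "Pop Angela", 2, 8.50))

degree = len(Student._fields)  # number of columns
cardinality = len(students)    # number of rows
```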
Relational Constraints
Integrity constraints - a collection of logical
statements (expressions) that must be satisfied by any
instance of the database
A database instance that satisfies all integrity
constraints is called legal (or in a consistent state)
Primary key (PK): a key chosen by the DB designer
from the set of keys to uniquely identify a row
Foreign Key (FK): links two tables (corresponds to the
primary key of the main relation), used to check the
referential integrity
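A small sketch of how a DBMS enforces the two key constraints, again using sqlite3. The faculty and student tables, the faculty name and the rejected helper are assumptions for illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE faculty (fid TEXT PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE student (
    sid TEXT PRIMARY KEY,
    name TEXT,
    fid TEXT REFERENCES faculty(fid))""")
conn.execute("INSERT INTO faculty VALUES ('AC', 'Automation and Computers')")
conn.execute("INSERT INTO student VALUES ('AC2153', 'Pop Angela', 'AC')")

def rejected(sql):
    """Return True if the DBMS refuses the statement as an integrity violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same sid twice: violates the primary key.
duplicate_pk = rejected("INSERT INTO student VALUES ('AC2153', 'Someone Else', 'AC')")
# fid 'XX' does not exist in faculty: violates referential integrity.
dangling_fk = rejected("INSERT INTO student VALUES ('XX0001', 'No Faculty', 'XX')")
```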
Relational Algebra
Basic operators:
Projection (π): deletes unwanted columns from the result
Selection (σ): selects a subset of rows from a relation
Cartesian product (×): combines two relations
Set difference (−): tuples in R1, but not in R2
Union (∪): tuples in R1 or in R2 (duplicates discarded)
Additional operations:
JOIN (⋈), Intersection, Division, Renaming
Since each operation returns a relation, operations
can be composed (Algebra is closed)
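The basic operators are small enough to sketch directly. In this illustrative model (not a DBMS implementation) a relation is a pair of attribute names and a set of value tuples; all function names are assumptions, and union and difference assume both operands share the same schema.

```python
from itertools import product as cartesian

# A relation is modelled as (attribute names, set of value tuples).
def project(rel, names):
    attrs, rows = rel
    idx = [attrs.index(n) for n in names]
    return (tuple(names), {tuple(row[i] for i in idx) for row in rows})

def select(rel, predicate):
    attrs, rows = rel
    return (attrs, {row for row in rows if predicate(dict(zip(attrs, row)))})

def product(r1, r2):
    (a1, rows1), (a2, rows2) = r1, r2
    return (a1 + a2, {x + y for x, y in cartesian(rows1, rows2)})

def difference(r1, r2):
    return (r1[0], r1[1] - r2[1])

def union(r1, r2):
    return (r1[0], r1[1] | r2[1])

students = (
    ("sid", "name", "year"),
    {("AC2153", "Pop Angela", 2), ("AC1078", "Avram Ioan", 1), ("AC2056", "Ionescu Mihai", 2)},
)

# Operators compose because each one returns a relation.
second_year_names = project(select(students, lambda t: t["year"] == 2), ["name"])
```

Because the rows are stored in a set, projection and union discard duplicates automatically, as the relational model requires.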
Data Redundancy
Redundancy is at the root of several problems
associated with relational schemas
Waste of storage space
Insert/delete/update anomalies
Main refinement technique: decomposition
Functional dependencies used to detect redundancy
A functional dependency X → Y holds over relation R
if, for every allowable instance r of R:
∀ t1 ∈ r, ∀ t2 ∈ r: π_X(t1) = π_X(t2) ⇒ π_Y(t1) = π_Y(t2)
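The definition translates into a short check over one instance. The function name fd_holds and the sample rows are hypothetical. A single instance can only refute a dependency; it cannot prove that the dependency holds for every allowable instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by this instance (rows are dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

r = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 1, "B": "x", "C": 20},
    {"A": 2, "B": "x", "C": 10},
]

a_to_b = fd_holds(r, ["A"], ["B"])  # satisfied: equal A values always share the same B
a_to_c = fd_holds(r, ["A"], ["C"])  # violated: A = 1 appears with C = 10 and C = 20
b_to_a = fd_holds(r, ["B"], ["A"])  # violated: B = "x" appears with A = 1 and A = 2
```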
Review: Normalization
Solution to minimize the redundancy: normalization
using decomposition
Reason: if a relation is in a certain normal form
certain kinds of problems are avoided/ minimized
FDs could be used in detecting redundancy (R:ABC)
No FDs hold: There is no redundancy here!
Given A → B: several tuples could have the same A value,
and if so, they will all have the same B value!
Five normal forms. A DB is considered to be
normalized if it is in 3NF
Lossless Join Decomposition
Decomposition of R into X and Y is lossless-join
with respect to a set of FDs F if, for every instance
r that satisfies F:
π_X(r) ⋈ π_Y(r) = r

It is essential that all decompositions used to deal
with redundancy be lossless!
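The lossless-join condition can be tested on a concrete instance by projecting, joining and comparing. The helper names and the sample instance are assumptions for this sketch. The definition quantifies over every instance satisfying F, so a single passing instance is evidence, not proof.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    return {tuple(row[a] for a in attrs) for row in rows}

def natural_join(left, left_attrs, right, right_attrs):
    """Natural join of two relations given as sets of tuples plus attribute lists."""
    common = [a for a in left_attrs if a in right_attrs]
    extra = [a for a in right_attrs if a not in left_attrs]
    result = set()
    for l in left:
        lrow = dict(zip(left_attrs, l))
        for r in right:
            rrow = dict(zip(right_attrs, r))
            if all(lrow[a] == rrow[a] for a in common):
                result.add(tuple(lrow[a] for a in left_attrs) + tuple(rrow[a] for a in extra))
    return result, list(left_attrs) + extra

def is_lossless(rows, schema, x, y):
    """Check whether joining the projections on x and y gives back exactly rows."""
    joined, joined_attrs = natural_join(project(rows, x), x, project(rows, y), y)
    # Reorder the joined columns to the original schema before comparing.
    reordered = {tuple(dict(zip(joined_attrs, t))[a] for a in schema) for t in joined}
    return reordered == project(rows, schema)

# Instance of R(A, B, C) that satisfies B -> C.
r = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 5, "C": 3}]

good = is_lossless(r, ["A", "B", "C"], ["A", "B"], ["B", "C"])  # common attribute B determines C
bad = is_lossless(r, ["A", "B", "C"], ["A", "C"], ["B", "C"])   # join on C creates spurious tuples
```

In the second decomposition the join returns four tuples instead of two: no original tuple is lost, but the spurious ones mean information about which A goes with which B is gone.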
Dependency Preservation
A decomposition has to preserve all dependencies of
the original relation in order to keep all
constraints!
Condition: (F1 ∪ F2)+ = F+
Example
R=(A,B,C), F={A→B, A→C, B→C}
R1=(A,C), R2=(B,C)
F1+={A→A, C→C, A→C, AC→AC}
F2+={B→B, C→C, B→C, BC→BC}
Note: A→B is not preserved
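The example can be checked mechanically with attribute closures. The sketch below is one straightforward (exponential) way to test dependency preservation; the function names are assumptions, and FDs are written as (lhs, rhs) pairs of frozensets.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project_fds(fds, sub):
    """Non-trivial FDs of F+ that mention only attributes of the sub-schema."""
    projected = []
    for size in range(1, len(sub) + 1):
        for lhs in combinations(sorted(sub), size):
            rhs = (closure(lhs, fds) & sub) - set(lhs)
            if rhs:
                projected.append((frozenset(lhs), frozenset(rhs)))
    return projected

def lost_dependencies(fds, parts):
    """FDs of F that are not implied by the union of the projected FDs."""
    union = [fd for part in parts for fd in project_fds(fds, part)]
    return [(lhs, rhs) for lhs, rhs in fds if not rhs <= closure(lhs, union)]

# F = {A->B, A->C, B->C} decomposed into R1 = (A, C) and R2 = (B, C).
F = [
    (frozenset("A"), frozenset("B")),
    (frozenset("A"), frozenset("C")),
    (frozenset("B"), frozenset("C")),
]
lost = lost_dependencies(F, [frozenset("AC"), frozenset("BC")])
```

For this input lost contains only A→B, in agreement with the note in the example.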
First Normal Form
First normal form (1NF):
The domain of each attribute must contain only
atomic values; composite fields or relations nested
within a relation are prohibited
Each attribute contains only a single value in the
domain
Students
sid     name              hobby              fid  faddress
AC6978  Popescu Mihai     chess, dance       AC   Timisoara, Parvan no 3, 0256112212
AC8967  Ionescu Georgeta  reading, painting  AC   Timisoara, Parvan no 1 ???, 0256112212


Second Normal Form
Second normal form (2NF):
Relation is already in 1NF
Any non-prime attribute of R (not part of the primary
key) must be fully functionally dependent on the
primary key of R
OR: there are no attributes that depend only on a part
of the primary key
Invoice
id      date        poz  name    price
008978  01-03-2014  1    bread   2
008978  01-03-2014  2    apples  6
099488  05-03-2014  1    cheese  17

Third Normal Form
Third normal form (3NF):
Relation is already in 2NF
Every non-prime attribute is non-transitively
dependent on every candidate key in the table. In other
words, no transitive dependency is allowed
BCNF: Every non-trivial functional dependency in the
table is a dependency on a superkey
Items
id      poz  name    quantity  unit_price  price
008978  1    bread   2         2           4
008978  2    apples  5         6           30
099488  1    cheese  1         17          17
Fourth Normal Form
Fourth normal form (4NF):
Relation is already in 3NF
There are no non-trivial multivalued dependencies
(Multivalued dependency: considering the schema ABC, a
multivalued dependency exists if each A corresponds to many B and
many C, but B and C are independent of each other)
Components
Department  Project        Component
D1          Pr1, Pr2       C1, C2, C3
D2          Pr2, Pr3, Pr5  C2, C4

Components (3NF)
Department  Project  Component
D1          Pr1      C1
D1          Pr1      C2
D1          Pr1      C3
D1          Pr2      C1
D2          Pr2      C2
...         ...      ...
D2          Pr5      C4
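A possible 4NF decomposition of the Components example splits the two independent multivalued facts into separate relations. The data below is an assumed reading of the slide's table (each department's projects and components are taken to be independent), and the relation names are hypothetical.

```python
from itertools import product

# Assumed data: each department has independent sets of projects and components.
projects = {"D1": {"Pr1", "Pr2"}, "D2": {"Pr2", "Pr3", "Pr5"}}
components = {"D1": {"C1", "C2", "C3"}, "D2": {"C2", "C4"}}

# The single 3-column relation must store every project/component combination.
flat = {(d, p, c) for d in projects for p, c in product(projects[d], components[d])}

# 4NF decomposition: one relation per independent multivalued fact.
dept_project = {(d, p) for d, p, _ in flat}
dept_component = {(d, c) for d, _, c in flat}

# Joining the two projections on Department recovers the original relation.
rejoined = {(d, p, c) for d, p in dept_project for d2, c in dept_component if d == d2}
```

With this data the flat relation needs 12 rows while the two decomposed relations need 5 each, and adding one project to D1 becomes a single insert instead of one insert per component.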
Relational Query Languages
Relational query language: designed for data finding,
retrieving and management
The relational model enables simple and powerful query
languages (e.g. SQL, QBE):
Strong formal foundation based on algebra/logic
Allows fine optimization
Query languages are not programming languages:
Pure query languages are not Turing-complete (e.g. SQL-92), but
extensions such as PL/SQL can be
E.g. recursive (transitive) closure cannot be expressed in SQL-92
But they provide efficient data access!
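The transitive closure limitation comes from the fact that the number of joins needed depends on the data, so no fixed relational algebra expression covers every instance. A host language can simply loop until nothing new appears, as in this sketch; the prereq relation and its course names are hypothetical. SQL:1999 later added recursive queries (WITH RECURSIVE) for the same purpose.

```python
def transitive_closure(edges):
    """Repeat the self-join of the relation until no new pairs appear (fixpoint)."""
    closure = set(edges)
    while True:
        # One join step: (a, b) and (b, d) yield (a, d).
        new_pairs = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Hypothetical relation of (course, direct prerequisite) pairs.
prereq = {("DBD", "DB"), ("DB", "DS"), ("DS", "Prog")}
all_prereqs = transitive_closure(prereq)
```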