Sr. No.   Practical Title                                        Date   Page No.   Sign
1.        Create two databases either on a single DBMS and
          design the database to fragment and share the
          fragments from both databases; write a single query
          for creating a view.
150180107003 1
Distributed DBMS(2170714)
PRACTICAL – 1
AIM: Create two databases, either on a single DBMS, and design the database to fragment and
share the fragments from both databases; write a single query for creating a view.
Solution:
Creating a table computers in the database manoj:
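The journal records this practical as screenshots. As a minimal sketch of the same idea in Python's built-in sqlite3 module, two attached databases can stand in for the two DBMS instances; the database name other, the column names and the sample rows are assumptions for illustration:

```python
import sqlite3

# Two databases standing in for the two DBMS instances.
main = sqlite3.connect(":memory:")
main.execute("ATTACH DATABASE ':memory:' AS manoj")
main.execute("ATTACH DATABASE ':memory:' AS other")

# Horizontal fragments of the 'computers' table, one per database.
main.execute("CREATE TABLE manoj.computers (id INTEGER, name TEXT)")
main.execute("CREATE TABLE other.computers (id INTEGER, name TEXT)")
main.execute("INSERT INTO manoj.computers VALUES (1, 'PC-1'), (2, 'PC-2')")
main.execute("INSERT INTO other.computers VALUES (3, 'PC-3')")

# A single query creating a view that unions the shared fragments.
# (A TEMP view is used because in SQLite only temporary views may
# reference tables in other attached databases.)
main.execute("""
    CREATE TEMP VIEW all_computers AS
    SELECT * FROM manoj.computers
    UNION ALL
    SELECT * FROM other.computers
""")
rows = main.execute("SELECT id, name FROM all_computers ORDER BY id").fetchall()
print(rows)   # all three rows, reassembled from both fragments
```

Querying the view hides the fragmentation: the application sees one logical computers table.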
PRACTICAL – 2
AIM: Create two databases on two different computer systems and create a database view
to generate a single DDB.
Solution:
Creating a table computers in the database manoj (on Computer 1):
exec sp_addlinkedsrvlogin @rmtsrvname='localhost', @useself='false',
    @rmtuser='computer1', @rmtpassword='1234';
Linking the second computer:
exec sp_addlinkedsrvlogin @rmtsrvname='(36061)', @useself='false',
    @rmtuser='computer2', @rmtpassword='4321';
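The sp_addlinkedsrvlogin calls above are SQL Server specific. The end result, one view spanning databases on two machines, can be sketched with Python's sqlite3, where ATTACH plays the role that linked servers play in SQL Server; the two database files merely simulate the two computers, and the file names and sample data are assumptions:

```python
import sqlite3, tempfile, os

tmp = tempfile.mkdtemp()
db1 = os.path.join(tmp, "computer1.db")   # database on "Computer 1"
db2 = os.path.join(tmp, "computer2.db")   # database on "Computer 2"

# Populate each database independently, as if on separate machines.
for path, rows in [(db1, [(1, 'HP')]), (db2, [(2, 'Dell')])]:
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE computers (id INTEGER, name TEXT)")
    con.executemany("INSERT INTO computers VALUES (?, ?)", rows)
    con.commit()
    con.close()

# "Link" the remote database and build one view over both, giving a
# single logical DDB.
con = sqlite3.connect(db1)
con.execute("ATTACH DATABASE ? AS computer2", (db2,))
con.execute("""
    CREATE TEMP VIEW ddb_computers AS
    SELECT * FROM main.computers
    UNION ALL
    SELECT * FROM computer2.computers
""")
merged = con.execute("SELECT id, name FROM ddb_computers ORDER BY id").fetchall()
print(merged)   # rows from both "computers" in one result
```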
PRACTICAL – 3
AIM: Create various views using any one example database and design various
constraints.
Solution:
VIEW :
CONSTRAINTS :
SQL constraints are used to specify rules for the data in a table.
Constraints are used to limit the type of data that can go into a table.
This ensures the accuracy and reliability of the data in the table. If there is any violation
between the constraint and the data action, the action is aborted.
The following constraints are commonly used in SQL:
NOT NULL - Ensures that a column cannot have a NULL value.
UNIQUE - Ensures that all values in a column are different.
PRIMARY KEY - A combination of NOT NULL and UNIQUE. Uniquely identifies each row in a
table.
FOREIGN KEY - Uniquely identifies a row/record in another table.
CHECK - Ensures that all values in a column satisfy a specific condition.
DEFAULT - Sets a default value for a column when no value is specified.
INDEX - Used to create and retrieve data from the database very quickly.
Examples,
NOT NULL
UNIQUE
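The constraint screenshots above are not reproduced here. As a self-contained sketch, the same constraints can be exercised with Python's sqlite3; the table and column names are assumptions for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A table exercising the constraints listed above.
con.execute("""
    CREATE TABLE persons (
        id      INTEGER PRIMARY KEY,        -- NOT NULL + UNIQUE
        email   TEXT UNIQUE,
        name    TEXT NOT NULL,
        age     INTEGER CHECK (age >= 18),
        city    TEXT DEFAULT 'Ahmedabad'
    )
""")
con.execute("INSERT INTO persons (id, email, name, age) VALUES (1, 'a@x.com', 'Manoj', 21)")

# Each violating insert is aborted with an IntegrityError.
violations = []
for stmt in [
    "INSERT INTO persons (id, email, name, age) VALUES (2, 'b@x.com', NULL, 30)",  # NOT NULL
    "INSERT INTO persons (id, email, name, age) VALUES (3, 'a@x.com', 'Raj', 25)", # UNIQUE
    "INSERT INTO persons (id, email, name, age) VALUES (4, 'c@x.com', 'Ila', 15)", # CHECK
]:
    try:
        con.execute(stmt)
    except sqlite3.IntegrityError as e:
        violations.append(str(e))
print(len(violations))          # 3: every bad row was aborted

# DEFAULT applied because no city was specified for row 1.
city = con.execute("SELECT city FROM persons WHERE id = 1").fetchone()[0]
print(city)                     # Ahmedabad
```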
PRACTICAL – 4
AIM: Write and implement an algorithm for query processing, using any example, in C
/ C++ / Java / .NET.
Solution:
Query processing includes the translation of high-level queries into low-level expressions that
can be used at the physical level of the file system, query optimization, and the actual
execution of the query to obtain the result.
Parser: During the parse call, the database converts the query into relational algebra and then
performs the following checks (refer to the detailed diagram): syntax check, semantic check
and shared pool check.
Syntax check verifies the syntactic validity of the SQL statement.
Semantic check determines whether the statement is meaningful. For example, a query that
refers to a table_name which does not exist is rejected by this check.
Shared pool check: every query is given a hash code before execution. This check determines
whether that hash code already exists in the shared pool; if it does, the database reuses the
cached plan and takes no additional steps for optimization and execution.
Optimizer: During the optimization stage, the database must perform a hard parse at least once
for every unique DML statement, and it performs optimization during this parse. The database
never optimizes DDL unless it includes a DML component, such as a subquery, that requires
optimization.
Optimization is the process in which multiple execution plans for satisfying a query are
examined and the most efficient plan is selected for execution.
The database catalog stores the execution plans, and the optimizer passes the lowest-cost plan
on for execution.
Execution engine: Finally runs the query and displays the required result.
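The parser / optimizer / execution-engine stages above can be sketched as a toy pipeline. This is in Python rather than the C/C++/Java/.NET named in the aim, and the mini-grammar, catalog contents and plan cache are invented for illustration:

```python
import re

# Toy catalog: table name -> rows.
CATALOG = {"students": [{"id": 1, "name": "Manoj"}, {"id": 2, "name": "Raj"}]}
PLAN_CACHE = {}   # stands in for the shared pool, keyed by query hash

def parse(sql):
    # Syntax check: a tiny SELECT-only grammar.
    m = re.fullmatch(r"SELECT (\w+) FROM (\w+)", sql.strip())
    if not m:
        raise SyntaxError("syntax check failed")
    column, table = m.groups()
    # Semantic check: the table must exist in the catalog.
    if table not in CATALOG:
        raise ValueError("semantic check failed: no table %r" % table)
    return column, table

def optimize(sql):
    # Shared pool check: reuse the plan if this query was seen before.
    key = hash(sql)
    if key not in PLAN_CACHE:
        column, table = parse(sql)
        # "Optimization": here the only available plan is a full scan.
        PLAN_CACHE[key] = (column, table)
    return PLAN_CACHE[key]

def execute(sql):
    # Execution engine: run the chosen plan and return the result.
    column, table = optimize(sql)
    return [row[column] for row in CATALOG[table]]

print(execute("SELECT name FROM students"))   # ['Manoj', 'Raj']
```

Running the same query twice hits the plan cache the second time, mimicking a soft parse.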
PRACTICAL – 5
AIM: Using any example, write various transaction statements and show information
about concurrency control [i.e. the various locks from the dictionary] by executing multiple
updates and queries.
Solution:
INTRODUCTION
When more than one transaction is running simultaneously, there is a chance of a conflict
occurring which can leave the database in an inconsistent state.
To handle these conflicts we need concurrency control in the DBMS, which allows transactions
to run simultaneously but handles them in such a way that the integrity of the data remains
intact.
LOCKS:
A lock is a mechanism that ensures the integrity of data is maintained. There are two types
of lock that can be placed while accessing data, so that a concurrent transaction cannot alter
the data while we are processing it.
1. Shared Lock(S)
2. Exclusive Lock(X)
1. Shared Lock (S): A shared lock is placed when we are reading the data. Multiple shared
locks can be placed on the data, but while a shared lock is held no exclusive lock can be placed.
2. Exclusive Lock (X): An exclusive lock is placed when we want to read and write the data.
This lock allows both read and write operations. Once this lock is placed on the data, no other
lock (shared or exclusive) can be placed on it until the exclusive lock is released.
        S       X
S       True    False
X       False   False
There are two rows. The first row says that when an S lock is placed, another S lock can be
acquired (marked True) but no exclusive lock can be acquired (marked False).
In the second row, when an X lock is acquired, neither an S nor an X lock can be acquired, so
both are marked False.
Lock Examples,
Viewing details,
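The compatibility matrix above can be sketched as a small lock manager in Python. This is a simplified, non-blocking sketch (requests are refused rather than queued, unlike a real DBMS lock manager):

```python
from threading import Lock

class RWLock:
    """Minimal shared(S)/exclusive(X) lock following the S/X matrix."""
    def __init__(self):
        self.mutex = Lock()       # protects the counters below
        self.readers = 0          # number of S locks currently held
        self.writer = False       # whether an X lock is held

    def acquire_s(self):
        with self.mutex:
            if self.writer:
                return False      # S is incompatible with a held X
            self.readers += 1
            return True

    def acquire_x(self):
        with self.mutex:
            if self.writer or self.readers:
                return False      # X is incompatible with both S and X
            self.writer = True
            return True

    def release_s(self):
        with self.mutex:
            self.readers -= 1

    def release_x(self):
        with self.mutex:
            self.writer = False

lock = RWLock()
print(lock.acquire_s())   # True  - first shared lock
print(lock.acquire_s())   # True  - S/S is compatible
print(lock.acquire_x())   # False - S/X is not
lock.release_s(); lock.release_s()
print(lock.acquire_x())   # True  - the data is now free
print(lock.acquire_s())   # False - X/S is not
```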
PRACTICAL – 6
AIM: Using transaction commit/rollback, show the transaction ACID properties.
Solution:
Atomicity This property states that a transaction must be treated as an atomic unit: either all
of its operations are executed or none of them are.
Consistency The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database.
Durability The database should be durable enough to hold all its latest updates even if the
system fails or restarts.
Isolation In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that every transaction will be
carried out and executed as if it were the only transaction in the system.
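Atomicity via commit/rollback can be shown with a minimal sketch in Python's sqlite3 (the account table and amounts are assumptions for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
con.execute("INSERT INTO account VALUES ('A', 100), ('B', 100)")
con.commit()

# Atomicity: a transfer is all-or-nothing. Force a failure between the
# debit and the credit, then roll back the partial transfer.
try:
    con.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    raise RuntimeError("simulated crash before the credit step")
    con.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
except RuntimeError:
    con.rollback()                 # undo the half-done transfer

balances = dict(con.execute("SELECT name, balance FROM account"))
print(balances)                    # {'A': 100, 'B': 100} - nothing half-done
```

Without the rollback, A would have been debited with no matching credit to B, violating consistency as well.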
PRACTICAL – 7
AIM: Write a Java JDBC program and use JTA to show the various isolation levels in a transaction.
Solution:
TRANSACTION
When auto-commit mode is disabled, no SQL statements are committed until you call the
method commit explicitly.
All statements executed after the previous call to commit are included in the current
transaction and committed together as a unit.
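The same auto-commit behaviour can be demonstrated outside JDBC. A hedged sketch with Python's sqlite3, where auto-commit is likewise disabled by default so DML statements join one open transaction until commit (the file and table names are assumptions):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "txn.db")
writer = sqlite3.connect(path)
writer.execute("CREATE TABLE t (v INTEGER)")
writer.commit()

reader = sqlite3.connect(path)   # a second "session" on the same database

writer.execute("INSERT INTO t VALUES (1)")
writer.execute("INSERT INTO t VALUES (2)")
# Until commit(), the reader's connection sees neither insert ...
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
writer.commit()                  # ... then both statements commit as one unit
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(before, after)             # 0 2
```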
-----------------------------------------------JAVA Programming------------------------------------------------------
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
Statement st = conn.createStatement();
id = keys.getInt(1);
}
conn.commit();
System.out.println("Transaction commit...");
System.out.println("Connection rollback...");
e.printStackTrace();
} finally {
}
}
}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Savepoint;
import java.sql.Statement;
conn.commit();
conn.rollback (mySavepoint);
st.close();
conn.close();
}
Class.forName("org.hsqldb.jdbcDriver");
String url = "jdbc:hsqldb:mem:data/tutorial";
• Bitronix
• Atomikos
• RedHat Narayana
In this test, we are going to use Bitronix:
<bean id="jtaTransactionManager" factory-method="getTransactionManager"
class="bitronix.tm.TransactionManagerServices" depends-on="btmConfig, dataSource"
destroy-method="shutdown"/>
<bean id="transactionManager"
class="org.springframework.transaction.jta.JtaTransactionManager">
<property name="transactionManager" ref="jtaTransactionManager"/>
</bean>
FOLLOWING EXCEPTION:
OUTPUT:
DEBUG [main]: c.v.s.i.StoreServiceImpl - Transaction isolation level is READ_COMMITTED
Even with this extra configuration, the transaction-scoped isolation level wasn’t propagated to the
underlying database connection, as this is the default JTA transaction manager behavior.
PRACTICAL – 8
AIM: Implement Two Phase Commit Protocol.
Solution:
------------------------------------------PYTHON Programming---------------------------------------------------------
'''
https://en.wikipedia.org/wiki/Two-phase_commit_protocol
'''
import logging
from threading import Thread, Semaphore, Lock

_fmt = '%(levelname)s [%(user)s] %(message)s'
logging.basicConfig(format=_fmt)
LOG = logging.getLogger(__name__)
LOG.setLevel(logging.DEBUG)
MIN_ACCOUNT = 0
MAX_ACCOUNT = 100
NO_COHORTS = 2
class Coordinator(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.start_sem = Semaphore(0)
        self.ack_sem = Semaphore(0)
        self.cohorts = []
        self.votes = []
        self.acks = []
        self._log_extra = dict(user='COORD')

    def add_cohort(self, cohort):
        self.cohorts.append(cohort)
        self.start_sem.release()

    def yes(self):
        self.votes.append(True)

    def no(self):
        self.votes.append(False)

    def ack(self):
        self.acks.append(True)
        self.ack_sem.release()

    def run(self):
        # wait until every cohort has registered
        for _ in range(NO_COHORTS):
            self.start_sem.acquire()
        ## Voting Phase:
        for cohort in self.cohorts:
            cohort.query_to_commit()
        ## Commit Phase:
        if all(self.votes):
            for cohort in self.cohorts:
                cohort.commit()
        else:
            for cohort in self.cohorts:
                cohort.rollback()
        # wait for an acknowledgement from every cohort
        for _ in range(NO_COHORTS):
            self.ack_sem.acquire()
        if all(self.acks):
            LOG.info('END', extra=self._log_extra)
        for cohort in self.cohorts:
            cohort.end()
class Cohort(Thread):
    def __init__(self, uname, coord):
        Thread.__init__(self)
        self.uname = uname
        self.coord = coord
        self.do = None
        self.undo = None
        self.account = 50           # starting balance (assumed value)
        self.res = False            # this cohort's vote
        self.decided_commit = None  # the coordinator's decision
        self.sem = Semaphore(0)
        self.ready = Semaphore(0)
        self.lock = Lock()
        self._log_extra = dict(user=uname)
        coord.add_cohort(self)

    def query_to_commit(self):
        ## Voting phase:
        self.ready.acquire()        # wait until this cohort has voted
        if self.res:
            self.coord.yes()
        else:
            self.coord.no()

    def commit(self):
        self.decided_commit = True
        self.sem.release()

    def rollback(self):
        self.decided_commit = False
        self.sem.release()

    def end(self):
        self.sem.release()

    def run(self):
        ## Voting phase:
        # 1. Each cohort executes the transaction up to the point where it
        #    will be asked to commit. They each write an entry to their undo log.
        self.lock.acquire()
        for do in self.do:
            do()
        # vote yes only if the local constraint still holds
        self.res = MIN_ACCOUNT <= self.account <= MAX_ACCOUNT
        self.ready.release()
        # wait for the coordinator's decision
        self.sem.acquire()
        ## Commit phase:
        if self.decided_commit:
            LOG.info('commit', extra=self._log_extra)
        else:
            # 2. Each cohort undoes the transaction using the undo log ...
            for undo in self.undo:
                undo()
            LOG.info('rollback', extra=self._log_extra)
        # 2. ... and releases all the locks and resources held during the
        #    transaction.
        self.lock.release()
        self.coord.ack()
        self.sem.acquire()          # wait for the coordinator's END
if __name__ == '__main__':
    coord = Coordinator()
    u1 = Cohort('user1', coord)
    u2 = Cohort('user2', coord)
    amount = 10    # amount transferred from user1 to user2 (assumed value)
    def u1_do():
        u1.account -= amount
    def u1_undo():
        u1.account += amount
    def u2_do():
        u2.account += amount
    def u2_undo():
        u2.account -= amount
    u1.do = [u1_do, ]
    u2.do = [u2_do, ]
    u1.undo = [u1_undo, ]
    u2.undo = [u2_undo, ]
    coord.start()
    u1.start()
    u2.start()
    u2.join()
    u1.join()
    coord.join()
OUTPUT:
PRACTICAL – 9
AIM: Case study on NoSQL.
Solution:
INTRODUCTION
NoSQL encompasses a wide variety of different database technologies that were developed
in response to the demands presented in building modern applications:
• Developers are working with applications that create massive volumes of new, rapidly
changing data types — structured, semi-structured, unstructured and polymorphic data.
• Long gone is the twelve-to-eighteen month waterfall development cycle. Now small
teams work in agile sprints, iterating quickly and pushing code every week or two, some
even multiple times every day.
• Applications that once served a finite audience are now delivered as services that must
be always-on, accessible from many different devices and scaled globally to millions of
users.
• Organizations are now turning to scale-out architectures using open source software,
commodity servers and cloud computing instead of large monolithic servers and storage
infrastructure.
Relational databases were not designed to cope with the scale and agility challenges
that face modern applications, nor were they built to take advantage of the commodity
storage and processing power available today.
SCALABILITY
NoSQL databases were designed when cloud computing and server clusters were already the
de facto standard. So you'd think they'd be able to scale seamlessly.
In many respects, they can. But there are still scalability challenges. For example, not all
NoSQL databases are good at automating the process of sharding, which means spreading a
database across multiple nodes. If a database can't shard automatically, it can't scale up or
down automatically in response to fluctuating demand.
SQL databases are subject to these same sorts of problems. In many ways, in fact, SQL is even
worse at scaling than most NoSQL databases. Still, the fact that NoSQL is not completely
scalable in all situations constitutes another hurdle, especially as the DevOps revolution
makes the rest of the software stack more scalable than ever.
PRACTICAL – 10
AIM: Case study on Hadoop.
Solution:
INTRODUCTION
Hadoop is an open source distributed processing framework that manages data processing
and storage for big data applications running in clustered systems. It is at the center of a
growing ecosystem of big data technologies that are primarily used to support advanced
analytics initiatives, including predictive analytics, data mining and machine learning
applications. Hadoop can handle various forms of structured and unstructured data, giving
users more flexibility for collecting, processing and analyzing data than relational databases
and data warehouses provide.
Formally known as Apache Hadoop, the technology is developed as part of an open source
project within the Apache Software Foundation (ASF). Commercial distributions of Hadoop
are currently offered by four primary vendors of big data platforms: Amazon Web Services
(AWS), Cloudera, Hortonworks and MapR Technologies. In addition, Google, Microsoft and
other vendors offer cloud-based managed services that are built on top of Hadoop and
related technologies.
Hadoop was created by computer scientists Doug Cutting and Mike Cafarella, initially to
support processing in the Nutch open source search engine and web crawler. After Google
published technical papers detailing its Google File System (GFS) and MapReduce
programming framework in 2003 and 2004, respectively, Cutting and Cafarella modified their
earlier technology plans and developed a Java-based MapReduce implementation and a file
system modeled on Google's GFS.
In early 2006, those elements were split off from Nutch and became a separate Apache
subproject, which Cutting named Hadoop after his son's stuffed elephant. At the same time,
Cutting was hired by internet services company Yahoo, which became the first production user
of Hadoop later in 2006. (Cafarella, then a graduate student, went on to become a university
professor.)
Use of the framework grew over the next few years, and three independent Hadoop vendors
were founded: Cloudera in 2008, MapR a year later and Hortonworks as a Yahoo spinoff in
2011. In addition, AWS launched a Hadoop cloud service called Elastic MapReduce in 2009.
That was all before Apache released Hadoop 1.0.0, which became available in December 2011
after a succession of 0.x releases.
COMPONENTS OF HADOOP
The core components in the first iteration of Hadoop were MapReduce, the Hadoop
Distributed File System (HDFS) and Hadoop Common, a set of shared utilities and libraries. As
its name indicates, MapReduce uses map and reduce functions to split processing jobs into
multiple tasks that run at the cluster nodes where data is stored and then to combine what the
tasks produce into a coherent set of results. MapReduce initially functioned as both Hadoop's
processing engine and cluster resource manager, which tied HDFS directly to it and limited
users to running MapReduce batch applications.
That changed in Hadoop 2.0, which became generally available in October 2013 when version
2.2.0 was released. It introduced Apache Hadoop YARN, a new cluster resource management
and job scheduling technology that took over those functions from MapReduce. YARN -- short
for Yet Another Resource Negotiator but typically referred to by the acronym alone -- ended
the strict reliance on MapReduce and opened up Hadoop to other processing engines and
various applications besides batch jobs.
Even the remaining vendors have hedged their bets on Hadoop itself by expanding their big
data platforms to also include Spark and numerous other technologies. Spark, which runs
both batch and real-time workloads, has ousted MapReduce in many batch applications and
can bypass HDFS to access data from Amazon Simple Storage Service (S3) in the AWS cloud --
a capability supported by Cloudera and Hortonworks, as well as AWS itself. In 2017, both
Cloudera and Hortonworks dropped the word Hadoop from the names of their rival
conferences for big data users.