Computer Laboratory - I

EL-1

Experiment No: 01
Title: Using Divide and Conquer Strategies
design a function for Binary Search using C.
Roll No:________
Batch:_____

Class:_____

Date of Performance: ___ /___/_____


Date of Assessment : ___ /___/_____

Particulars                  Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member

Experiment No 01
Dept. of Computer Engg. (ZESs DCOER,Pune)

TITLE:- Using Divide and Conquer strategies, design a function for Binary Search using C++/Java
AIM:-

Implement binary search using the divide and conquer strategy

SOFTWARE REQUIRED:- Windows Operating Systems, JAVA/TurboC


PREREQUISITES:- Basic knowledge of C++ and Java programming.
OBJECTIVES:

To understand the importance of divide and conquer strategies
To learn binary search

THEORY:
Divide & conquer Strategy:
A divide and conquer algorithm works by recursively breaking down a problem into two or more
sub-problems of the same (or related) type (divide), until these become simple enough to be
solved directly (conquer). The solutions to the sub-problems are then combined to give a
solution to the original problem.
Binary Search:
The binary search algorithm begins by comparing the target value to the value of the middle
element of the sorted array. If the target value is equal to the middle element's value, the position
is returned. If the target value is smaller, the search continues on the lower half of the array, or if
the target value is larger, the search continues on the upper half of the array. This process
continues until the element is found and its position is returned, or there are no more elements
left to search for in the array and a "not found" indicator is returned.

Algorithm:
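#define KEY_NOT_FOUND -1   /* returned when the key is absent */

/* Search the sorted array A[low..high] for key; return its index. */
int binary_search(int A[], int key, int low, int high)
{
    if (high < low)
        return KEY_NOT_FOUND;               /* empty range: key is not present */
    else
    {
        int mid = low + (high - low) / 2;   /* avoids overflow of low + high */
        if (A[mid] > key)
            return binary_search(A, key, low, mid - 1);    /* search lower half */
        else if (A[mid] < key)
            return binary_search(A, key, mid + 1, high);   /* search upper half */
        else
            return mid;                     /* key found */
    }
}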
Time Complexity:
Binary search is a logarithmic algorithm: each comparison halves the search interval, so it
executes in O(log N) time for a sorted array of N elements.
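The logarithmic bound can be seen from a short recurrence. As a sketch, assume each call does a
constant amount of work c (a symbol introduced here for illustration) and then recurses on half
of the range, with N a power of two:

```latex
T(N) = T\!\left(\tfrac{N}{2}\right) + c, \qquad T(1) = c
\quad\Longrightarrow\quad
T(N) = c\,(\log_2 N + 1) = O(\log N)
```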

Example:
For example, consider the following sequence of integers sorted in ascending order and say we
are looking for the number 55:
0   5   13   19   22   41   55   68   72   81   98

We are interested in the location of the target value in the sequence so we will represent the
search space as indices into the sequence. Initially, the search space contains indices 1 through
11. Since the search space is really an interval, it suffices to store just two numbers, the low and
high indices. As described above, we now choose the median value, which is the value at index 6
(the midpoint between 1 and 11): this value is 41 and it is smaller than the target value. From this
we conclude not only that the element at index 6 is not the target value, but also that no element
at indices between 1 and 5 can be the target value, because all elements at these indices are
smaller than 41, which is smaller than the target value. This brings the search space down to
indices 7 through 11:
55   68   72   81   98

Proceeding in a similar fashion, we chop off the second half of the search space and are left with:
55   68

Depending on how we choose the median of an even number of elements we will either find 55
in the next step or chop off 68 to get a search space of only one element. Either way, we
conclude that the index where the target value is located is 7.
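
The narrowing of the search space in this example can be traced with a short C++ sketch. It
assumes the 11-element sequence 0, 5, 13, 19, 22, 41, 55, 68, 72, 81, 98 with indices 1 through
11; the function name binary_search_trace and its intervals parameter are introduced here for
illustration only.

```cpp
#include <utility>
#include <vector>

// Iterative binary search over a sorted vector, using 1-based indices to
// match the worked example. Every (low, high) search interval visited is
// appended to `intervals` (a hypothetical tracing parameter).
// Returns the 1-based index of `key`, or 0 if it is absent.
int binary_search_trace(const std::vector<int>& a, int key,
                        std::vector<std::pair<int, int>>& intervals) {
    int low = 1;
    int high = static_cast<int>(a.size());
    while (low <= high) {
        intervals.emplace_back(low, high);
        int mid = low + (high - low) / 2;   // lower middle for even lengths
        int value = a[mid - 1];             // convert to 0-based storage
        if (value == key) return mid;
        if (value < key) low = mid + 1;     // discard the lower half
        else high = mid - 1;                // discard the upper half
    }
    return 0;  // not found
}
```

For the target 55, the recorded intervals are (1, 11), (7, 11) and (7, 8), and the function
returns 7. This version picks the lower middle of an even-sized interval, so 55 is found
directly in the third step.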

CONCLUSION
Thus we have studied and implemented the binary search algorithm.

FAQ
1. What is the divide and conquer strategy, and how is it used in binary search?
2. What is the complexity of binary search?
3. Can a binary search be performed on both sorted and unsorted lists? Justify.
4. In general, if L is a sorted list of size n, to determine whether an element is in L, the
binary search makes at most 2 * log2(n) + 2 key (item) comparisons. Justify.
5. Consider the following list.
int[] intList = {16, 30, 24, 7, 25, 62, 45, 5, 65, 50};
If intList above were sorted, what would be the middle element?
6. Consider the following list.
intList = {4, 18, 29, 35, 44, 59, 65, 98};
If intList were to be searched for the number 44 using a binary search, how many
key comparisons would have to be made?

Experiment No: 02
Title: Using Divide and Conquer Strategies
design a function for Concurrent Quick Sort
using C++.
Roll No:________
Batch:_____

Class:_____

Date of Performance: ___ /___/_____


Date of Assessment : ___ /___/_____

Particulars                  Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member

Experiment No 02

TITLE:- Using Divide and Conquer strategies, design a class for Concurrent Quick Sort using C++.
AIM:-

Implement a concurrent quick sort using the divide and conquer strategy

SOFTWARE REQUIRED:- Ubuntu.


PREREQUISITES:- Basic knowledge of concurrent C++ programming.
OBJECTIVES:

To understand the importance of divide and conquer strategies
To learn quick sort

THEORY:
Quick Sort:
QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the
given array around the picked pivot. There are many different versions of quickSort that pick
pivot in different ways.
1) Always pick first element as pivot.
2) Always pick last element as pivot
3) Pick a random element as pivot.
4) Pick median as pivot.
The key process in quickSort is partition(). Given an array and an element x of the array as
pivot, the goal of partition() is to put x at its correct position in the sorted array, put all
smaller elements (smaller than x) before x, and put all greater elements (greater than x) after
x. All this should be done in linear time.
Partition Algorithm:
There can be many ways to do partition. In one simple scheme, we start from the leftmost element
and keep track of the index i of the last element known to be smaller than (or equal to) the
pivot. While traversing, if we find a smaller element, we advance i and swap the current element
with the element at index i; otherwise we ignore the current element. The pseudocode below uses
a different scheme that takes the first element as pivot and scans the array from both ends.
partition(array, lower, upper)
{
    pivot is array[lower]
    while (true)
    {
        scan from right to left using index called RIGHT
            stop when an element that should be left of pivot is located
        scan from left to right using index called LEFT
            stop when an element that should be right of pivot is located
        if (RIGHT and LEFT cross)
        {
            pos = location where LEFT/RIGHT cross
            swap pivot and array[pos]
            // all values left of pivot are <= pivot
            // all values right of pivot are >= pivot
            return pos
        }
        swap array[RIGHT] and array[LEFT]
    }
}
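
The two-index partition scheme can be sketched in C++ as follows. This is one possible
reading of the pseudocode, not a reference implementation; the names partition_first_pivot and
quick_sort are chosen here for illustration, and the range a[lower..upper] is inclusive.

```cpp
#include <utility>
#include <vector>

// Partition a[lower..upper] (inclusive) around the pivot a[lower].
// Returns the final index of the pivot.
int partition_first_pivot(std::vector<int>& a, int lower, int upper) {
    int pivot = a[lower];
    int left = lower + 1;
    int right = upper;
    while (true) {
        // Scan from the right for an element that belongs left of the pivot.
        while (right > lower && a[right] >= pivot) --right;
        // Scan from the left for an element that belongs right of the pivot.
        while (left <= upper && a[left] <= pivot) ++left;
        if (left >= right) {                // the two indices have crossed
            std::swap(a[lower], a[right]);  // move the pivot to its final place
            return right;
        }
        std::swap(a[left], a[right]);       // both elements are on the wrong side
    }
}

// Recursively sort the parts on either side of the pivot.
void quick_sort(std::vector<int>& a, int lower, int upper) {
    if (lower >= upper) return;             // zero or one element: already sorted
    int pos = partition_first_pivot(a, lower, upper);
    quick_sort(a, lower, pos - 1);
    quick_sort(a, pos + 1, upper);
}
```

Calling quick_sort(a, 0, a.size() - 1) sorts the whole vector in place. Because the pivot is
always the first element, an already sorted input triggers the worst case.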

Example:

Time Complexity:
Best case complexity of quick sort is O(n log n).
Worst case complexity is O(n^2).
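These bounds follow from the recurrences for the two extreme partition shapes. As a sketch,
assume partitioning n elements costs cn for some constant c (introduced here for illustration)
and T(0) = 0:

```latex
\text{Best case (pivot splits the array in half):}\quad
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + cn \;\Longrightarrow\; T(n) = O(n \log n)

\text{Worst case (pivot is the smallest or largest element):}\quad
T(n) = T(n-1) + cn \;\Longrightarrow\; T(n) = c\,\frac{n(n+1)}{2} = O(n^2)
```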
Concurrent Quick sort:
Quicksort can be parallelized in a variety of ways. In the context of recursive decomposition,
during each call of QUICKSORT the array is partitioned into two parts and each part is solved
recursively. Sorting the two smaller arrays represents two completely independent subproblems
that can be solved in parallel. Therefore, one way to parallelize quicksort is to execute it
initially on a single process; then, when the algorithm performs its recursive calls, assign one
of the subproblems to one process and the other to another process. Each of these processes then
sorts its array by using quicksort. The algorithm terminates when the arrays cannot be further
partitioned. Upon termination, each process holds an element of the array, and the sorted order
can be recovered by traversing the processes.
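
A minimal C++ sketch of this idea is shown below. It uses threads (via std::async) rather than
separate processes, and it limits how many levels of recursion spawn new tasks through a depth
parameter; both choices, along with the name concurrent_quick_sort, are assumptions made here
for illustration. With GCC, threading support may need the -pthread option.

```cpp
#include <algorithm>
#include <future>
#include <vector>

// Sort a[lower..upper] (inclusive). While `depth` > 0, the left part is
// handed to a new asynchronous task and the right part is sorted by the
// current thread; below that, both parts are sorted sequentially.
void concurrent_quick_sort(std::vector<int>& a, int lower, int upper, int depth) {
    if (lower >= upper) return;
    // Partition with the last element as pivot.
    int pivot = a[upper];
    int i = lower;
    for (int j = lower; j < upper; ++j) {
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    }
    std::swap(a[i], a[upper]);   // the pivot is now at its final index i

    if (depth > 0) {
        // The two parts do not overlap, so they can be sorted in parallel.
        std::future<void> left = std::async(std::launch::async,
            [&a, lower, i, depth] { concurrent_quick_sort(a, lower, i - 1, depth - 1); });
        concurrent_quick_sort(a, i + 1, upper, depth - 1);
        left.get();              // wait for the left part before returning
    } else {
        concurrent_quick_sort(a, lower, i - 1, 0);
        concurrent_quick_sort(a, i + 1, upper, 0);
    }
}
```

A depth of d allows up to 2^d parts to be sorted at the same time. Capping the depth keeps the
number of threads bounded instead of creating one task per recursive call.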

CONCLUSION
Thus we have studied and implemented concurrent Quick sort.

FAQ
1. What is concurrent programming? How do we do parallel programming in C++?
2. What is the time complexity of the quick sort algorithm?
3. What is the worst-case behavior (number of comparisons) for quick sort?
4. Quick sort uses the divide and conquer technique. Justify.
5. What is the output of quick sort after the 2nd iteration given the following sequence of
numbers: 65 70 75 80 85 60 55 50 45 (Ans: 55 45 50 60 65 70 80 75 85)

Experiment No: 10
Title: Implement Apriori approach for data
mining to organize the data items on a shelf
using table of items purchased in a Mall
Roll No:________
Batch:_____

Class:_____

Date of Performance: ___ /___/_____


Date of Assessment : ___ /___/_____

Particulars                  Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member

Experiment No 10

TITLE:- Implement Apriori approach for data mining to organize the data items on a shelf using
the following table of items purchased in a Mall

AIM:-

Implement Apriori approach for data mining

SOFTWARE REQUIRED:- Windows Operating Systems,Java


PREREQUISITES:- Basic knowledge of Java programming.
OBJECTIVES:

To understand the importance of the Apriori algorithm for finding frequent itemsets
To learn mining association rules

THEORY:
Apriori Algorithm:
General Process:
Association rule generation is usually split up into two separate steps:
1. First, minimum support is applied to find all frequent itemsets in a database.
2. Second, these frequent itemsets and the minimum confidence constraint are used to form
rules.
While the second step is straightforward, the first step needs more attention. Finding all
frequent itemsets in a database is difficult since it involves searching all possible itemsets
(item combinations). The set of possible itemsets is the power set over I and has size 2^n - 1
(excluding the empty set, which is not a valid itemset). Although the size of the power set grows
exponentially in the number of items n in I, efficient search is possible using the
downward-closure property of support (also called anti-monotonicity), which guarantees that for
a frequent itemset, all its subsets are also frequent, and thus for an infrequent itemset, all
its supersets must also be infrequent. Exploiting this property, efficient algorithms (e.g.,
Apriori) can find all frequent itemsets.

Apriori Algorithm Pseudocode:


procedure Apriori(T, minSupport)
{   // T is the database and minSupport is the minimum support
    L1 = {frequent items};
    for (k = 2; L(k-1) != ∅; k++)
    {
        Ck = candidates generated from L(k-1)
             // that is, the cartesian product L(k-1) x L(k-1), eliminating any
             // candidate that contains a (k-1)-size itemset that is not frequent
        for each transaction t in database do
        {
            increment the count of all candidates in Ck that are contained in t
        } // end for each
        Lk = candidates in Ck with minSupport
    } // end for
    return Uk Lk;   // union of all the frequent itemsets Lk
}
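
The pseudocode above can be sketched in C++ as follows. This is an illustrative version under
stated assumptions, not a tuned implementation: items are strings, each transaction is a set (so
an item repeated within one transaction counts once), and the names Itemset, support and apriori
are chosen here. Candidates are formed by a plain union of two frequent (k-1)-itemsets followed
by the subset-based pruning step.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Itemset = std::set<std::string>;

// Number of transactions that contain every item of `candidate`.
int support(const std::vector<Itemset>& db, const Itemset& candidate) {
    int count = 0;
    for (const Itemset& t : db) {
        bool contained = true;
        for (const std::string& item : candidate) {
            if (t.count(item) == 0) { contained = false; break; }
        }
        if (contained) ++count;
    }
    return count;
}

// Level-wise search: returns every frequent itemset with its support.
std::map<Itemset, int> apriori(const std::vector<Itemset>& db, int min_support) {
    std::map<Itemset, int> frequent;
    std::set<Itemset> level;   // L(k-1), starting with the frequent single items
    for (const Itemset& t : db) {
        for (const std::string& item : t) {
            Itemset single = {item};
            int s = support(db, single);
            if (s >= min_support) { level.insert(single); frequent[single] = s; }
        }
    }
    for (std::size_t k = 2; !level.empty(); ++k) {
        // Ck: unions of two frequent (k-1)-itemsets that have exactly k items.
        std::set<Itemset> candidates;
        for (const Itemset& a : level) {
            for (const Itemset& b : level) {
                Itemset joined = a;
                joined.insert(b.begin(), b.end());
                if (joined.size() != k) continue;
                // Prune: every (k-1)-subset must itself be frequent.
                bool all_frequent = true;
                for (const std::string& item : joined) {
                    Itemset subset = joined;
                    subset.erase(item);
                    if (level.count(subset) == 0) { all_frequent = false; break; }
                }
                if (all_frequent) candidates.insert(joined);
            }
        }
        // Lk: candidates that meet the minimum support.
        std::set<Itemset> next;
        for (const Itemset& c : candidates) {
            int s = support(db, c);
            if (s >= min_support) { next.insert(c); frequent[c] = s; }
        }
        level = next;
    }
    return frequent;
}
```

The returned map holds the union of all levels L1, L2, and so on, keyed by itemset. Because of
the pruning step, a candidate is only counted against the database when all of its smaller
subsets are already known to be frequent.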
Example:
Suppose you have records of a large number of transactions at a shopping center.

Organizing the data items on a shelf means finding the items that are purchased together more
frequently than others. Apriori is the classic and probably the most basic algorithm to do it.
Now, we follow a simple golden rule: we say an item or itemset is frequently bought if it is
bought in at least 60% of the transactions. With the 5 transactions here, that gives a Minimum
Support of 3, so it should be bought at least 3 times.
For simplicity, let M = Mango, O = Onion, J = Jar, K = Key-chain, E = Egg, C = Chocolate,
Co = Corn, A = Apple, Kn = Knife, and so on. So the table becomes:
Original table:

Transaction ID    Items Bought
T1                {M, O, J, K, E, C}
T2                {N, O, J, K, E, C}
T3                {M, A, K, E}
T4                {M, T, Co, K, C}
T5                {Co, O, O, K, Kn, E}


Step 1: Count the number of transactions in which each item occurs. Note that O = Onion is
bought 4 times in total, but it occurs in just 3 transactions.

Item (Candidate Set C1)    No. of transactions (Support)
M                          3
O                          3
J                          2
K                          5
E                          4
C                          3
N                          1
A                          1
T                          1
Kn                         1
Co                         2

Step 2: We said an item is frequently bought if it is bought at least 3 times. So in this
step we remove all the items that are bought less than 3 times from the above table and we are
left with

Item (Frequent Item Sets L1)    Number of transactions (Support)
M                               3
O                               3
K                               5
E                               4
C                               3

These are the single items that are bought frequently. Now let's say we want to find a pair of
items that are bought frequently. We continue from the above table (table in step 2).

Step 3: We start making pairs from the first item, like MO,MK,ME,MC and then we start with
the second item like OK,OE,OC. We did not do OM because we already did MO when we were
making pairs with M and buying a Mango and Onion together is same as buying Onion and
Mango together. After making all the pairs we get,

Item pairs
MO
MK
ME
MC
OK
OE
OC
KE
KC
EC
Step 4: Now we count how many times each pair is bought together. For example, M and O are
bought together only in {M, O, J, K, E, C},
while M and K are bought together 3 times, in {M, O, J, K, E, C}, {M, A, K, E} and
{M, T, Co, K, C}. After doing that for all the pairs we get

Item Pairs (Candidate Set C2)    Number of transactions (Support)
MO                               1
MK                               3
ME                               2
MC                               2
OK                               3
OE                               3
OC                               2
KE                               4
KC                               3
EC                               2


Step 5: Golden rule to the rescue. Remove all the item pairs with number of transactions less
than three and we are left with

Item Pairs (Frequent Item Sets L2)    Number of transactions (Support)
MK                                    3
OK                                    3
OE                                    3
KE                                    4
KC                                    3

These are the pairs of items frequently bought together.

Now let's say we want to find a set of three items that are bought together. We use the above
table (table in step 5) and make a set of 3 items.

Step 6: To make the set of three items we need one more rule (it is termed self-join).
It simply means that, from the item pairs in the above table, we find two pairs with the same
first letter, so we get
OK and OE, this gives OKE
KE and KC, this gives KEC
Then we find how many times O, K, E are bought together in the original table, and the same for
K, E, C, and we get the following table

Item Set (Candidate Set C3)    Number of transactions (Support)
OKE                            3
KEC                            2

While we are on this, suppose you have sets of 3 items, say ABC, ABD, ACD, ACE, BCD, and
you want to generate item sets of 4 items. You look for two sets having the same first two
letters.

ABC and ABD -> ABCD

ACD and ACE -> ACDE

And so on. In general, you have to look for sets in which only the last letter/item differs.
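
The self-join rule can be sketched in C++ as follows. The function name self_join is introduced
here for illustration, and each item set is assumed to be stored as a list of item names in a
consistent order, so that "same first letters" becomes "same leading items".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join step: combine two k-item sets that share their first k-1 items
// into one (k+1)-item candidate.
std::vector<std::vector<std::string>> self_join(
        const std::vector<std::vector<std::string>>& sets) {
    std::vector<std::vector<std::string>> candidates;
    for (std::size_t i = 0; i < sets.size(); ++i) {
        for (std::size_t j = i + 1; j < sets.size(); ++j) {
            const std::vector<std::string>& a = sets[i];
            const std::vector<std::string>& b = sets[j];
            if (a.size() != b.size() || a.empty()) continue;
            // Same prefix: everything except the last item must match.
            bool same_prefix = true;
            for (std::size_t p = 0; p + 1 < a.size(); ++p) {
                if (a[p] != b[p]) { same_prefix = false; break; }
            }
            if (!same_prefix || a.back() == b.back()) continue;
            std::vector<std::string> joined = a;
            joined.push_back(b.back());   // append the one differing item
            candidates.push_back(joined);
        }
    }
    return candidates;
}
```

Applied to ABC, ABD, ACD, ACE, BCD this yields ABCD and ACDE; applied to the pairs MK, OK, OE,
KE, KC it yields OKE and KEC.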

Step 7: So we again apply the golden rule, that is, the item set must be bought together at
least 3 times, which leaves us with just OKE, since K, E, C are bought together just two times.
Thus the set of three items that are bought together most frequently is:
Frequent Item Set L3 = {O, K, E}.

CONCLUSION
Thus we have successfully implemented the Apriori approach for data mining to organize the data
items on a shelf using the table of items purchased in a Mall.

FAQ
1. Define frequent sets, confidence, support and association rule.
2. Explain whether association rule mining is a supervised or unsupervised type of
learning.
3. What is an association rule?
4. Define support and confidence.
5. What is the purpose of the Apriori algorithm?
6. Consider the data set D. Given a minimum support of 2, apply the Apriori algorithm on
this dataset.

Transaction ID    Items
100               A, C, D
200               B, C, E
300               A, B, C, E
400               B, E


Experiment No: 12
Title: Implementation of K-NN approach with a
suitable example

Roll No:________
Batch:_____

Class:_____

Date of Performance: ___ /___/_____


Date of Assessment : ___ /___/_____

Particulars                  Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member

Experiment No 12

TITLE:- Implementation of K-NN approach with a suitable example.
AIM:-

Implement a K-NN approach

SOFTWARE REQUIRED:- Windows Operating Systems, JAVA


PREREQUISITES:- Basic knowledge of Java programming.
OBJECTIVES:

To understand the importance of KNN in classification of data
To learn KNN classification

THEORY:
K-NN Approach:
K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases
based on a similarity measure (e.g., distance functions). KNN has been used in statistical
estimation and pattern recognition. Nearest-neighbor classifiers are based on learning by
analogy, that is, by comparing a given test tuple with training tuples that are similar to it. The
training tuples are described by n attributes. Each tuple represents a point in an n-dimensional
space. In this way, all of the training tuples are stored in an n-dimensional pattern space. When
given an unknown tuple, a k-nearest-neighbor classifier searches the pattern space for the k
training tuples that are closest to the unknown tuple. These k training tuples are the k nearest
neighbors of the unknown tuple.
Closeness is defined in terms of a distance metric, such as Euclidean distance.
The Euclidean distance between two points or tuples, say, X1 = (x11, x12, ..., x1n) and
X2 = (x21, x22, ..., x2n), is

$$\mathrm{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2}$$

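The Euclidean distance between two tuples can be computed with a few lines of C++. This is a
minimal sketch; the function name euclidean_distance is chosen here, and both tuples are assumed
to have the same number of attributes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean distance between two points with the same number of attributes:
// the square root of the sum of squared attribute differences.
double euclidean_distance(const std::vector<double>& x1,
                          const std::vector<double>& x2) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x1.size() && i < x2.size(); ++i) {
        double diff = x1[i] - x2[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}
```
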

Algorithm:
1. Determine the parameter K, the number of nearest neighbors, beforehand. The choice of this
value is up to you.
2. Calculate the distance between the query instance and all the training samples. You can
use any distance measure.
3. Sort the distances for all the training samples and determine the nearest neighbors based
on the K-th minimum distance.
4. Since this is supervised learning, gather the categories of the training samples that fall
within the K nearest.
5. Use the majority category of the nearest neighbors as the prediction value.
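
The five steps can be sketched in C++ as follows. This is an illustrative version, not a
reference implementation: the Sample struct and the knn_classify function are names introduced
here, squared Euclidean distance is used for ranking because it orders neighbors the same way as
the true distance, and a tie in the vote goes to the alphabetically first label.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Sample {
    std::vector<double> features;   // attribute values of one training tuple
    std::string label;              // its known category
};

// Classify `query` by majority vote among its k nearest training samples.
std::string knn_classify(const std::vector<Sample>& training,
                         const std::vector<double>& query, std::size_t k) {
    // Step 2: squared distance from the query to every training sample.
    std::vector<std::pair<double, std::string>> ranked;
    for (const Sample& s : training) {
        double sum = 0.0;
        for (std::size_t i = 0; i < query.size() && i < s.features.size(); ++i) {
            double diff = s.features[i] - query[i];
            sum += diff * diff;
        }
        ranked.emplace_back(sum, s.label);
    }
    // Step 3: sort by distance so the k closest samples come first.
    std::sort(ranked.begin(), ranked.end());
    if (k > ranked.size()) k = ranked.size();
    // Step 4: count the categories of the k nearest neighbors.
    std::map<std::string, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[ranked[i].second];
    // Step 5: return the category with the most votes.
    std::string best;
    int best_count = 0;
    for (const auto& v : votes) {
        if (v.second > best_count) { best = v.first; best_count = v.second; }
    }
    return best;
}
```

Choosing an odd K for a two-class problem avoids ties in the vote.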
Example:
An implementation of KNN that uses Euclidean distance to classify whether an entry is male or
female based on height and weight. The training data is:

Height    Weight    Class
175       80        Male
193.5     110       Male
163       110       Female
160       60        Female

1. Determine the parameter: K = 3. New data: Height = 170, Weight = 60, Class = ?

2. Calculate the distance between the query instance and all the training samples. The squared
Euclidean distance is listed, since it gives the same ranking as the distance itself.

Height    Weight    Squared distance to the query (170, 60)
175       80        (175-170)^2 + (80-60)^2 = 425
193.5     110       (193.5-170)^2 + (110-60)^2 = 3052.25
163       110       (163-170)^2 + (110-60)^2 = 2549
160       60        (160-170)^2 + (60-60)^2 = 100


3. Sort the distances and determine the nearest neighbors based on the K-th minimum distance

Height    Weight    Squared distance    Rank    Included in 3 nearest neighbors?
175       80        425                 2       Yes
193.5     110       3052.25             4       No
163       110       2549                3       Yes
160       60        100                 1       Yes


4. Gather the category Y = Class of the nearest neighbors

Height    Weight    Squared distance    Rank    Included in 3 nearest neighbors?    Category of the nearest neighbor
175       80        425                 2       Yes                                 Male
193.5     110       3052.25             4       No                                  Male
163       110       2549                3       Yes                                 Female
160       60        100                 1       Yes                                 Female


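5. Use the simple majority of the categories of the nearest neighbors as the prediction value of
the query instance. The three nearest neighbors are one Male and two Female, so the majority
category is Female.
6. Output: for our query Height = 170 and Weight = 60, Class = Female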

CONCLUSION
Thus we have successfully implemented the KNN approach for classifying data.

FAQ
1. Define the concept of classification.
2. What is a decision tree?
3. What is an attribute selection measure?
4. Describe tree pruning methods.
5. Explain the data mining functionalities.
6. Classification is supervised learning. Justify.
7. Explain different classification techniques.
