Gowtham

Copyright: Attribution Non-Commercial (BY-NC)


Uploaded by: Shiva Ram

Seminar by Shiva Rama Krishna.

Introduction to data mining
An example of frequent patterns, associations, and correlations
An example of cluster analysis

Data mining is a fairly new concept that emerged in the late 1980s. It soon attracted strong research interest and flourished throughout the 1990s, when many new and remarkable techniques were discovered.

Data mining is the process of extracting useful information and knowledge from huge amounts of data.

Data mining has attracted a great deal of attention in the information industry and in society as a whole in recent years, due to the wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge.

Data sources for data mining include:
Relational databases,
Transactional databases,
Advanced database systems,
Flat files,
Data streams, and
The World Wide Web.

Advanced database systems include object-relational databases and specific application-oriented databases, such as spatial databases, time-series databases, text databases, and multimedia databases.

What are frequent patterns?

Example: transactional data of customers at a shop

Frequent patterns: These are the patterns that appear frequently in a data set.

For example, a set of items such as A = apple, B = bread, C = cheese, D = drink, E = eggs that appear together frequently in a transaction data set is a frequent itemset.

Example (worked through the Apriori algorithm):

TID     List of item_IDs
T100    A, B, E
T200    B, D
T300    B, C
T400    A, B, D
T500    A, C
T600    B, C
T700    A, C
T800    A, B, C, E
T900    A, B, C

Take the minimum support count as 2.

Scan D for the count of each candidate.

C1
Item set    Sup. count
{A}         6
{B}         7
{C}         6
{D}         2
{E}         2

Compare each candidate's support count with the minimum support count.

L1
Item set    Sup. count
{A}         6
{B}         7
{C}         6
{D}         2
{E}         2
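The first scan and the comparison against the minimum support count can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names (`transactions`, `c1`, `l1`, `min_support`) are assumptions made here and do not come from the slides.

```python
from collections import Counter

# Transactions from the example table (TID -> set of item IDs).
transactions = {
    "T100": {"A", "B", "E"},
    "T200": {"B", "D"},
    "T300": {"B", "C"},
    "T400": {"A", "B", "D"},
    "T500": {"A", "C"},
    "T600": {"B", "C"},
    "T700": {"A", "C"},
    "T800": {"A", "B", "C", "E"},
    "T900": {"A", "B", "C"},
}

min_support = 2  # minimum support count used in the example

# C1: scan the data set once and count every single item.
c1 = Counter()
for items in transactions.values():
    c1.update(items)

# L1: keep only the items whose count reaches the minimum support count.
l1 = {item: count for item, count in c1.items() if count >= min_support}

for item in sorted(l1):
    print(item, l1[item])
```

With a minimum support count of 2, every single item survives, so L1 has the same entries as C1.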
Generate C2 candidates from L1.

C2
Item set
{A, B}
{A, C}
{A, D}
{A, E}
{B, C}
{B, D}
{B, E}
{C, D}
{C, E}
{D, E}
Scan D for the count of each candidate.

C2
Item set    Sup. count
{A, B}      4
{A, C}      4
{A, D}      1
{A, E}      2
{B, C}      4
{B, D}      2
{B, E}      2
{C, D}      0
{C, E}      1
{D, E}      0
Compare each candidate's support count with the minimum support count.

L2
Item set    Sup. count
{A, B}      4
{A, C}      4
{A, E}      2
{B, C}      4
{B, D}      2
{B, E}      2

Generate C3 candidates from L2.

C3
Item set
{A, B, C}
{A, B, E}
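Only two candidates remain in C3 because every 2-item subset of a candidate must itself be in L2. The sketch below illustrates that join-and-prune step under stated assumptions: the function name `generate_candidates` is hypothetical, and the join is a simplified union of pairs rather than the prefix-based join used in textbook descriptions of Apriori. After pruning, both give the same candidates for this example.

```python
from itertools import combinations

def generate_candidates(frequent_k, k):
    """Join frequent k-itemsets into (k+1)-itemsets, then prune.

    A candidate is kept only if all of its k-item subsets are frequent.
    """
    frequent = set(frequent_k)
    candidates = set()
    for a, b in combinations(frequent, 2):
        union = a | b
        if len(union) != k + 1:
            continue  # the two itemsets differ in more than one item
        if all(frozenset(sub) in frequent for sub in combinations(union, k)):
            candidates.add(union)
    return candidates

# L2 from the example: the frequent 2-itemsets.
l2 = [frozenset(pair) for pair in ("AB", "AC", "AE", "BC", "BD", "BE")]

c3 = generate_candidates(l2, 2)
print(sorted("".join(sorted(c)) for c in c3))
```

For instance, {A, C, E} is dropped because {C, E} is not in L2, and {B, C, D} is dropped because {C, D} is not in L2.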

Scan D for the count of each candidate.

C3
Item set     Sup. count
{A, B, C}    2
{A, B, E}    2

Compare each candidate's support count with the minimum support count.

L3
Item set     Sup. count
{A, B, C}    2
{A, B, E}    2
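The whole level-wise procedure (scan, compare, generate the next candidates, repeat until no candidates remain) can be sketched as one loop. This is a minimal sketch rather than an optimized implementation: the names `apriori` and `TRANSACTIONS` are assumptions made here, and every level rescans the full transaction list.

```python
from itertools import combinations

# The nine transactions T100..T900 from the example.
TRANSACTIONS = [
    {"A", "B", "E"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"}, {"A", "C"},
    {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"}, {"A", "B", "C"},
]

def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}  # C1
    frequent = {}
    k = 1
    while candidates:
        # Scan the data set for the support count of each candidate (Ck).
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Compare with the minimum support count to obtain Lk.
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Generate C(k+1) from Lk: join pairs, prune by frequent subsets.
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)
        k += 1
    return frequent

result = apriori(TRANSACTIONS, 2)
for itemset, count in sorted(
    result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), count)
```

On the example data the loop stops after the third level: the only possible 4-item candidate, {A, B, C, E}, is pruned because {A, C, E} is not in L3.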

What is cluster analysis?

Finding the dissimilarity between two objects described by binary variables, with an example

Clustering:
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. A cluster is a collection of data objects that are similar to one another within the same cluster and are dissimilar to the objects in other clusters.
Cluster analysis has been widely used in numerous applications, including market research, pattern recognition, data analysis, and image processing. In business, clustering can help marketers discover distinct groups of customers based on their purchasing patterns.

Analysis of dissimilarity through Cluster analysis:

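                   Object j
                   1       0       sum
Object i    1      q       r       q+r
            0      s       t       s+t
            sum    q+s     r+t

q is the number of variables that equal 1 for both objects, r is the number that equal 1 for object i but 0 for object j, s is the number that equal 0 for object i but 1 for object j, and t is the number that equal 0 for both objects.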

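A contingency table for binary variables.

Formula for calculating the dissimilarity between i and j when both states are equally important (symmetric binary variables):
d(i, j) = (r+s)/(q+r+s+t)

When the two states are not equally important (asymmetric binary variables), the number of negative matches t is ignored:
d(i, j) = (r+s)/(q+r+s)
sim(i, j) = q/(q+r+s) = 1 - d(i, j)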
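The contingency counts and the resulting measures can be sketched as small Python functions. The function names and the two sample vectors `x` and `y` are assumptions chosen for illustration. The variant that ignores t divides by q+r+s, so it is undefined when two objects have no 1 in any position; the sketch does not handle that case.

```python
def contingency(x, y):
    """Return (q, r, s, t) for two equal-length 0/1 vectors."""
    q = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)  # both 1
    r = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)  # 1 in x, 0 in y
    s = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)  # 0 in x, 1 in y
    t = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)  # both 0
    return q, r, s, t

def dissimilarity_with_t(x, y):
    """d(i, j) = (r+s)/(q+r+s+t): negative matches count."""
    q, r, s, t = contingency(x, y)
    return (r + s) / (q + r + s + t)

def dissimilarity_without_t(x, y):
    """d(i, j) = (r+s)/(q+r+s): negative matches t are ignored."""
    q, r, s, _ = contingency(x, y)
    return (r + s) / (q + r + s)

def similarity_without_t(x, y):
    """sim(i, j) = q/(q+r+s), which equals 1 - d(i, j) without t."""
    q, r, s, _ = contingency(x, y)
    return q / (q + r + s)

# Hypothetical vectors, one position for each of the four cases.
x = [1, 0, 1, 0]
y = [1, 1, 0, 0]
print(contingency(x, y))
print(dissimilarity_with_t(x, y))
print(dissimilarity_without_t(x, y), similarity_without_t(x, y))
```

Because the sample vectors contain exactly one position of each kind, q = r = s = t = 1, which makes the difference between the two denominators easy to see.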

A relational table where patients are described by binary attributes:

Name    gender    fever    cough    test-1    test-2    test-3    test-4
Jack    M         P        N        P         N         N         N
Mary    F         P        N        P         N         P         N
Jim               P        P        N         N         N         N

Here P (positive) is set to 1 and N (negative) is set to 0.

SYMPTOMS    JACK    MARY
FEVER       1       1
COUGH       0       0
TEST1       1       1
TEST2       0       0
TEST3       0       1
TEST4       0       0

SYMPTOMS    JACK    JIM
FEVER       1       1
COUGH       0       1
TEST1       1       0
TEST2       0       0
TEST3       0       0
TEST4       0       0

SYMPTOMS    MARY    JIM
FEVER       1       1
COUGH       0       1
TEST1       1       0
TEST2       0       0
TEST3       1       0
TEST4       0       0

Calculation:

d(Jack, Mary) = (0+1)/(2+0+1) = 0.33
d(Jack, Jim) = (1+1)/(1+1+1) = 0.67
d(Mary, Jim) = (2+1)/(1+2+1) = 0.75
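The three distances can be checked with a short script. This is a sketch under the coding used in the example (P = 1, N = 0, attributes in the order fever, cough, test-1 to test-4); the names `patients`, `distance`, and `distances` are assumptions made here. It uses the form of the formula that ignores t, as the calculation does.

```python
from itertools import combinations

# Binary vectors in the order: fever, cough, test-1, test-2, test-3, test-4.
patients = {
    "Jack": [1, 0, 1, 0, 0, 0],
    "Mary": [1, 0, 1, 0, 1, 0],
    "Jim":  [1, 1, 0, 0, 0, 0],
}

def distance(x, y):
    """d = (r+s)/(q+r+s); undefined if neither vector contains a 1."""
    q = sum(a and b for a, b in zip(x, y))            # positions where both are 1
    mismatches = sum(a != b for a, b in zip(x, y))    # r + s
    return mismatches / (q + mismatches)

# Compare every pair of patients once.
distances = {
    (a, b): round(distance(patients[a], patients[b]), 2)
    for a, b in combinations(patients, 2)
}

for pair, d in distances.items():
    print(pair, d)
```

The smallest value belongs to the pair Jack and Mary, and the largest to Mary and Jim.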
