Name: Shan Parvej Gayen
Roll No: 069123
Department of M.Tech CSE, Heritage Institute of Technology
5 April 2012
LEARNING OBJECTIVES
Understand the concept of Spatial Data Mining (SDM)
Describe the concepts of patterns and SDM
Describe the motivation for SDM
Learn about patterns explored by SDM
Learn techniques for finding spatial patterns
Modern Examples
Cancer clusters to investigate environmental health hazards
Crime hot spots for planning police patrol routes
Bald eagles nest on tall trees near open water
West Nile virus spreading from the northeastern USA to the south and west
Unusual warming of the Pacific Ocean (El Niño) affects weather in the USA
What is a Pattern?
What a pattern is not: random, haphazard, chance, stray, accidental, unexpected; without definite direction, trend, rule, method, design, aim, purpose.
What a pattern is: a frequent arrangement, configuration, composition, regularity; a rule, law, method, design, description; a major direction, trend, prediction.
Large search space of plausible hypotheses
Ex. Asiatic cholera: candidate causes were water, food, air, insects.
Ex. Shutting off the identified water pump saved human lives.
Ex. The connection between the water pump and cholera led to the germ theory.
Finding the neighbors of Canada given the names and boundaries of all countries (search space is not large)
Heavy rainfall in Minneapolis is correlated with heavy rainfall in St. Paul (10 miles apart). This is common knowledge: nearby places have similar rainfall.
Diaper sales and beer sales are correlated in the evenings
Location Prediction
Spatial Interactions
Hot Spots
LOCATION PREDICTION
Where will a phenomenon occur?
Which spatial events are predictable?
How can a spatial event be predicted from other spatial events?
Examples
Where will an endangered bird nest?
Which areas are prone to fire given maps of vegetation and drought?
What should be recommended to a traveler in a given location?
SPATIAL INTERACTIONS
Which spatial events are related to each other?
Which spatial phenomena depend on other phenomena?
Examples
Earth science:
Epidemiology:
HOT SPOTS
Is a phenomenon spatially clustered?
Which spatial entities are unusual or share common characteristics?
Examples
Crime hot spots to plan police patrols
Cancer clusters to launch investigations
SPATIAL QUERIES
Range Queries
Find all cities within 50 miles of Paris
Query has an associated region (location, boundary)
Answer includes overlapping or contained data regions
Nearest-Neighbor Queries
Find the 10 cities nearest to Paris
Results must be ordered by proximity
Spatial Join Queries
Find all cities near a lake
Join condition involves regions and proximity
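The three query types above can be sketched in a few lines. This is only an illustration under simplifying assumptions: cities and lakes are treated as points rather than regions, coordinates are planar and measured in miles, every query is a linear scan instead of a spatial index lookup, and all names and positions are hypothetical.

```python
import math

# Hypothetical planar coordinates in miles; names and positions are made up.
cities = {
    "A": (0.0, 0.0),
    "B": (30.0, 40.0),
    "C": (60.0, 80.0),
    "D": (10.0, 0.0),
}
lakes = {"Lake X": (58.0, 80.0)}


def distance(p, q):
    """Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def range_query(points, center, radius):
    """Range query: names of all points within `radius` of `center`."""
    return sorted(n for n, p in points.items() if distance(p, center) <= radius)


def nearest_neighbors(points, center, k):
    """Nearest-neighbor query: the k closest names, ordered by proximity."""
    return sorted(points, key=lambda n: distance(points[n], center))[:k]


def near_join(left, right, radius):
    """Spatial join: (left, right) name pairs lying within `radius` of each other."""
    return [
        (l, r)
        for l, lp in left.items()
        for r, rp in right.items()
        if distance(lp, rp) <= radius
    ]
```

For example, `range_query(cities, (0.0, 0.0), 50.0)` scans every city, while a real spatial database would typically answer the same query through an index.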
Items in traditional data are independent of each other, whereas properties of locations in a map are often autocorrelated (patterns exist)
Traditional data deals with simple domains, e.g. numbers and symbols, whereas spatial data types are complex
Items in traditional data describe discrete objects, whereas spatial data is continuous
ASSOCIATION RULES
For a rule X => Y:
Support = how often the rule shows up in the database, stated here as a fraction of the records
Confidence = conditional probability of Y given X
Example
(Bedrock type = limestone), (soil depth < 50 ft) => (sinkhole risk = high)
Support = 20%, confidence = 0.8
Interpretation: Locations with limestone bedrock and shallow soil have a high risk of sinkhole formation.
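A minimal sketch of how support and confidence could be computed for the sinkhole rule. The records, field names and helper functions are hypothetical, so the resulting numbers differ from the 20% and 0.8 quoted on the slide; support is taken as the fraction of records that satisfy both sides of the rule.

```python
# Hypothetical location records; the attribute values are made up for illustration.
records = [
    {"bedrock": "limestone", "soil_depth_ft": 30, "sinkhole_risk": "high"},
    {"bedrock": "limestone", "soil_depth_ft": 40, "sinkhole_risk": "high"},
    {"bedrock": "limestone", "soil_depth_ft": 20, "sinkhole_risk": "low"},
    {"bedrock": "granite", "soil_depth_ft": 30, "sinkhole_risk": "low"},
    {"bedrock": "limestone", "soil_depth_ft": 80, "sinkhole_risk": "low"},
]


def antecedent(record):
    """X: limestone bedrock and soil depth under 50 ft."""
    return record["bedrock"] == "limestone" and record["soil_depth_ft"] < 50


def consequent(record):
    """Y: high sinkhole risk."""
    return record["sinkhole_risk"] == "high"


def support_and_confidence(records, x, y):
    """Return (support, confidence) of the rule X => Y over `records`."""
    n_x = sum(1 for r in records if x(r))
    n_xy = sum(1 for r in records if x(r) and y(r))
    support = n_xy / len(records)               # fraction of records with X and Y
    confidence = n_xy / n_x if n_x else 0.0     # estimate of P(Y | X)
    return support, confidence


support, confidence = support_and_confidence(records, antecedent, consequent)
```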
Key challenge
Key assumption
Key insight
Classical method
Association rules given item types and transactions
Assumes spatial data can be decomposed into transactions
CO-LOCATION RULES
For point data in space
Do not need transactions; work directly with continuous space
Use a neighborhood definition and spatial joins
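One way to picture the neighborhood-plus-spatial-join idea is the sketch below. It assumes a neighborhood means "within a fixed radius", joins instances of two feature types on that condition, and reports the fraction of instances of the first type that have a neighbor of the second type. The point data, feature names and function names are hypothetical, and this is not necessarily the measure used in any particular co-location mining algorithm.

```python
import math

# Hypothetical point instances: (feature type, x, y).
points = [
    ("nest", 0.0, 0.0),
    ("nest", 5.0, 5.0),
    ("nest", 20.0, 20.0),
    ("water", 1.0, 0.0),
    ("water", 5.0, 6.0),
]


def neighbor_pairs(points, type_a, type_b, radius):
    """Spatial join: all (a, b) instance pairs lying within `radius` of each other."""
    a_instances = [p for p in points if p[0] == type_a]
    b_instances = [p for p in points if p[0] == type_b]
    return [
        (a, b)
        for a in a_instances
        for b in b_instances
        if math.hypot(a[1] - b[1], a[2] - b[2]) <= radius
    ]


def fraction_with_neighbor(points, type_a, type_b, radius):
    """Fraction of type_a instances that have at least one type_b neighbor."""
    a_instances = [p for p in points if p[0] == type_a]
    if not a_instances:
        return 0.0
    joined = {a for a, _ in neighbor_pairs(points, type_a, type_b, radius)}
    return len(joined) / len(a_instances)
```

No transactions are built at any point; the join works directly on coordinates in continuous space.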
CO-LOCATION RULES
Co-location
CLUSTERING
Spatial view: rows in a database = points in a multidimensional space
Visualization may reveal interesting groups
CLUSTERING
Hierarchical
All points in one cluster
Split and merge until a stop criterion is reached
Partitional
Start with random central points
Assign points to the nearest central point
Update the central points
Approach with statistical rigor
Density
Find clusters based on density of regions
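The partitional steps (start from central points, assign each point to the nearest one, update the central points) match the familiar k-means procedure. The sketch below is one possible reading of those steps, not the presenter's implementation. As an assumption made for reproducibility, the starting centers are passed in by the caller instead of being drawn at random inside the function; `kmeans` and its parameters are illustrative names.

```python
import math


def kmeans(points, initial_centers, max_iterations=100):
    """Partitional clustering sketch: returns (centers, assignment)."""
    centers = [tuple(c) for c in initial_centers]
    assignment = None
    for _ in range(max_iterations):
        # Assign every point to the index of its nearest central point.
        new_assignment = [
            min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the centers are stable
        assignment = new_assignment
        # Update each central point to the mean of its assigned points.
        for i in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment


# Hypothetical 2-D points forming two visibly separate groups.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(sample_points, [(0, 0), (10, 10)])
```

A random start as described on the slide could be obtained by passing `random.sample(sample_points, 2)` as `initial_centers`; different starts may converge to different clusterings.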
OUTLIERS
Global outliers: observations inconsistent with the rest of the dataset
Spatial outliers: observations inconsistent with their neighborhoods
A local instability or discontinuity
VARIOGRAM CLOUD
Create a variogram cloud by plotting attribute difference against distance for each pair of points
Select points common to many outlying pairs
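A rough sketch of the two steps, without the plotting. It assumes the attribute difference is taken as an absolute difference and that an "outlying pair" is one whose points are close together yet differ strongly; the thresholds, sample data and function names are hypothetical choices for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical samples: (x, y, attribute value).
samples = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 11.0),
    (0.0, 1.0, 10.5),
    (1.0, 1.0, 30.0),  # differs sharply from its close neighbors
    (5.0, 5.0, 12.0),
]


def variogram_cloud(samples):
    """One (i, j, distance, |attribute difference|) tuple per pair of points."""
    return [
        (i, j, math.hypot(a[0] - b[0], a[1] - b[1]), abs(a[2] - b[2]))
        for (i, a), (j, b) in combinations(enumerate(samples), 2)
    ]


def points_in_outlying_pairs(cloud, max_distance, min_difference):
    """Count, per point index, the close-but-different pairs it belongs to."""
    counts = Counter()
    for i, j, dist, diff in cloud:
        if dist <= max_distance and diff >= min_difference:
            counts[i] += 1
            counts[j] += 1
    return counts


cloud = variogram_cloud(samples)
counts = points_in_outlying_pairs(cloud, max_distance=1.5, min_difference=15.0)
```

The point that appears in the most outlying pairs (`counts.most_common(1)`) is the strongest outlier candidate.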
MORAN SCATTER PLOT
Plot the normalized attribute value against the weighted average in the neighborhood for each location
Select points in the upper-left and lower-right quadrants
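The quadrant test can be sketched as follows, assuming equal weights over each location's neighbors, z-score normalization, and values that are not all identical (otherwise the standard deviation is zero). The chain of locations, the values and the function names are hypothetical.

```python
import statistics


def moran_scatter(values, neighbors):
    """Return (normalized value, neighborhood average) for each location."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the values are not all equal
    z = [(v - mean) / std for v in values]
    lagged = [statistics.fmean([z[j] for j in neighbors[i]]) for i in range(len(z))]
    return list(zip(z, lagged))


def opposite_quadrants(pairs):
    """Indices in the upper-left or lower-right quadrant (signs disagree)."""
    return [i for i, (zi, wi) in enumerate(pairs) if zi * wi < 0]


# Hypothetical values on a chain of locations; neighbors are the adjacent indices.
values = [10.0, 10.0, 30.0, 10.0, 10.0, 10.0, 10.0]
n = len(values)
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
flagged = opposite_quadrants(moran_scatter(values, neighbors))
```

Here the extreme value at index 2 lands in the lower-right quadrant, and its two neighbors land in the upper-left quadrant because their neighborhood average is pulled up by it, so flagged points are candidates to inspect rather than confirmed outliers.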
SCATTER PLOT
Plot the normalized attribute value against the weighted average in the neighborhood for each location
Fit a linear regression line
Select points that are unusually far from the regression line
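A sketch of the fit-and-select step using an ordinary least-squares line. "Unusually far" is assumed here to mean a residual larger than a chosen multiple of the residual standard deviation; that threshold, the sample data and the function name are hypothetical.

```python
import statistics


def regression_outliers(x, y, threshold=2.0):
    """Indices whose residual from the least-squares line exceeds threshold * spread."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)  # assumes x is not constant
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
    spread = statistics.pstdev(residuals)
    if spread == 0:
        return []  # every point lies exactly on the line
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Hypothetical data: x is the attribute value, y the neighborhood average.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 2.0, 3.0, 4.0, 15.0, 6.0, 7.0, 8.0]
outliers = regression_outliers(x, y)
```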
CONCLUSION
Location prediction
Feature interaction
Hot spots
Techniques such as association rules, clustering and outlier detection