graham.dudgeon@mathworks.com
Overview
Examples: SQL database connection, URL file reading and reading multiple files
Write and test functions that read in industry-specific text file formats
Organize multiple data sets into single data containers using Data Tables
Visualize and interact with data
Automate the detection and classification of events
Summary
[Figure: layered diagram, top to bottom: Knowledge; Understanding; Analytics (Frequency & Time-domain, Predictive Analytics, Extrapolation); Information; Organization (Filtering, Signal Analysis, Data Reduction, Plotting); Data; Physical Sensors]
Database Access
Financial Data, ODBC, JDBC, HDFS (Hadoop)
File I/O
Text, Spreadsheet, XML, CDF/HDF, Image, Audio, Video, Geospatial, Web content
Hardware Access
Data acquisition, Image capture, GPU, Lab instruments
Communication Protocols
CAN (Controller Area Network), DDS (Data Distribution Service), OPC (OLE for Process Control), XCP (Universal Measurement and Calibration Protocol)
Data Processing
Convert, Sync, Clean, Reduce
Visualization
Exploratory Analysis
Derived metrics, events, conditions
[Figure: plot matrix of MPG, Acceleration, Displacement, Weight and Horsepower]
Classification
Taking an example of dynamic responses, classify the responses automatically into an appropriate category
Create a categorical array to allow logical indexing on a categorical basis
[Figure: example responses labelled stable, neutral and unstable]
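The classification idea can be sketched in a few lines. This is a minimal sketch, not the presenter's method: the test signals, the amplitude-ratio rule and the tolerance `tol` are all assumptions made for illustration, and `categorical` requires MATLAB R2013b or later.

```matlab
% Hypothetical responses: oscillations with assumed growth/decay exponents.
t = linspace(0, 10, 1001);
sigma = [-0.5; 0; 0.5; -0.2];      % assumed exponents, one per response
tol = 0.05;                        % assumed tolerance on the amplitude ratio

labels = cell(numel(sigma), 1);
for k = 1:numel(sigma)
    y = exp(sigma(k) * t) .* cos(2*pi*t);
    early = max(abs(y(t <= 2)));   % peak amplitude near the start
    late  = max(abs(y(t >= 8)));   % peak amplitude near the end
    ratio = late / early;
    if ratio > 1 + tol
        labels{k} = 'unstable';    % amplitude grows
    elseif ratio < 1 - tol
        labels{k} = 'stable';      % amplitude decays
    else
        labels{k} = 'neutral';     % amplitude roughly constant
    end
end

% Categorical array with a fixed set of categories.
category = categorical(labels, {'stable', 'neutral', 'unstable'});

% Logical indexing on a categorical basis.
isUnstable = (category == 'unstable');
unstableSigma = sigma(isUnstable);
```

Storing the labels as a categorical array rather than a cell array of strings is what allows the `==` comparison to return a logical index directly.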
Text File
Header Section n
Data Section n
line =
~A DEPTH ILM ILD
ans =
    '~A'    'DEPTH'    'ILM'    'ILD'
>> col_heads = col_heads{:}(2:end); % strip off the '~A'
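One way to get from the header line to the list of column names is sketched below. The slides do not show how `col_heads` was first created, so the use of `textscan` here is an assumption; the sample line is the one shown above.

```matlab
% Header line as read from the file (sample value from the slide).
line = '~A  DEPTH     ILM     ILD';

% Split on whitespace. textscan returns a 1x1 cell that holds a
% column cell array of the tokens.
col_heads = textscan(line, '%s');

% Keep everything after the first token, i.e. strip off the '~A'.
col_heads = col_heads{1}(2:end);
```

After this, `col_heads` holds only the column names, which can then be used to label the numeric data that follows the header.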
end
data1.(col_heads{1})   % same as data1.DEPTH
data1.(col_heads{2})   % same as data1.ILM
data1.(col_heads{3})   % same as data1.ILD
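Dynamic field names let the column headings read from the file become the field names of a structure. The loop below is a sketch of roughly that shape; the numeric values in `raw` are made up for illustration and do not come from the slides.

```matlab
% Column names as parsed from the header line.
col_heads = {'DEPTH'; 'ILM'; 'ILD'};

% Hypothetical numeric data section, one column per heading.
raw = [1670.0 123.45 110.20;
       1669.5 123.40 110.25;
       1669.0 123.30 110.30];

% Assign each column to a field named after its heading.
data1 = struct();
for k = 1:numel(col_heads)
    data1.(col_heads{k}) = raw(:, k);
end

depth = data1.DEPTH;   % same data as data1.(col_heads{1})
```

Because the field names come from the file itself, the same reader works for files with a different set of columns.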
Condition a Line of Text that Contains Different Delimiters and Different Substring Identifiers (1)
Files may contain combinations of delimiters that serve the same purpose, such as whitespace, tab or comma to separate column entries. There may also be substrings that are enclosed by unique substring identifiers.
9 1, 10.000 10 , 1 80.000 ,
NAME9" 0.000, 0.000 1 ' BUS09 ' NAME10' 0.000,, 1 ' BUS10
9 1 10.000 10 1 80.000
Condition a Line of Text that Contains Different Delimiters and Different Substring Identifiers (2)
Use regular expression replacement to identify and replace delimiters and add characters as appropriate.
>> str1 = regexprep(str1,',\s*,',', 0.000 ,');
9 1, 10.000 10 , 1 80.000 , NAME9" 0.000, 0.000 1 ' BUS09 ' NAME10' 0.000,, 1 ' BUS10
9 1, 10.000 10 , 1 80.000 ,
NAME9" 0.000, 0.000 1 ' BUS09 ' NAME10' 0.000, 0.000 , 1 ' BUS10
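The effect of the `regexprep` call is easier to see on short strings. The inputs below are simplified, hypothetical lines rather than the exact text from the file; the pattern and replacement are the ones shown on the slide.

```matlab
% Pattern: a comma, optional whitespace, then another comma,
% i.e. an empty field between two delimiters.
pattern = ',\s*,';
filler  = ', 0.000 ,';    % replacement that inserts a default value

% Hypothetical line with an empty field written as ',,'.
str1 = 'NAME10 0.000,, 1 BUS10';
str1 = regexprep(str1, pattern, filler);

% Same idea when whitespace separates the two commas.
str1b = regexprep('1.0,   ,2.0', pattern, filler);
```

Matches do not overlap, so a run of three or more adjacent commas is only partly filled by a single pass; applying the replacement again would handle that case.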
Condition a Line of Text that Contains Different Delimiters and Different Substring Identifiers (3)
Use regular expressions to identify substrings and sprintf to replace the substring with a conditioned version.
>> [start_idx,end_idx] = regexp(str2,'"\s*\w*\s*"');
9 1, 10.000 10 , 1 80.000 , NAME9" 0.000, 0.000 1 ' BUS09 ' NAME10' 0.000, 0.000 , 1 ' BUS10
9 1, 10.000 10 , 1 80.000 ,
NAME9 0.000, 0.000 1 ' BUS09 ' NAME10' 0.000, 0.000 , 1 ' BUS10
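The substring step might look like the sketch below. The slide shows only the `regexp` call, so the input string and the way the conditioned name is written back with `sprintf` are assumptions chosen to reproduce the visible result (quotes and padding removed).

```matlab
% Hypothetical line containing one double-quoted name.
str2 = '9 1, 10.000 " NAME9" 0.000';

% Locate the quoted substring, including any padding inside the quotes.
[start_idx, end_idx] = regexp(str2, '"\s*\w*\s*"');

% Extract it, drop the quotes and trim the padding.
quoted = str2(start_idx(1):end_idx(1));   % '" NAME9"'
name   = strtrim(quoted(2:end-1));        % 'NAME9'

% Rebuild the line with the conditioned substring in place.
str2 = [str2(1:start_idx(1)-1), sprintf('%s', name), str2(end_idx(1)+1:end)];
```

Once the quoted name no longer contains whitespace or quote characters, the whole line can be split on a single delimiter set.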
A:
Key    B
1      1.1
4      1.4
7      1.7
9      1.9

X:
Key    Y      Z
1      0.1    0.2
3      0.3    0.4
5      0.5    0.6
7      0.7    0.8

Joined on Key:
Key    B      Y      Z
1      1.1    0.1    0.2
3      NaN    0.3    0.4
4      1.4    NaN    NaN
5      NaN    0.5    0.6
7      1.7    0.7    0.8
9      1.9    NaN    NaN
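A merge of this kind can be sketched with MATLAB tables and `outerjoin`. This is an illustration under assumptions: the slides do not say which container or function was used (older releases used dataset arrays), and the names `A`, `X` and `J` are chosen here for the example. Tables require R2013b or later.

```matlab
% Two data sets that share a key but cover different key values.
A = table([1; 4; 7; 9], [1.1; 1.4; 1.7; 1.9], ...
          'VariableNames', {'Key', 'B'});
X = table([1; 3; 5; 7], [0.1; 0.3; 0.5; 0.7], [0.2; 0.4; 0.6; 0.8], ...
          'VariableNames', {'Key', 'Y', 'Z'});

% Outer join keeps every key from both tables and fills the
% entries that have no match with NaN.
J = outerjoin(A, X, 'Keys', 'Key', 'MergeKeys', true);
```

With `'MergeKeys', true` the result carries a single `Key` column instead of one key column per input table.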
List-wise deletion
Unbiased estimates (when values are missing completely at random)
Reduces sample size
Implementation options
Built in to many MATLAB functions
Manual filtering
Easy to model
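List-wise deletion by manual filtering can be sketched as follows. The matrix `X` is made-up example data, and the `'omitnan'` option shown at the end is one instance of built-in handling that, as far as I know, needs R2015a or later.

```matlab
% Hypothetical observations (rows) with some missing values.
X = [1.1 0.1 0.2;
     NaN 0.3 0.4;
     1.4 NaN NaN;
     1.7 0.7 0.8];

% Manual filtering: flag any row that contains at least one NaN ...
hasMissing = any(isnan(X), 2);

% ... and keep only the complete rows (list-wise deletion).
Xcomplete = X(~hasMissing, :);
colMeans  = mean(Xcomplete, 1);

% Built-in alternative: skip NaNs column by column. This is not
% list-wise deletion, since each column keeps its own valid rows.
colMeansOmit = mean(X, 1, 'omitnan');
```

Here two of the four rows are dropped, which illustrates the reduced sample size noted above.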
Summary
Examples: SQL database connection, URL file reading and reading multiple files
Write and test functions that read in industry-specific text file formats
Organize multiple data sets into single data containers using Data Tables
Visualize and interact with data
Automate the detection and classification of events