SLIDE 0
Overview
SLIDE 1
Healthcare (Big?) Data
SLIDE 2
Healthcare (Big?) Data
SLIDE 3
The Dream for Big Data in Healthcare
SLIDE 4
The Data Science Platform
Kerberized
SLIDE 5
Data Science Journey
Step 1: Get Data
Step 2: ?
Step 3: Profit
SLIDE 6
Creating Data Science Questions
SLIDE 7
Clinical Laboratory Use Case
SLIDE 8
General Problem Definition
Can we use big data and machine learning to improve our diagnostic reproducibility and
throughput for laboratory testing?
Hypothesis: Using an ML algorithm and a real-time data processing pipeline will improve
our turnaround time and interpretive reproducibility. Quantitative results will improve
diagnostic accuracy.
SLIDE 9
Current Data Acquisition: Complete Blood Count (CBC) Example
SLIDE 10
Goal Definition - Peripheral Blood Smear
Current
Frequent dacrocytes and occasional target cells, which can be
seen in iron deficiency anemia. Platelets and leukocytes
unremarkable.
Future
Normal RBCs: 95.5% (Low)
Dacrocytes: 2.7% (High)
Target cells: 1.2% (High)
Elliptocytes: 0.5% (Normal)
Schistocytes: 0.1% (Normal)
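A minimal sketch of how the "Future" report could be produced from per-cell classifier labels. The reference ranges and the helper names (`REFERENCE_RANGES`, `flag`, `smear_report`) are hypothetical placeholders chosen for illustration; they are not clinical values and do not come from the slides. The example counts are picked only so that the percentages match the slide.

```python
from collections import Counter

# HYPOTHETICAL reference ranges in percent (low, high).
# Placeholders for illustration only, not clinical reference values.
REFERENCE_RANGES = {
    "Normal RBCs": (97.0, 100.0),
    "Dacrocytes": (0.0, 1.0),
    "Target cells": (0.0, 1.0),
    "Elliptocytes": (0.0, 1.0),
    "Schistocytes": (0.0, 0.5),
}


def flag(value, low, high):
    """Return Low / High / Normal relative to a reference range."""
    if value < low:
        return "Low"
    if value > high:
        return "High"
    return "Normal"


def smear_report(labels):
    """Turn a list of per-cell class labels into percentage lines with flags."""
    counts = Counter(labels)
    total = sum(counts.values())
    lines = []
    for name, (low, high) in REFERENCE_RANGES.items():
        pct = 100.0 * counts[name] / total
        lines.append(f"{name}: {pct:.1f}% ({flag(pct, low, high)})")
    return lines


# Assumed example: 1,000 classified cells from one smear.
example_labels = (
    ["Normal RBCs"] * 955
    + ["Dacrocytes"] * 27
    + ["Target cells"] * 12
    + ["Elliptocytes"] * 5
    + ["Schistocytes"] * 1
)

report = smear_report(example_labels)
for line in report:
    print(line)
```

The design point is that the report is derived from counts rather than from a free-text impression, so two runs on the same classified cells give the same output.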
SLIDE 11
Architecting a Pipeline
Ingestion
- HDF, Flume, Storm
Analysis
- Spark, Python
SLIDE 12
Data Acquisition Technology Assessment
SLIDE 13
Data Analysis Technology Assessment
Spark
- Scalability
- Easy to develop / test
- Great integration with existing data sets
- Less mature ML frameworks
- Minimal GPU support
Python
- Less scalable
- More difficult to integrate existing data
- Great ML support with ability to use GPU
- Easily tested locally then deployed within our cluster-enabled Docker Swarm
SLIDE 14
The Tech Stack
SLIDE 15
Advanced Analytics (Machine Learning!)
Normal Erythrocyte
SLIDE 16
Data Science!
[Figure: chart with percentage axes (20% to 100%), x-axis 1 to 3; labels: Training Cells, Precision]

Confusion matrix (rows: actual class; columns: predicted class)

Actual       normal  echinocyte  dacrocyte  schistocyte  elliptocyte  acanthocyte  target cell  stomatocyte  spherocyte  total
normal          141           0          0            0            0            0            0            1           0    142
echinocyte        0          49          0            1            0            0            0            0           0     50
dacrocyte         2           0         10            1            0            0            0            0           1     14
schistocyte       0           0          1          121            0            0            0            0           0    122
elliptocyte       0           0          0            2            9            0            0            0           0     11
acanthocyte       0           0          0            3            0           19            0            0           0     22
target cell       0           0          0            0            0            0          140            0           0    140
stomatocyte       0           0          0            0            0            0            0           14           0     14
spherocyte        0           0          0            0            0            0            0            1          22     23
total           143          49         11          128            9           19          140           16          23    538

Data from: Durant, Olsen, and Torres 2016
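A short sketch for reading the confusion matrix: it computes per-class precision and recall plus overall accuracy from the counts on this slide. The counts are copied from the slide; the names `CLASSES`, `MATRIX`, and `per_class_metrics` are introduced here for illustration, and this is not necessarily how the original analysis was scored.

```python
CLASSES = [
    "normal", "echinocyte", "dacrocyte", "schistocyte", "elliptocyte",
    "acanthocyte", "target cell", "stomatocyte", "spherocyte",
]

# Rows are actual classes, columns are predicted classes, both in CLASSES order.
MATRIX = [
    [141, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 49, 0, 1, 0, 0, 0, 0, 0],
    [2, 0, 10, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 121, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 9, 0, 0, 0, 0],
    [0, 0, 0, 3, 0, 19, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 140, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 14, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 22],
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall) pairs, one per class."""
    metrics = []
    for i in range(len(matrix)):
        true_positives = matrix[i][i]
        actual_total = sum(matrix[i])                    # row sum
        predicted_total = sum(row[i] for row in matrix)  # column sum
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / actual_total if actual_total else 0.0
        metrics.append((precision, recall))
    return metrics


total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(MATRIX)))
accuracy = correct / total
metrics = per_class_metrics(MATRIX)

for name, (precision, recall) in zip(CLASSES, metrics):
    print(f"{name:12s} precision={precision:.3f} recall={recall:.3f}")
print(f"overall accuracy: {accuracy:.3f} ({correct}/{total})")
```

With these counts, 525 of 538 cells fall on the diagonal, an overall accuracy of about 97.6%. The rarer classes (dacrocyte, elliptocyte) have the lowest recall, which the overall figure hides.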
SLIDE 17
Conclusions: Advanced Healthcare Analytics
SLIDE 18
Take Home: Clinical Image Analysis
SLIDE 19
Questions?
SLIDE 20