
Big Data

Medical Imaging
Brett Cowan
Centre for Advanced MRI
University of Auckland

Overview

1. Medical imaging and Big Data
Medical imaging is producing huge quantities of complex, high-quality data: what do we do with it all?

2. Big Data in action: the Cardiac Atlas Project
International collaboration, data ownership, data sharing and infrastructure

3. Data Analysis
New statistical analyses of shape, disease classifiers, and improved diagnostic accuracy

4. A new approach to clinical trials?
Can we get more for less out of international clinical trials?

Magnetic Resonance Imaging (MRI)


in just 20 years, MRI has revolutionised medical imaging
without radiation, without any known harmful effects and without even touching the patient, MRI produces diagnostic images of virtually photographic quality
the Auckland District Health Board PACS system has 15 TB on-line (one year of imaging), 27 TB near-line (three years of imaging) and 40 TB off-line
ten years ago, one year of imaging was <1 TB; now it is 15 TB
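
The growth implied by the last point can be worked out directly. The sketch below assumes a constant yearly growth rate and treats 1 TB as an upper bound on the starting volume (the slide only says "<1 TB"), so the result is a lower bound on the growth rate: roughly 31% per year, or a doubling about every 2.6 years.

```python
import math


def compound_annual_growth(start_tb, end_tb, years):
    """Constant yearly growth rate that turns start_tb into end_tb over `years`."""
    return (end_tb / start_tb) ** (1.0 / years) - 1.0


def doubling_time_years(rate):
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)


start_tb = 1.0  # upper bound only: the slide says "<1 TB" ten years ago
end_tb = 15.0   # one year of imaging now
years = 10

rate = compound_annual_growth(start_tb, end_tb, years)
doubling = doubling_time_years(rate)

print(f"growth of at least {rate:.1%} per year")
print(f"volume doubles at most every {doubling:.2f} years")
```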


Faster Image Acquisition

1995: FLASH (25 seconds)
2005: SSFP (14 seconds)
2010: Accelerated (5 seconds)
2013: Realtime (1 second)

Neurological MRI

T1 weighted

T2 weighted

Tractography and fMRI

Tractography


fMRI

Data or Information or Knowledge?


in medical imaging, the data are usually grey scale pixel values
they are not demographics (52 years old), blood pressure (155/95) or a genetic sequence (ACAT); they are just numbers representing grey scale (or colour) in an image
these data are not information in the sense of many other datasets
we must align images into a common reference frame, segment features of interest, measure distances, thickness and volume, and define shapes or regions of interest
this process is time consuming relative to scan acquisition time (the acquisition to analysis ratio)
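
As a toy illustration of turning pixel values into a measurement, the sketch below segments a bright region with a fixed threshold and converts the pixel count into an area. The image values, the threshold and the 1.5 mm pixel spacing are all made up for the example, and real cardiac segmentation is far more involved than a single threshold.

```python
def segment_by_threshold(image, threshold):
    """Binary mask: True where the pixel value is at or above the threshold."""
    return [[value >= threshold for value in row] for row in image]


def region_area_mm2(mask, pixel_spacing_mm):
    """Area of the masked region: pixel count times the area of one pixel."""
    row_mm, col_mm = pixel_spacing_mm
    pixel_count = sum(1 for row in mask for inside in row if inside)
    return pixel_count * row_mm * col_mm


# Hypothetical 4 x 5 grey scale image with a bright 2 x 3 feature
image = [
    [10, 12, 11, 10, 9],
    [11, 200, 210, 205, 10],
    [12, 198, 220, 207, 11],
    [10, 12, 13, 11, 10],
]

mask = segment_by_threshold(image, threshold=100)
area = region_area_mm2(mask, pixel_spacing_mm=(1.5, 1.5))
print(area)  # 6 pixels x 2.25 mm^2 each
```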


Image Processing

Edge detection

Non-rigid registration

Feature tracking

a wide range of image processing techniques are used, such as machine learning and finite element modeling, along with human interaction
these are computationally (or time) intensive
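
Edge detection is the simplest of the techniques named above. The following is a minimal sketch using the standard 3x3 Sobel kernels on a small synthetic image; it is one common edge detector, not necessarily the one used in the work presented here, and the test image is invented for the example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Synthetic image: dark left half, bright right half (a vertical edge)
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_magnitude(image)

# The response is non-zero only in the two columns that straddle the edge
print(edges[2])
```

Even this small kernel visits nine neighbours per pixel, which hints at why the heavier techniques (registration, tracking, model fitting) are computationally intensive on full image series.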

The Cardiac Atlas Project (CAP)

CAP Project

collecting data is expensive, and the data are reusable
the Cardiac Atlas Project (CAP) is an international Big Data project funded by the NIH and led from Auckland
subcontracts were awarded to collaborators at Johns Hopkins, UCLA and a Los Angeles supercomputing centre (Centre for Computational Biology)
the aim is to collate cardiac (image) data from large international clinical trials into a web-accessible Big Data database for reuse by any legitimate researcher
other aims were to create an infrastructure for managing approval for data use, and to create advanced statistical analysis and display tools
the first two contributing clinical trials were the MESA and DETERMINE trials

Fonseca et al. Bioinformatics 27(16): 2288-2295; 2011

CAP Project Case Study

1. The overall rationale and strategy

2. Ownership of data

3. Project infrastructure

4. Data analysis and results

The Big Data Strategy

[Diagram: Data Acquisition (Radiology), Heart Modelling (Bioengineering), Software Development (Computer Science), and Patient Diagnosis and Statistical Analysis (Medicine, Big Data)]

Ownership and Rights to Big Data


Data Ownership: Data Has Value

Who owns the rights to medical imaging trial data?
Who can use it and for what purpose?

Participant has rights: they must provide informed consent, informed in that they fully understand the risks and benefits, and what the information will be used for
ethics committees will not usually give permission for the data to be used by anyone for anything in the future
Researcher has rights, often jealously guarded as a strategic advantage for publication and career progression
Institution has rights, but what if an individual leaves the University: do they have the right to take all of the data with them?
Funder has rights, especially when they are a commercial entity such as Big Pharma, or if there are valuable patents at stake

Ethical Approval (IRB)

IRB requirements, from low to high:
Not human subjects research
Investigator can make the decision
Application to IRB required
Individual consent required

HIPAA and Anonymisation of Metadata


the convenience and power of electronic data is also its Achilles' heel
the DICOM standard allows private information to be included in image headers, which in many cases is not HIPAA (1996) compliant
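
A minimal sketch of header anonymisation follows. It models an image header as a plain dictionary rather than using a DICOM library, and the list of identifying fields is an illustrative subset of standard DICOM attribute keywords, not a complete HIPAA de-identification list. The header values and the `CAP-0001` study code are hypothetical.

```python
# Illustrative subset of DICOM attribute keywords that identify a person.
# A real de-identification profile covers many more attributes.
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
}


def anonymise_header(header, study_code):
    """Return a copy of the header without identifying fields.

    The original PatientID is replaced by a pseudonymous study code so
    that images from the same participant can still be linked.
    """
    cleaned = {
        key: value
        for key, value in header.items()
        if key not in IDENTIFYING_KEYWORDS
    }
    cleaned["PatientID"] = study_code
    return cleaned


# Hypothetical header values, for illustration only
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "ABC1234",
    "PatientBirthDate": "19610304",
    "InstitutionName": "Example Hospital",
    "Modality": "MR",
    "SliceThickness": "6.0",
}

cleaned = anonymise_header(header, study_code="CAP-0001")
print(cleaned)
```

The function returns a copy and leaves the input untouched, so the identified original can stay inside the contributing institution while only the cleaned copy is shared.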

Project Infrastructure

Database

Calculation of Volume and Mass

Complete mathematical representation of the left ventricle

Creation of a Mathematical Model

Using image processing (and operator input), the raw images are converted
into a beating mathematical heart in the database. This allows any parameter
to be determined without further analysis.
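
For comparison, the conventional way to get volume and mass without a mathematical model is slice summation: add up the contour areas on each short-axis slice and multiply by the slice spacing. The sketch below shows that simpler calculation, not the model-based method described above. The contour areas and spacing are hypothetical, and the myocardial density of 1.05 g/mL is a commonly used value assumed here.

```python
MYOCARDIAL_DENSITY_G_PER_ML = 1.05  # commonly used value; an assumption here


def stack_volume_ml(slice_areas_mm2, slice_spacing_mm):
    """Sum of (slice area x slice spacing), converted from mm^3 to mL."""
    return sum(slice_areas_mm2) * slice_spacing_mm / 1000.0


def lv_mass_g(epi_areas_mm2, endo_areas_mm2, slice_spacing_mm):
    """Myocardial volume (epicardial minus endocardial) times density."""
    epi_ml = stack_volume_ml(epi_areas_mm2, slice_spacing_mm)
    endo_ml = stack_volume_ml(endo_areas_mm2, slice_spacing_mm)
    return (epi_ml - endo_ml) * MYOCARDIAL_DENSITY_G_PER_ML


# Hypothetical contour areas for five short-axis slices, base to apex
endo_areas = [1200, 1500, 1400, 1000, 500]   # mm^2, blood pool
epi_areas = [2600, 3000, 2900, 2300, 1500]   # mm^2, outer wall
spacing_mm = 10.0

cavity_volume = stack_volume_ml(endo_areas, spacing_mm)
mass = lv_mass_g(epi_areas, endo_areas, spacing_mm)
print(f"cavity volume {cavity_volume:.1f} mL, LV mass {mass:.2f} g")
```

Slice summation has to be repeated for every new measurement, whereas a fitted model can be queried for any parameter after the fact, which is the advantage the slide points to.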

Reproducibility and Accuracy

Accuracy
12 animals (9 dogs, 3 pigs)
LVM determined by weight at autopsy
Data courtesy David Fieno and Paul Finn

Scan-rescan variability
25 patients with moderate to severe MR
Scanned twice at a six-week interval
Coefficient of variation 3%

Reported LVM differences: -1.1 ± 5.7 g (~0.6%) and 2.1 ± 4.3 g (~3%)

[Plots: Difference (g) against Postmortem (g), and Difference (g) against Average LVM (g)]
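
Figures of this kind (mean difference, its standard deviation, and a coefficient of variation) come from paired measurements. The sketch below shows one common way to compute them, in the style of a Bland-Altman analysis. The LVM values are invented for illustration and are not the study data, and the coefficient of variation uses the within-subject standard deviation for duplicate measurements, which is only one of several definitions in use.

```python
import statistics


def agreement_stats(first, second):
    """Mean difference, SD of differences, and coefficient of variation (%)."""
    diffs = [b - a for a, b in zip(first, second)]
    pair_means = [(a + b) / 2 for a, b in zip(first, second)]

    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)

    # Within-subject SD for duplicate measurements: sqrt(sum(d^2) / (2n))
    within_sd = (sum(d * d for d in diffs) / (2 * len(diffs))) ** 0.5
    cov_percent = 100.0 * within_sd / statistics.mean(pair_means)
    return mean_diff, sd_diff, cov_percent


# Hypothetical LVM values (g) from two scans of the same five subjects
scan_1 = [150, 162, 171, 180, 195]
scan_2 = [153, 160, 175, 178, 199]

mean_diff, sd_diff, cov_percent = agreement_stats(scan_1, scan_2)
print(f"difference {mean_diff:.1f} ± {sd_diff:.1f} g, CoV {cov_percent:.1f}%")
```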

Segmentation Challenge

(a) Basal slice
(b) Mid-ventricular slice
(c) Apical slice

Suinesiaputra et al. Medical Image Analysis, in press, 2013

Modal Analysis


Lewandowski et al. Circulation 2013;127:197-206

Identification of Myocardial Infarction

Anterior

Lateral

Septal

Medrano-Gracia et al. JCMR 15:80; 2013

Modeling of Stiffness, Stress and Strain


Normal

Non-Ischaemic HF

Vicky Wang, Martyn Nash, STACOM

Classification of Disease

Does the patient have the disease?

Medrano-Gracia, PhD thesis, 2013

Data Analysis

How bad is the disease?

Medrano-Gracia, PhD thesis, 2013

A Second Example: the Coronary Arteries

Image data

Clinical problem

Database

Statistics

Catalonia

Prospective trials

Cardiovascular Risk in Catalonia


If we wanted to determine the cardiovascular risk profile in Catalonia, how would we do this?

Specific, high cost
1. Recruit 5,000 normal Catalonians (preferably in 1948) and follow them for 50 years (similar to the definitive Framingham study)
2. Recruit 5,000 normal Catalonians and follow them for five years (an abbreviated Framingham study)
3. Use the Framingham results and add local correction factors from small studies where there are obvious discrepancies
4. Read the Framingham publications and speculate on how the data applies locally
Generic, low cost

The VERIFICA Study


The Framingham function adapted to local population characteristics accurately and reliably predicted the 5-year CHD risk for patients aged 35-74 years, in contrast with the original function, which consistently overestimated the actual risk.
an overestimation of about 60% was observed in the United Kingdom; however, this is far from the >260% overestimation observed in Spain
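
The slide does not say how the function was adapted. One widely described approach to recalibrating a Cox-type risk function is to keep the original coefficients and substitute the local baseline survival and local mean risk factor levels. The sketch below illustrates that idea only. Every number in it (coefficients, means, baseline survival, patient values) is hypothetical and chosen purely to show the direction of the effect.

```python
import math


def chd_risk(betas, values, cohort_means, baseline_survival):
    """Cox-type risk: 1 - S0 ** exp(sum(beta * (x - cohort mean)))."""
    linear = sum(
        betas[name] * (values[name] - cohort_means[name]) for name in betas
    )
    return 1.0 - baseline_survival ** math.exp(linear)


# All values below are hypothetical, for illustration only
betas = {"age": 0.05, "sbp": 0.02}    # coefficients kept from the source cohort
patient = {"age": 60, "sbp": 150}

original_means = {"age": 50, "sbp": 130}
original_s0 = 0.95                    # baseline survival in the source cohort

local_means = {"age": 55, "sbp": 135}
local_s0 = 0.98                       # higher survival: lower local event rate

original_risk = chd_risk(betas, patient, original_means, original_s0)
local_risk = chd_risk(betas, patient, local_means, local_s0)

print(f"original function: {original_risk:.1%}")
print(f"recalibrated function: {local_risk:.1%}")
```

With a lower local event rate, the unadjusted function assigns the same patient a much higher risk than the recalibrated one, which is the pattern of overestimation the slide reports.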

The Catalonian Risk Table

www.regicor.org

Cost-Benefit Ratio of Clinical Trials

[Chart: Cost (fundable to not fundable) against Benefit (generic data to NZ-specific data). Points: trial performed overseas; trial performed in New Zealand. Annotation: "Can we do better in this range?"]

The MESA Trial

The Multi-Ethnic Study of Atherosclerosis (MESA) is a study of the characteristics of subclinical cardiovascular disease and the risk factors that predict progression to clinical disease. 6,814 asymptomatic men and women aged 45-84 have been recruited (38% are white, 28% African-American, 22% Hispanic and 12% Asian).

MESA Investigations

extensive physical exam to determine coronary calcification
ventricular mass and function by MRI
flow-mediated endothelial vasodilation
carotid intimal-medial wall thickness and presence of echogenic lucencies in the carotid artery
lower extremity vascular insufficiency
arterial wave forms
electrocardiographic (ECG) measures
standard coronary risk factors
socio-demographic factors
lifestyle factors, and psychosocial factors
blood samples are being assayed for putative biochemical risk factors and stored for case-control studies
DNA is being extracted and lymphocytes immortalized for study of candidate genes and, possibly, genome-wide scanning
participants are being followed for identification and characterization of cardiovascular disease events, including acute myocardial infarction and other forms of coronary heart disease (CHD), stroke, and congestive heart failure; for cardiovascular disease interventions; and for mortality

The Jackson Heart Study

The objective of the Jackson Heart Study is to investigate the causes of cardiovascular disease (CVD) in African Americans (n=5301) with an emphasis on manifestations related to hypertension (such as remodeling of the left ventricle of the heart, coronary artery disease, heart failure, stroke and renal vascular disease).

The MESA and Jackson Heart Studies are using software developed in Auckland to analyse all of their cardiac MRI image data. The data and results are fully compatible with all of the work already done here.

New Zealand Fingerprinting


could we perform the baseline MESA investigations on a group of 100 participants in New Zealand?
this group could be defined geographically, by age, gender, a specific risk factor, ...
this would provide a mean and standard deviation for each investigation
there would also be a profile for each individual participant
together these data would represent a fingerprint for this group in New Zealand
it is highly likely that some participants (and groups of participants) in MESA will have a similar group (and individual) fingerprint
could we then follow them in the MESA trial?
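
The fingerprinting and matching idea can be sketched in a few lines. The example below builds a group fingerprint (mean and standard deviation per investigation) from a small local sample, then selects trial participants whose standardised distance from that fingerprint is below a cut-off. The measurement names, values, participant identifiers and cut-off are all hypothetical, and a real matching method would need to handle correlated and categorical measures.

```python
import statistics


def fingerprint(group):
    """Mean and SD of each investigation across a list of participant dicts."""
    names = list(group[0].keys())
    return {
        name: (
            statistics.mean(p[name] for p in group),
            statistics.stdev(p[name] for p in group),
        )
        for name in names
    }


def distance_to_fingerprint(participant, group_fingerprint):
    """Square root of the summed squared z-scores against the fingerprint."""
    total = 0.0
    for name, (mean, sd) in group_fingerprint.items():
        total += ((participant[name] - mean) / sd) ** 2
    return total ** 0.5


def match(candidates, group_fingerprint, max_distance):
    """Candidates whose distance to the fingerprint is within the cut-off."""
    return [
        c for c in candidates
        if distance_to_fingerprint(c, group_fingerprint) <= max_distance
    ]


# Hypothetical local sample: two investigations per participant
nz_group = [
    {"lv_mass_g": 140, "sbp": 120},
    {"lv_mass_g": 150, "sbp": 130},
    {"lv_mass_g": 160, "sbp": 140},
]

# Hypothetical trial participants with the same baseline investigations
trial_participants = [
    {"id": "T001", "lv_mass_g": 152, "sbp": 128},
    {"id": "T002", "lv_mass_g": 210, "sbp": 170},
]

nz_fingerprint = fingerprint(nz_group)
matched = match(trial_participants, nz_fingerprint, max_distance=1.0)
print([p["id"] for p in matched])
```

The matched subset is the group that would then be followed through the trial's outcome data on behalf of the local sample.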

Matching

[Diagram: fingerprinting (n = 100) followed by matching (n = 1000)]

Future of Clinical Trials

Outcomes for New Zealand
data which reflects NZ subgroups
direct application of results
amplification of sample size by 10X
trial cost met internationally
prospective study design
international collaboration and engagement
development of fingerprinting and matching technologies

[Diagram, from local (New Zealand) to global: New Zealand sample (n = 100); fingerprint; statistical matching; NZ cohort identified in baseline data (n = 1000); NZ cohort followed in trial; 5, 10, 15 and 20 year outcomes; overall]

Summary


What is big data for us in Medical Imaging?

it is large and expanding medical imaging databases, preferably shared internationally
there are issues of data ownership, appropriate ethical consents, data anonymisation and access control
computationally intensive image processing is required to convert data into useful information
computational and statistical anatomy and pathology (rather than feature labeling, or calculation of simple distances or volumes)
image data may be represented using mathematical models to recreate physiological and pathophysiological shape, features and motion
statistical analyses of shape, disease classifiers and calculation of new parameters (like stress) become possible
clinical trial focus: devices and pharmaceuticals

Thank you to

Alistair Young

Avan

Michael

Jae Do

Carissa

Lana

Agustín

Ben

Yingmin

Pau

Wenchao

Paul Finn

Randall

John