
Scientific Best-Practices for Recurring Problems in Computer Security R & D
Daniel Bilar
Director of Research, Siege Technologies
Manchester, New Hampshire, USA
dbilar@acm.org
@daniel_bilar

SyScan 2014
Singapore
April 3rd, 2014

Outline of Talk
Dynamic SyScan update: Car CAN R & D
Signals/ Side Channels
Side Channels & Methods
Detection
Representation
Analysis

Case studies
ROPe, Bochspwn, Cyber Mission Planning

Epilogue/Blueprint

Thanks for help/inspiration and appreciation to

Thomas Dullien
Ero Carrera
Alex Sotirov
Travis Goodspeed
Anna Shubina
Sergey Bratus
Rebecca Shapiro
Jason Geffner
Jon Stuart
Georg Wicherski
Mateusz Jurczyk
Gynvael Coldwind

+ many more


Addendum: Car CAN bus hacking


Did a subset of what Chris and Charlie did in March 2013, presented at RECon June 2013
Hot-Wiring of the Future
Lots of tips, (free, $) tools, workflow/methodology. Costs: $6k (2 cars)
Sponsored 3 undergraduate students (knew nothing at all about RE) who learned how to reverse, hook up boards, use GoodThopter and denial of view/manipulation of CAN dashboard in 3 months

http://tinyurl.com/CarCAN2013

[Diagram: User, Software Package, GoodThopter10, CAN Bus]

Work Accomplished - Methodology
[Diagram: Confirm Inferences, Test Responsiveness; Boundary Analysis; Generative Fuzzing; Confirm New Inferences]

A Case Study ID 513

ArbID: 513
Fuzzing Response: Dashboard components change
Refined Inferences: Bytes 0, 1 = RPM; Byte 4 = Speedometer

engineering.dartmouth.edu, 7 March 2013

Higher Level Protocol: ID 513

[Plot: Speedometer Reading (MPH), axis 0 to 120, against Data Byte 4 Value, axis 0 to 120]
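The ID 513 inferences can be sketched as a small payload decoder. This is a minimal sketch, not the project's tooling: the function and type names are hypothetical, the big-endian ordering of the two RPM bytes is an assumption, and no scaling is applied because the slides only state which bytes drive which gauge.

```python
from typing import NamedTuple


class Dashboard513(NamedTuple):
    rpm_raw: int          # bytes 0 and 1, unscaled
    speedometer_raw: int  # byte 4, unscaled


def decode_id_513(arb_id: int, data: bytes) -> Dashboard513:
    """Split an ID 513 payload into the fields inferred on the slide."""
    if arb_id != 513:
        raise ValueError("not an ID 513 frame")
    if len(data) < 5:
        raise ValueError("payload too short to hold byte 4")
    # Assumption: bytes 0 and 1 form one big-endian 16-bit value.
    rpm_raw = int.from_bytes(data[0:2], "big")
    return Dashboard513(rpm_raw=rpm_raw, speedometer_raw=data[4])


# Hypothetical 8-byte payload (DLC = 8); the values are made up.
frame = bytes([0x0B, 0xB8, 0x00, 0x00, 0x3C, 0x00, 0x00, 0x00])
decoded = decode_id_513(513, frame)
```

Mapping the raw byte 4 value to MPH would need the measured curve from the plot, which is why the sketch stops at raw values.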

Higher Level Protocol: ID 1056

[Frame diagram: SOF | Arbitration ID 1056 | Control Field, DLC = 8 | Data Field | EOF]
[Data Field labels: Engine Temp, Odometer, Battery Charge, Engine Clock, Dashboard Warnings, Check Fuel Cap, Unused, Counter]

Signals: Side Channel

Side channels = observables that are emitted by active systems
On a computer system, these can be observed: time, power, OS events, EM radiation, characteristic acoustic spectral signature and more
2014: Mark Stoettinger's NTU group is doing cutting edge punctuated hardware magnetic field side channel analysis (SCA)

Graph from [Hund2013]. Generic timing side channel attack against MMU system to infer information about the privileged address space layout

Some innovative attacks: data structures (2007 timing attacks against databases), protocols and underlying algorithms (2007 QoS attacks against balancing algorithms), MMU (cache) and more

Side Channels as Consilient Evidence

Generalize: Use side channel evidence to reason about system/sub-system (internal states)
Whewell's Consilience of Induction [Snyder2008]
Concept of aggregate evidence
Convergence of several, ideally independent hypotheses serves to strengthen conclusion

Question: What side channels are available, easy to access, analyze, expressive, type I/II error etc
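One way to make "aggregate evidence" concrete is the odds form of Bayes' rule, in which independent observations multiply. A minimal sketch, assuming the side channels are conditionally independent given the hypothesis; the function name and the example likelihood ratios are hypothetical and not from the talk.

```python
import math
from typing import Sequence


def posterior_probability(prior: float, likelihood_ratios: Sequence[float]) -> float:
    """Combine independent pieces of evidence for one hypothesis H.

    Each ratio is P(observation | H) / P(observation | not H).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical: three weak channels, each only 2:1 in favour of H.
single = posterior_probability(0.1, [2.0])
combined = posterior_probability(0.1, [2.0, 2.0, 2.0])
```

With a prior of 0.1, one 2:1 channel gives 2/11 and three such channels give 8/17, which is the convergence effect in numbers. If the channels are correlated, the product overstates the evidence.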

Three Questions on Signals

Want actionable handling of higher dimensional signal dynamics as they occur in live computer systems as side channels
Actionable is to be understood as useful in practice
Higher dimensional refers to six or more data dimensions
Dynamics emphasizes the signal's unfolding in temporal and feature space

We'll discuss
How to represent such signals
How to detect signals with various characteristics
How to prevent or mitigate the leaking of such signals

Signal Representation: Visuals

Signal Representation: Visuals & more

Fundamental cognitive limits for visuals very hard to overcome for high dimensional representation
See Starlight project, NIDS graphs, Edward Tufte book series, Marty Applied Security Visualization, Paley Visual Analytics
Rich gamut of human senses remain neglected
Aural (hearing), haptic (touch), vestibular (balance and acceleration), kinesthetic, thermoception (temperature), etc

Signal Detection: MINE Statistics

Maximal Information-based Nonparametric Exploration (MINE) statistics
General: Captures wide range of associations between pairs of variables (linear, exponential, periodic, non-functions)
Equitable: Assigns similar scores to equally noisy relationships of different types [Reshef2011sup]

Table from [Reshef2011]
MIC captures general relationship strength
MIC-r^2 captures non-linearity (Not shown)
MCN captures complexity
MAS captures departure from monotonicity
MEV captures closeness to being a function
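MIC is built on mutual information estimated over grids laid on the scatterplot of two variables. The sketch below is not MIC: it computes normalized mutual information on one fixed equal-width grid, which is only the inner quantity that MIC maximizes over many grids [Reshef2011]. Function names, the bin count, and the synthetic data are assumptions for illustration.

```python
import math
import random
from collections import Counter


def bin_indices(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant input: everything lands in bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def grid_mutual_information(xs, ys, bins=4):
    """Mutual information of (x, y) on a bins x bins grid, scaled to [0, 1]."""
    n = len(xs)
    bx, by = bin_indices(xs, bins), bin_indices(ys, bins)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    mi = sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )
    return mi / math.log2(bins)  # MI cannot exceed log2(bins)


random.seed(0)
xs = [random.random() for _ in range(2000)]
noise = [random.random() for _ in range(2000)]
line_score = grid_mutual_information(xs, xs)                       # y = x
parabola_score = grid_mutual_information(xs, [(x - 0.5) ** 2 for x in xs])
noise_score = grid_mutual_information(xs, noise)                   # unrelated
```

On this data the linear and parabolic relationships both score well above independent noise, even though the parabola has near-zero linear correlation. A single fixed grid is not equitable in the sense of the slide; that property comes from the grid search in the real statistic.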

Signal Detection: Association Types

Graphs from [Reshef2011sup]

Signal Prevention: Side Channel Leaks

Possible to control rate but not eliminate side channels
Recently, Goldwasser (MIT/Technion) and Rothblum offer a practical way forward [Goldw2012]
Resisting leakage at design time; offers progress towards formulation of automatic approaches that generate leakage-resilient programs for a wide range of side channel attacks
Proved that any computationally unbounded A, observing the results of computationally unbounded leakage functions, will learn no more from its observations than it could given black-box access only to the input-output behavior of P
Result is unconditional and does not rely on any secure hardware components

Cyber Mission Planning

Cyber-operations have potential to be more pinpointed than kinetic counterpart
Minimize collateral damage by crisp targeted operations
However, unlike kinetic planning (centuries of well-understood natural laws), cyber-planning lacks foundational corpus of predictive laws

Natural Laws for/of/in Virtual Reality

Thread / Process / Cache Execution

Behavior arises as a complex interaction of timing of memory requests (program behavior), cache coherence protocol (dependent on MA), or thread pre-emption (depending on OS) [Alistarh2014]
Modern operating systems (OS) and microarchitectures (MA) = dynamic complex feedback system that tries to continuously minimize CPI (cycles per instruction)
Memory latency is bottleneck, hence memory hierarchies from ns to s

OS and MA continuously solve a time-space optimization problem to flatten (parallelize) sequential processing

[Diagram: feedback loop. Program(s) (requires space-time from OS/MA); OS schedules thread(s), positions data in memory (space/time optimization); Threads; Microarchitecture atomizes and interleaves thread ops to minimize CPI; Data flow; Memory hierarchy (access latency ns - seconds); Feedback signals; Minimize expected data latency based on feedback; Interrupts (stochastic): hardware, user, I/O]

Signal Detection: MINE/MIC

Maximal Information-based Nonparametric Exploration (MINE) statistics
Intuition: (Simple) asset signals are reflected in convex/concave parabola-type curves in time
Identify signals that are less periodic (lower MAS), less linear (MIC-r^2), but still a function (higher MEV)
not a heartbeat, not a shooting star, but still a function

MIC captures general relationship strength
MIC-r^2 captures non-linearity (Not shown)
MCN captures complexity
MAS captures departure from monotonicity
MEV captures closeness to being a function
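The selection rule above can be written as a simple filter over precomputed MINE statistics. A minimal sketch under loud assumptions: the thresholds are hypothetical, the three example rows are invented numbers rather than measurements, and "less linear" is read as a higher MIC-r^2 because the legend says MIC-r^2 captures non-linearity.

```python
from typing import NamedTuple


class MineStats(NamedTuple):
    name: str
    mas: float           # departure from monotonicity
    mic_minus_r2: float  # non-linearity
    mev: float           # closeness to being a function


def is_candidate(s, max_mas=0.2, min_nonlinearity=0.3, min_mev=0.7):
    """Lower MAS, higher MIC-r^2, higher MEV; all cut-offs are assumptions."""
    return (s.mas <= max_mas
            and s.mic_minus_r2 >= min_nonlinearity
            and s.mev >= min_mev)


# Invented scores for illustration only.
observed = [
    MineStats("periodic", mas=0.8, mic_minus_r2=0.9, mev=0.9),
    MineStats("linear", mas=0.05, mic_minus_r2=0.02, mev=0.95),
    MineStats("asset-like", mas=0.1, mic_minus_r2=0.6, mev=0.9),
]
candidates = [s.name for s in observed if is_candidate(s)]
```

In practice the cut-offs would have to come from scoping experiments on known signals, not from fixed constants.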

Issue: Experimental Factors

Free NIST tools: Automated Combinatorial Testing for Software (ACTS) and Coverage Measurement (CCM)
Cut down combinatorial explosion: 2-way, 3-way, n-way testing
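The saving from t-way testing can be seen by counting. The sketch below counts the value combinations a t-way test set has to cover and measures how much of that a given test set hits. It does not use or reproduce ACTS or CCM; the function names are hypothetical and the example of 18 factors with 12 values each is an assumed configuration.

```python
from itertools import combinations
from math import prod


def exhaustive_tests(levels):
    """Number of tests needed to try every full configuration."""
    return prod(levels)


def t_way_combinations(levels, t):
    """Number of distinct value combinations over all t-subsets of factors."""
    return sum(prod(subset) for subset in combinations(levels, t))


def covered_fraction(tests, levels, t):
    """Share of t-way value combinations hit by at least one test."""
    seen = set()
    for test in tests:
        for idx in combinations(range(len(levels)), t):
            seen.add((idx, tuple(test[i] for i in idx)))
    return len(seen) / t_way_combinations(levels, t)


levels = [12] * 18
pairs_to_cover = t_way_combinations(levels, 2)
full_space = exhaustive_tests(levels)

# Three binary factors: 4 tests cover every pair, versus 8 exhaustive tests.
pairwise_suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
coverage = covered_fraction(pairwise_suite, [2, 2, 2], 2)
```

For the assumed configuration there are 22032 value pairs to cover against 12^18 full configurations, and each single test covers 153 pairs at once, which is where the reduction comes from.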

Asset-Target Matching
Say asset tested on configuration A and it has 18 categories (e.g. language, OS/patch, service running, workload, etc) @ a dozen values
How similar is the unknown configuration B to A?
Question of distance
Easy enough for ratio data (like Kelvin), much harder for categorical data (like OS type)

Table from [Boriah2008]
Table shows 14 categorical distance (similarity) measures
Differ primarily in how they weigh matches and mismatches between categories

Success functions are not smooth but stepped
Can shove 10cm^3 (~elastic) object through 9.8cm^3 hole -> smooth success degradation
Difference between patch1 and patch2 is difference between works/doesn't work -> step/catastrophic success degradation
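The simplest of the categorical measures surveyed in [Boriah2008] is overlap: the fraction of categories on which two configurations agree. A minimal sketch; the category names and values are hypothetical.

```python
def overlap_similarity(a: dict, b: dict) -> float:
    """Fraction of categories with identical values (1.0 = same configuration)."""
    if a.keys() != b.keys():
        raise ValueError("configurations must describe the same categories")
    if not a:
        raise ValueError("at least one category is required")
    matches = sum(1 for key in a if a[key] == b[key])
    return matches / len(a)


# Hypothetical configurations that differ only in patch level.
config_a = {"os": "Windows 7", "patch": "SP1", "language": "en-US", "service": "SMB"}
config_b = {"os": "Windows 7", "patch": "SP0", "language": "en-US", "service": "SMB"}
similarity = overlap_similarity(config_a, config_b)
```

Overlap weighs every mismatch equally, so A and B look 75% similar here even though a patch-level mismatch may be exactly the step between works and doesn't work. That is the motivation for measures that weigh matches and mismatches differently.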

Case study: ROPe

ROPe: Detection of kernel-level ROP through branch return mispredictions
16 (N) entry shadow stack of call-sites / return addresses
0x89 BR_MISP_EXEC.*: mispredicted executed branches
0x800 .RETURN_NEAR: normal, near ret
0x8000 .TAKEN: unconditional branch
PMC interrupts after certain number of mispredictions (N/2 = 8)
Upon interrupt, handler checks MSR Last Branch Recording (LBR) whether targets of the previously executed instructions are preceded by a call instruction
If not -> likely ROP (chain) induced
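The check can be sketched as a user-space simulation. This is not the ROPe handler: a real one would read the LBR MSRs and inspect the bytes before each target, whereas here the set of addresses that directly follow a call is simply given. All addresses and function names are hypothetical; only the event values and N come from the slide.

```python
BR_MISP_EXEC = 0x89   # event select
RETURN_NEAR = 0x0800  # unit mask bit, as written on the slide
TAKEN = 0x8000        # unit mask bit, as written on the slide
RAW_EVENT = BR_MISP_EXEC | RETURN_NEAR | TAKEN

SHADOW_DEPTH = 16                   # N
INTERRUPT_AFTER = SHADOW_DEPTH // 2  # N/2 mispredictions


def suspicious_returns(lbr_targets, call_preceded_addresses):
    """Return the LBR targets that do not directly follow a call instruction."""
    return [t for t in lbr_targets if t not in call_preceded_addresses]


# Hypothetical data: addresses that directly follow a call in the image.
after_call = {0x401005, 0x40200A, 0x403010}
benign_lbr = [0x401005, 0x403010]
rop_lbr = [0x401005, 0x404ABC, 0x404DEF]  # last two land mid-function

benign_hits = suspicious_returns(benign_lbr, after_call)
rop_hits = suspicious_returns(rop_lbr, after_call)
```

Combining the three values yields 0x8889, the raw code the deck refers to elsewhere. A legitimate return lands right after the call that pushed its address, so any target outside that set is what the handler would flag.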

Graph from G. Wicherski, SyScan 2013 [Wich2013]

Started telling Travis Goodspeed at RECon 2013 about this and after less than 8 seconds he exclaims and I quote: "Holy cow! That's a $^&* brilliant idea once you understand it!" :)

Suggestions for ROPe

Generalize ROP/0x8889 insight: Side channel spectral signature for variety of interesting attacks
JOP, weird machine-inducers, hardware-based attacks
Additional OS/MA events

Workplan (high-level):
Identify signals of interest
Scope with MINE [Reshef2011]
Signal periodicity analysis
DSP, System Identification tools
Scientifically valid experiment setup
Use procedures [Mont2012]

PMC measurements over time for programs in SPEC benchmark suite. Graph from [Demme2013]. Good results with 4-dim spectral signature from x86 MA events
0x0440 -- L1D_CACHE_LD.E_STATE
0x0324 -- L2_RQSTS.LOADS
0x03b1 -- UOPS_EXECUTED.PORT (1 or 2)
0x7f88 -- BR_INST_EXEC.ANY
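The four raw codes can be split into their two parts with a few lines. A minimal sketch, assuming the usual x86 raw-event layout where the low byte is the event select and the next byte is the unit mask; the function name is hypothetical.

```python
def split_raw_event(raw: int):
    """Split a 16-bit raw code into (event select, unit mask)."""
    return raw & 0xFF, (raw >> 8) & 0xFF


# The four events of the spectral signature listed on the slide.
SIGNATURE_EVENTS = {
    0x0440: "L1D_CACHE_LD.E_STATE",
    0x0324: "L2_RQSTS.LOADS",
    0x03B1: "UOPS_EXECUTED.PORT",
    0x7F88: "BR_INST_EXEC.ANY",
}

decoded = {name: split_raw_event(raw) for raw, name in SIGNATURE_EVENTS.items()}
```

One sample of the 4-dim signature would then be the four counter values read over the same time slice, one per event.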

Case study: Bochspwn

Study of double fetch operations
Two virtual address reads from kernel mode thread close in time
Virtual address concurrently writable by ring-3 threads
Assumption of value consistency over time gives rise to race condition
User address space is shared across ring0 / ring3
User-mode memory regions can be modified at any time by concurrent ring3 thread

Bochspwn idea: Extend time window between value Check and value Use to give ring3 attack thread opportunity to modify value
Graphs from [JC2013a]
How? Bleed time by slow cache line and page boundaries, non-cacheability, TLB flushing (2500x slowdown achieved)
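The Check/Use gap can be shown with a deterministic toy model. This is not kernel code and not the Bochspwn tooling: the real race depends on timing, while here the "ring3 thread" is a callback that runs right after each read so the outcome is reproducible. All names and sizes are hypothetical.

```python
class UserMemory:
    """Stand-in for a user-mode buffer; on_read models a racing ring3 thread."""

    def __init__(self, value, on_read=None):
        self.value = value
        self.reads = 0
        self.on_read = on_read

    def fetch(self):
        current = self.value
        self.reads += 1
        if self.on_read:
            self.on_read(self)  # the "attacker" runs after each kernel read
        return current


KERNEL_BUFFER_SIZE = 16


def vulnerable_copy_length(mem):
    if mem.fetch() > KERNEL_BUFFER_SIZE:  # first fetch: Check
        raise ValueError("length rejected")
    return mem.fetch()                    # second fetch: Use


def safe_copy_length(mem):
    length = mem.fetch()                  # single fetch into a local copy
    if length > KERNEL_BUFFER_SIZE:
        raise ValueError("length rejected")
    return length


def flip_after_check(mem):
    if mem.reads == 1:
        mem.value = 4096                  # modify the value between Check and Use


raced = vulnerable_copy_length(UserMemory(8, on_read=flip_after_check))
fixed = safe_copy_length(UserMemory(8, on_read=flip_after_check))
```

The double-fetch version passes the check with 8 and then uses 4096; the single-fetch version keeps using the value it checked.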

69 (!) LaTeX pages at SyScan 2013
Fundamental applied security paper, vital for safer concurrent programming [JC2013a]
Refinements: Flip interval dependence on value (binary, arithmetic) types, logistic S-curve discussion [JC2013b]

Suggestions for Bochspwn

Practical: Find low level assembly pattern translation and investigate susceptibility to double fetch and resulting distortions/error
Theory: Investigate multi-core control systems (system scheduler, paging) and bring out assumptions

[Figure: 12 high level computation language patterns mined from seven general application areas (green & blue rare) [Asan2009]]
[Figure: Partial Bayes model of RAM paging behavior (1995); labels: workload (i.e. program), PMC]

Epilogue: Methods Blueprint

Signal Selection
Construct model of system
Identify side channel observables (OS/MA events & others)
Scope SCO's MINE properties
Use MIC/MINE statistics [Reshef2011]

Signal Representation & Analysis
Octave (free but not powerful enough), MATLAB (best choice)
Boxplots, Probability Plots
Toolboxes: Statistics, DSP, System Identification

Machine Learning
Internalize [Dom2012]
Select ML procedures from [Murph2012] appropriate for and educed from a system model and the signal's macro-properties
[PMTK3] for Bayesian reasoning/modeling

Scientific Experiments [Mont2012]

Specialized tools:
Signal Analysis: Eureqa [Lipson2009]
Signal Representation: Viewpoints [Gazis2010]

Pro-Tip ML

Know what features your favourite ML algo selects and weighs
Many blind spots possible
Study Domingos [Dom2012]!

Current/upcoming R & D topics

Added to the talk is a short WP with a selection of R & D issues
Concurrency Attacks
Compositional Security
Systemic Computer Security

All these in my humble opinion benefit from SCA

Concurrency Attacks
Even though we increasingly rely on concurrent execution, such programs are much more difficult to write, test, debug.
Potential for serious concurrency errors in many widespread concurrent programs, enabling feasible concurrency attacks
Many sequential defense techniques, if unaware of concurrent programming, are ineffective
Careful study of Bochspwn and ROPe will yield insights

Table 1: Summary of Findings.

Finding: A majority (24 out of 46) of the concurrency attacks corrupt pointer data.
Implication: Existing memory safety tools, once made aware of concurrency, may be able to prevent concurrency attacks that corrupt pointer data.

Finding: 9 concurrency attacks directly corrupt scalar data, such as user identifiers, without compromising memory safety.
Implication: Few existing defenses handle attacks that directly corrupt scalar data.

Finding: Many existing defenses become unsafe in the face of concurrency errors.
Implication: These defenses must consider concurrent execution.

Finding: The exploitability of a concurrency error highly depends on the duration of its vulnerable window (i.e., the timing window within which the concurrency error may occur).
Implication: New defense techniques may reduce the exploitability of concurrency errors by reducing the duration of the vulnerable window.

Compositional Security

Systemic Computer Security

Motivation: Flash Crash (May 2010), see my IEEE SP article
Automated black-box algorithmic trading: Johnson (2013) Rise of the Machines
phenomenological signatures of interacting autonomous computer agents in real-world dynamic (trading) system
All-machine time regime characterized by frequent 'black swan' events with ultrafast durations (<650ms for crashes, <950ms for spikes)

Aggregate behavior of simple agents is unpredictable in principle; no useful security guarantees anent dynamics possible

[Figure: HFT, Nanex 2010]

Systemic Computer Security II

Aggregate behavior of simple agents is unpredictable; no useful security guarantees anent dynamics possible [Joh13] [Bil14]
Analysis of (side channel) event signatures in phase space; design of circuit breakers, graceful degradation, rectifiers
Relevance to Singapore Smart Cities (see good/bad example Songdo, Portland)

Thank you
Thank you for your time and the consideration of ideas.
I appreciate being at SyScan and to finally visit Singapore :)

How Scientists Relax
Infrared spectroscopy on a vexing problem of our times: Truly comparing apples and oranges
A spectrographic analysis of ground, desiccated samples of a Granny Smith apple and a Sunkist navel orange. Picture from [San1995]

References I
[Asan2009] K. Asanovic et al, A view of the parallel computing landscape, CACM 52:10, Oct 2009, pp. 56-67 http://dl.acm.org/citation.cfm?id=1562764.1562783

[Boriah2008] S. Boriah et al, Similarity Measures for Categorical Data, SIAM red, 30:2, 2008 http://www-users.cs.umn.edu/~sboriah/PDFs/BoriahBCK2008.pdf

[Goldw2012] S. Goldwasser and G. Rothblum, How to Compute in the Presence of Leakage, FOCS, Oct 2012, pp. 31-40 http://eccc.hpi-web.de/report/2012/010/download/

[Hund2013] R. Hund et al, Practical Timing Side Channel Attacks Against Kernel Space ASLR, IEEE S & P, 2013, pp. 191-205 http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf

[Yang2012] J. Yang et al, Concurrency attacks, USENIX HotPar, 2012 https://www.usenix.org/system/files/conference/hotpar12/hotpar12-final44.pdf

[JC2013a] M. Jurczyk and G. Coldwind, Identifying and Exploiting Windows Kernel Race Conditions via Memory Access Patterns, SyScan, April 2013 http://j00ru.vexillium.org/?p=1695

References II

[Dom2012] P. Domingos, A Few Useful Things to Know about Machine Learning, CACM 55:10, Oct 2012, pp. 78-87 https://t.co/NsAnRUrPtq

[Demme2013] J. Demme et al, On the Feasibility of Online Malware Detection with Performance Counters, ISCA, 2013, pp. 559-570 http://www.cs.columbia.edu/~jdd/papers/isca13_malware.pdf

[Mont2012] D. Montgomery, Design and Analysis of Experiments, Wiley Press, 2012, ch. 1 http://higheredbcs.wiley.com/legacy/college/montgomery/1118146921/supp_material/ch01.doc

[Murph2012] K. Murphy, Machine Learning, MIT, 2012 http://www.cs.ubc.ca/~murphyk/MLbook/

[Reshef2011] D. Reshef et al, Detecting Novel Associations in Large Data Sets, Science 334.6062 (2011): 1518-1524 http://www.sciencemag.org/cgi/rapidpdf/334/6062/1518?ijkey=cRCIlh2G7AjiA&keytype=ref&siteid=sci

[Reshef2011sup] D. Reshef et al, SOM [Reshef2011], Science 334.6062 (2011) http://www.sciencemag.org/content/334/6062/1502.full?ijkey=l9Qe0i/BE6ZOI&keytype=ref&siteid=sci

[Wich2013] G. Wicherski, Taming the ROPe on Sandy Bridge, SyScan, April 2013 http://www.syscan.org/index.php/download/get/3c6891f2e90e661ea23224cd8f419262/SyScan2013_DAY1_SPEAKER05_Georg_WIcherski_Taming_ROP_ON_SANDY_BRIDGE_syscan.zip

References III
[JC2013b] M. Jurczyk and G. Coldwind, Kernel double-fetch race condition exploitation on x86: further thoughts, blog, June 2013, http://j00ru.vexillium.org/?p=1880

[Snyder2008] L. Snyder, The whole box of tools: William Whewell and the logic of induction, Handbook of the History of Logic (British Logic in the Nineteenth Century), Ed.: D. Gabbay, Vol 4, 2008, pp. 163-228

[Lipson2009] M. Schmidt and H. Lipson, Distilling Free-Form Natural Laws from Experimental Data, Science 324:5923, 2009, pp. 81-85 http://ccsl.mae.cornell.edu/sites/default/files/Science09_Schmidt.pdf

[Gazis2010] P. Gazis et al, Viewpoints: A High-Performance High-Dimensional Exploratory Data Analysis Tool, Publications of the Astronomical Society of the Pacific, 122(898), 2010, pp. 1518-1525, http://www.giss.nasa.gov/staff/mway/Gazis_Levit_Way2010.pdf

[PMTK3] K. Murphy, M. Dunham et al, probabilistic modeling toolkit for Matlab/Octave, 2011, https://github.com/probml/pmtk3

[San1995] S. Sandford, Apples and oranges: a comparison, Annals of Improbable Research 1:3, 1995 http://www.improbable.com/airchives/paperair/volume1/v1i3/air-1-3-apples.html

[Alistarh2014] N. Shavit et al, Are Lock-Free Concurrent Algorithms Practically Wait-Free?, 2014, http://research.microsoft.com/pubs/209106/paper.pdf
