Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Two-Phase Rendering
// Display thread
while (true) {
  if (phase) {
    frame[0].display();
  } else {
    frame[1].display();
  }
  phase = !phase;
}
// Preparation thread
while (true) {
  if (phase) {
    frame[1].prepare();
  } else {
    frame[0].prepare();
  }
  phase = !phase;
}
Even and odd phases alternate: while one frame is displayed, the other is prepared, and the two frames swap roles in the next phase.
Synchronization Problems
How do threads stay in phase?
Too early? "We render no frame before its time."
Too late? Memory is recycled before the frame is displayed.
Uh, oh
Barrier Synchronization
[Figure: three threads, all in phase 0, arrive at the barrier; once every thread has arrived, all three proceed to phase 1.]
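One way to keep the display and preparation threads of the two-phase rendering loop in phase is to have both wait at a barrier at the end of every phase. The sketch below is an illustration only. It assumes java.util.concurrent.CyclicBarrier from the Java standard library rather than the barrier classes built in these slides, and the class TwoPhaseRendering, its worker and run methods, and the string log that stands in for real display() and prepare() calls are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

class TwoPhaseRendering {
    // One worker loop; `first` is the frame this worker touches in phase 0.
    static void worker(String role, int first, int phases,
                       CyclicBarrier barrier, List<String> log) {
        try {
            for (int p = 0; p < phases; p++) {
                int frame = (first + p) % 2;   // the frames swap roles every phase
                log.add(p + ":" + role + ":frame" + frame);
                barrier.await();               // nobody starts phase p+1 early
            }
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs both threads for `phases` phases and returns the log in order.
    static List<String> run(int phases) {
        List<String> log = new CopyOnWriteArrayList<>();
        CyclicBarrier barrier = new CyclicBarrier(2);
        Thread display = new Thread(() -> worker("display", 0, phases, barrier, log));
        Thread prepare = new Thread(() -> worker("prepare", 1, phases, barrier, log));
        display.start();
        prepare.start();
        try {
            display.join();
            prepare.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        run(4).forEach(System.out::println);
    }
}
```

Because each thread logs its work before calling await(), both entries of phase p always precede every entry of phase p+1, whichever thread runs faster.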
Why Do We Care?
Mostly of interest to scientific & numeric computation
Elsewhere: garbage collection
Less common in systems programming
Still an important topic
Duality
Dual to mutual exclusion
Include others, not exclude them
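Parallel Prefix
[Figure: before, the array holds a, b, c, d; after, it holds a, a+b, a+b+c, a+b+c+d. One round of adding the neighbor at distance 1 gives a, a+b, b+c, c+d; a second round at distance 2 gives a, a+b, a+b+c, a+b+c+d.]
N threads can compute the parallel prefix of N entries in log₂ N rounds.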
Prefix
class Prefix extends Thread {
  int[] a;     // array of input values
  int i;       // thread index
  Barrier b;   // shared barrier
  // Initialize fields
  Prefix(int[] a, Barrier b, int i) {
    this.a = a;
    this.b = b;
    this.i = i;
  }
  // ...
}
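The run() method of Prefix is not reproduced here. As a rough sketch of how N threads might compute the prefix sums in about log₂ N rounds, the following assumes a standard-library CyclicBarrier in place of the slides' Barrier class. The names PrefixSketch and prefixSums are made up for this example, and the two await() calls per round are one plausible way to separate the reads from the writes.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class PrefixSketch extends Thread {
    private final int[] a;          // shared array, updated in place
    private final CyclicBarrier b;  // shared barrier, one party per thread
    private final int i;            // this thread's index

    PrefixSketch(int[] a, CyclicBarrier b, int i) {
        this.a = a;
        this.b = b;
        this.i = i;
    }

    @Override
    public void run() {
        try {
            int sum = 0;
            for (int d = 1; d < a.length; d *= 2) {
                if (i >= d) sum = a[i - d];  // read the entry d positions to the left
                b.await();                   // everyone has read before anyone writes
                if (i >= d) a[i] += sum;
                b.await();                   // everyone has written before the next round
            }
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: computes prefix sums with one thread per entry.
    static int[] prefixSums(int[] input) {
        int[] a = input.clone();
        if (a.length == 0) return a;         // CyclicBarrier needs at least one party
        CyclicBarrier barrier = new CyclicBarrier(a.length);
        Thread[] threads = new Thread[a.length];
        for (int i = 0; i < a.length; i++) {
            threads[i] = new PrefixSketch(a, barrier, i);
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a;
    }

    public static void main(String[] args) {
        // prints [1, 3, 6, 10]
        System.out.println(Arrays.toString(prefixSums(new int[] {1, 2, 3, 4})));
    }
}
```

Without the first await(), a fast thread could overwrite a[i] before its right-hand neighbor had read the old value, which is exactly the kind of phase mix-up a barrier is meant to prevent.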
Barrier Implementations
Cache coherence: spin on locally-cached locations? Spin on statically-defined locations?
Latency: how many steps?
Symmetry: do all threads do the same thing?
Barriers
public class Barrier {
  AtomicInteger count;   // number of threads not yet arrived
  int size;              // number of threads participating
  // Initialization
  public Barrier(int n) {
    count = new AtomicInteger(n);
    size = n;
  }
  // Principal method
  public void await() {
    if (count.getAndDecrement() == 1) {
      count.set(size);              // if I'm last, reset fields for next time
    } else {
      while (count.get() != 0) {}   // otherwise, wait for everyone else
    }
  }
}
What's wrong with this protocol?
Reuse
Barrier b = new Barrier(n);
while (mumble()) {
  work();      // do work
  b.await();   // synchronize
}              // repeat
Barriers
A thread in the else branch spins in while (count.get() != 0), waiting for phase 1 to finish.
The last thread arrives: phase 1 is so over. It calls count.set(size) to prepare for phase 2.
Meanwhile the waiting thread is asleep (ZZZZZ) and never sees count reach 0.
Uh-Oh
One thread is now waiting for phase 2 to finish while another is still waiting for phase 1 to finish.
Basic Problem
One thread wraps around to start phase 2 while another thread is still waiting for phase 1
One solution: always use two barriers
Sense-Reversing Barriers
public class Barrier {
  AtomicInteger count;
  int size;
  volatile boolean sense = false;   // completed odd or even-numbered phase?
  // Store sense for next phase; each thread starts with the opposite of sense
  ThreadLocal<Boolean> threadSense = ThreadLocal.withInitial(() -> true);
  public void await() {
    boolean mySense = threadSense.get();   // get new sense determined by last phase
    if (count.getAndDecrement() == 1) {
      count.set(size);
      sense = mySense;                     // if I'm last, reverse sense
    } else {
      while (sense != mySense) {}          // otherwise, wait for sense to flip
    }
    threadSense.set(!mySense);             // prepare sense for next phase
  }
}
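To see the sense-reversing idea survive reuse, here is a small self-contained sketch that runs the barrier over many consecutive phases. It follows the await() logic of the slide. The constructor, the volatile modifier on sense, the initial thread-local value, the Thread.yield() in the spin loop, and the SenseBarrierDemo harness are assumptions added to make the example runnable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class SenseBarrier {
    private final AtomicInteger count;       // threads not yet arrived
    private final int size;                  // threads participating
    private volatile boolean sense = false;  // flips once per completed phase
    private final ThreadLocal<Boolean> threadSense =
            ThreadLocal.withInitial(() -> true);   // opposite of the initial sense

    SenseBarrier(int n) {
        count = new AtomicInteger(n);
        size = n;
    }

    void await() {
        boolean mySense = threadSense.get();
        if (count.getAndDecrement() == 1) {
            count.set(size);                 // reset before releasing anyone
            sense = mySense;                 // last arrival flips the sense
        } else {
            while (sense != mySense) {
                Thread.yield();              // spin until the sense flips
            }
        }
        threadSense.set(!mySense);
    }
}

class SenseBarrierDemo {
    // Returns how often a thread left the barrier before all n had arrived.
    static int violations(int n, int rounds) {
        SenseBarrier barrier = new SenseBarrier(n);
        AtomicIntegerArray arrived = new AtomicIntegerArray(rounds);
        AtomicInteger bad = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int t = 0; t < n; t++) {
            threads[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    arrived.incrementAndGet(r);
                    barrier.await();
                    if (arrived.get(r) != n) bad.incrementAndGet();
                }
            });
            threads[t].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return bad.get();
    }

    public static void main(String[] args) {
        System.out.println("violations: " + violations(4, 1000));
    }
}
```

A waiting thread compares sense with its own mySense rather than watching count, so resetting count for the next phase cannot be mistaken for, or hide, the end of the current one.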
[Figure: 2-barriers arranged in a tree.]
Cache behavior
Local spinning on bus-based architecture
Not so good for NUMA
Remarks
Everyone spins on sense field
Local spinning on bus-based (good)
Network hot-spot on distributed architecture (bad)
At level i: if the i-th bit of id is 0, move up; otherwise keep back
[Figure: tournament tree; in each pair, one thread is the winner and the other is the loser.]
Tournament Barrier
class TBarrier {
  volatile boolean flag;   // set by the partner to notify this thread
  TBarrier partner;        // the other thread at the same level
  TBarrier parent;         // non-null for a winner, null for a loser
  boolean top;             // am I the root?
}
Tournament Barrier
void await(boolean mySense) {      // mySense: current sense
  if (top) {                       // le root, c'est moi
    return;
  } else if (parent != null) {     // I am already a winner
    while (flag != mySense) {}     // wait for partner
    parent.await(mySense);         // synchronize upstairs
    partner.flag = mySense;        // inform partner
  } else {                         // natural-born loser
    partner.flag = mySense;        // tell partner I'm here
    while (flag != mySense) {}     // wait for notification from partner
  }
}
Order is important (why?)
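The slides show the node class and await(), but not how the tree is wired. The sketch below is one possible construction, assuming the number of threads is a power of two and that the even-indexed node of each pair is the winner. The names TNode and TournamentSketch, the build and violations helpers, the volatile flag, and the Thread.yield() calls are additions for this example, not part of the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class TNode {
    volatile boolean flag;  // written only by this node's partner
    TNode partner;          // the other node of the pair
    TNode parent;           // winner's node one level up; null for a loser
    boolean top;            // true only for the root

    void await(boolean mySense) {
        if (top) {
            return;
        } else if (parent != null) {                  // winner
            while (flag != mySense) Thread.yield();   // wait for partner
            parent.await(mySense);                    // synchronize upstairs
            partner.flag = mySense;                   // release partner
        } else {                                      // loser
            partner.flag = mySense;                   // announce arrival
            while (flag != mySense) Thread.yield();   // wait for release
        }
    }
}

class TournamentSketch {
    // Builds the tree for n threads (n a power of two) and returns the leaves.
    static TNode[] build(int n) {
        TNode[] leaves = new TNode[n];
        for (int i = 0; i < n; i++) leaves[i] = new TNode();
        TNode[] level = leaves;
        while (level.length > 1) {
            TNode[] next = new TNode[level.length / 2];
            for (int j = 0; j < next.length; j++) {
                TNode winner = level[2 * j], loser = level[2 * j + 1];
                winner.partner = loser;
                loser.partner = winner;
                next[j] = new TNode();
                winner.parent = next[j];              // the winner moves up
            }
            level = next;
        }
        level[0].top = true;
        return leaves;
    }

    // Returns how often a thread left the barrier before all n had arrived.
    static int violations(int n, int rounds) {
        TNode[] leaves = build(n);
        AtomicIntegerArray arrived = new AtomicIntegerArray(rounds);
        AtomicInteger bad = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int t = 0; t < n; t++) {
            TNode leaf = leaves[t];
            threads[t] = new Thread(() -> {
                boolean sense = true;                 // flags start out false
                for (int r = 0; r < rounds; r++) {
                    arrived.incrementAndGet(r);
                    leaf.await(sense);
                    if (arrived.get(r) != n) bad.incrementAndGet();
                    sense = !sense;                   // reverse sense for reuse
                }
            });
            threads[t].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return bad.get();
    }

    public static void main(String[] args) {
        System.out.println("violations: " + violations(8, 200));
    }
}
```

Each flag has exactly one writer (the node's partner) and one reader, so no read-modify-write operation is needed anywhere in the barrier.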
Remarks
No need for read-modify-write calls
Each thread spins on fixed location
Good for bus-based architectures
Good for NUMA architectures
Dissemination Barrier
At round i, thread A notifies thread A + 2^i (mod n)
[Figure: in successive rounds, each thread notifies the thread at distance +1, +2, +4.]
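The notification pattern can be checked with a short simulation. This is only a sketch of the communication schedule, not a working barrier: it assumes the usual ⌈log₂ n⌉ rounds, and the class DisseminationSketch and its methods are hypothetical names. It tracks, for every thread, the set of threads it has heard from directly or indirectly.

```java
import java.util.BitSet;

class DisseminationSketch {
    // The thread that thread a notifies in round i: a + 2^i (mod n).
    static int target(int a, int i, int n) {
        return (a + (1 << i)) % n;
    }

    // Smallest r with 2^r >= n, the assumed number of rounds.
    static int rounds(int n) {
        int r = 0;
        while ((1 << r) < n) r++;
        return r;
    }

    // Simulates all rounds; true if every thread ends up having heard,
    // directly or transitively, from every thread.
    static boolean everyoneHeardFromEveryone(int n) {
        BitSet[] heard = new BitSet[n];
        for (int a = 0; a < n; a++) {
            heard[a] = new BitSet(n);
            heard[a].set(a);                       // a thread knows about itself
        }
        for (int i = 0; i < rounds(n); i++) {
            BitSet[] next = new BitSet[n];
            for (int a = 0; a < n; a++) next[a] = (BitSet) heard[a].clone();
            for (int a = 0; a < n; a++) {
                // the notification carries everything a knew before this round
                next[target(a, i, n)].or(heard[a]);
            }
            heard = next;
        }
        for (BitSet h : heard) {
            if (h.cardinality() != n) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int n = 8;
        for (int i = 0; i < rounds(n); i++) {
            System.out.println("round " + i + ": thread 0 notifies thread " + target(0, i, n));
        }
        System.out.println("complete: " + everyoneHeardFromEveryone(n));
    }
}
```

After round i each thread has heard from the 2^(i+1) threads immediately behind it (including itself), which is why the distances double from one round to the next.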
Remarks
Elegant
Good source of homework problems
Not cache-friendly
Ideas So Far
Sense-reversing: reuse without reinitializing
Combining tree
Tournament tree
Dissemination barrier
[Figure: animation of a tree barrier with per-node counters; the counters fall from "2 1" to "0 1", the threads are told "yes!", and the counters return to "2 1".]
Remarks
Very little cache traffic
Minimal space overhead
On message-passing architecture: send notification & sense down tree
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
* See Chapter 3 in the textbook and http://www.cs.tau.ac.il/~shanir/progress.pdf
Concurrent Programming
Many real-world data structures have both blocking (lock-based) and non-blocking (no locks) implementations
For example: linked lists, queues, stacks, hash maps, ...
Concurrent Programming
Many data structures combine blocking & non-blocking methods
Java concurrency package: skiplists, hash tables, exchangers on 10 million desktops.
Progress Conditions
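Deadlock-free: Some thread eventually acquires lock.
Starvation-free: Every thread eventually acquires lock.
Lock-free: Some method call eventually returns.
Wait-free: Every method call eventually returns.
Obstruction-free: Every method call returns if it executes in isolation.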
List-Based Sets
Unordered collection of elements
No duplicates
Methods:
Add() a new element
Remove() an element
Contains() if element is present
[Figure: linked list with nodes b and c.]
Add(), Remove(), Contains() lock destination nodes in order
Deadlock-free: some thread trying to acquire the locks eventually succeeds.
Obstruction-free contains()
[Figure: linked list with nodes a, b, c, d.]
Snapshot: if all nodes traversed twice are the same
Obstruction-free: the method returns if it executes in isolation for long enough.
[Figure: Collect2, a second collect, is compared with the first.]
Otherwise, try again
Obstruction-freedom
In the simple snapshot algorithm:
The update method is wait-free
But scan is obstruction-free: it completes if it executes in isolation (no concurrent updates).
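The difference between the two methods is easy to see in code. The following is a minimal sketch of a double-collect snapshot, assuming one writer per slot and a per-slot stamp that grows with every update. The class SimpleSnapshotSketch and its Stamped helper are illustrative names, not the slides' code.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceArray;

class SimpleSnapshotSketch {
    // A value together with a stamp that grows on every update of its slot.
    static final class Stamped {
        final long stamp;
        final int value;
        Stamped(long stamp, int value) { this.stamp = stamp; this.value = value; }
    }

    private final AtomicReferenceArray<Stamped> table;

    SimpleSnapshotSketch(int slots) {
        table = new AtomicReferenceArray<>(slots);
        for (int i = 0; i < slots; i++) table.set(i, new Stamped(0, 0));
    }

    // Wait-free: a fixed number of steps, whatever other threads do.
    // Assumes slot `me` is written by one thread only.
    void update(int me, int value) {
        Stamped old = table.get(me);
        table.set(me, new Stamped(old.stamp + 1, value));
    }

    private Stamped[] collect() {
        Stamped[] copy = new Stamped[table.length()];
        for (int i = 0; i < copy.length; i++) copy[i] = table.get(i);
        return copy;
    }

    // Obstruction-free: returns once two consecutive collects agree,
    // which is guaranteed only if updates stop for long enough.
    int[] scan() {
        Stamped[] oldCopy = collect();
        while (true) {
            Stamped[] newCopy = collect();
            boolean clean = true;
            for (int i = 0; i < oldCopy.length; i++) {
                if (oldCopy[i].stamp != newCopy[i].stamp) { clean = false; break; }
            }
            if (clean) {
                int[] values = new int[newCopy.length];
                for (int i = 0; i < values.length; i++) values[i] = newCopy[i].value;
                return values;
            }
            oldCopy = newCopy;   // otherwise, try again
        }
    }

    public static void main(String[] args) {
        SimpleSnapshotSketch s = new SimpleSnapshotSketch(3);
        s.update(0, 5);
        s.update(2, 7);
        System.out.println(Arrays.toString(s.scan()));   // prints [5, 0, 7]
    }
}
```

update() never loops, while scan() can be forced to retry forever by a steady stream of updates, so it makes progress only when it runs in isolation.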
Wait-free contains()
[Figure: list a 0, b 0, c 1, d 0, e 0; each node carries a mark bit, and c is marked.]
Use mark bit + list ordering
1. Not marked: in the set
2. Marked or missing: not in the set
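The two rules translate into a traversal that takes no locks and never retries. The sketch below assumes a sorted list of int keys with sentinel nodes at both ends. LazyListSketch, addSequential and markSequential are hypothetical, and the two helper methods are single-threaded stand-ins used only to build an example list; the locking Add() and Remove() of the slides are not shown.

```java
class LazyListSketch {
    static final class Node {
        final int key;
        volatile boolean marked;   // mark bit: true once logically removed
        volatile Node next;
        Node(int key) { this.key = key; }
    }

    // Sentinels; real keys must lie strictly between the two sentinel keys.
    private final Node head = new Node(Integer.MIN_VALUE);

    LazyListSketch() {
        head.next = new Node(Integer.MAX_VALUE);
    }

    // One pass over the sorted list, no locks, no retries.
    boolean contains(int key) {
        Node curr = head;
        while (curr.key < key) {
            curr = curr.next;
        }
        // Rule 1: present and not marked means in the set.
        // Rule 2: marked or missing means not in the set.
        return curr.key == key && !curr.marked;
    }

    // Single-threaded helper for the example only (no locking).
    void addSequential(int key) {
        Node pred = head, curr = head.next;
        while (curr.key < key) { pred = curr; curr = curr.next; }
        if (curr.key == key) return;
        Node node = new Node(key);
        node.next = curr;
        pred.next = node;
    }

    // Single-threaded helper: logical removal only, the node stays linked.
    void markSequential(int key) {
        Node curr = head;
        while (curr.key < key) curr = curr.next;
        if (curr.key == key) curr.marked = true;
    }

    public static void main(String[] args) {
        LazyListSketch list = new LazyListSketch();
        list.addSequential(1);
        list.addSequential(2);
        list.addSequential(3);
        list.markSequential(2);
        System.out.println(list.contains(1) + " " + list.contains(2));   // prints true false
    }
}
```

Because the list is ordered, the traversal can stop at the first key that is not smaller than the target, and the mark bit answers the remaining question without any locking.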
[Figure: list nodes a and b, each with mark bit 0.]
Combine blocking and non-blocking: deadlock-free Add() and Remove() and wait-free Contains()
Lock-free Algorithm
Logical Removal = Set Mark Bit
[Figure: list a 0, b 0, c 0, d 0, e 0; the mark bit of c changes from 0 to 1.]
More Formally
Standard notion of abstract object
Progress conditions relate to method calls of an object
A thread is active if it takes an infinite number of concrete (machine level) steps, and is suspended if not.
Maximal progress
Every call eventually completes. Individuals matter.
Flags courtesy of www.theodora.com/flags used with permission
                            Non-blocking   Blocking
Everyone makes progress     Wait-free      Starvation-free
Someone makes progress      Lock-free      Deadlock-free
Fair Scheduling
A history is fair if each thread takes an infinite number of steps
A method implementation is deadlock-free if it guarantees minimal progress (some call eventually completes) in every fair history.
Starvation Freedom
A method implementation is starvation-free if it guarantees maximal progress in every fair history.
Dependent Progress
Dependent progress conditions do not guarantee minimal progress in every history
[Table: the progress conditions classified as blocking or non-blocking and as independent or dependent.]
Why Lock-Free is OK
We all want maximal progress: wait-free
Yet we often write lock-free or lock-based algorithms
OK if we expect the scheduler to be benevolent
Often true (not always!)
Shared-Memory Computability
[Figure: shared memory holding the bits 10011.]
What is (and is not) concurrently computable
Wait-free Atomic Registers
Lock-free/Wait-free Hierarchy and Universal Constructions
Why use non-blocking lock-free and wait-free conditions when most code uses locks?
The Answer
Not about being non-blocking
About being independent!
Do not rely on the good behavior of the scheduler.
By Analogy to Church-Turing
[Figure: a Turing machine; a finite state controller reads and writes an infinite tape. Label: "abanbnan".]
Using a dependent condition is like relying on an oracle to recognize languages
The dependency masks the true power of the concurrent object
Shared-Memory Computability
[Figure: shared memory holding the bits 10011.]
Independent progress: use Lock-free and Wait-free
Memory Hierarchy and Universal Constructions
Summary
Table explains how conditions fit together
Justifies common assumptions
We expect maximal progress.
Progress conditions define scheduler requirements.