Presentation is loading. Please wait.

Presentation is loading. Please wait.

Man Cao Minjia Zhang Aritra Sengupta Michael D. Bond

Similar presentations


Presentation on theme: "Man Cao Minjia Zhang Aritra Sengupta Michael D. Bond"— Presentation transcript:

1 Man Cao Minjia Zhang Aritra Sengupta Michael D. Bond
Drinking from Both Glasses: Combining Pessimistic and Optimistic Tracking of Cross-Thread Dependences Optimistic tracking *means* something different from optimistic locking in the previous two talks. Man Cao Minjia Zhang Aritra Sengupta Michael D. Bond PPoPP 2016

2 Dynamic Analyses for Parallel Programs
Error detection Data Race Detector Atomicity Violation Detector Programming model Transactional Memory Enforcement of Strong Memory Model Debugging Record & Replay Deterministic Execution Software only

3 Dynamic Analyses for Parallel Programs
Bad performance! Error detection Data Race Detector Atomicity Violation Detector Programming model Transactional Memory Enforcement of Strong Memory Model Debugging Record & Replay Deterministic Execution Software only Bad performance in commodity systems; custom hardware can have good performance but is unrealistic

4 Dynamic Analyses for Parallel Programs
Bad performance! Error detection Data Race Detector Atomicity Violation Detector Programming model Transactional Memory Enforcement of Strong Memory Model Debugging Record & Replay Deterministic Execution Software only Bad performance in commodity systems; custom hardware can have good performance but is unrealistic Difficulties?

5 Cross-thread dependences
o.f = … … = o.f T1 T2 o.f is a shared memory location Crucial for dynamic analyses and systems

6 Cross-thread dependences
Tracking cross-thread dependences o.f = … … = o.f T1 T2 o.f is a shared memory location Crucial for dynamic analyses and systems

7 Cross-thread dependences
Tracking cross-thread dependences Detecting Controlling o.f = … … = o.f T1 T2 Crucial for dynamic analyses and systems To help understand the problem, suppose we are trying to build a dependence recorder.

8 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation Motivates combining PT and OT to build hybrid tracking Evaluation shows better performance from HT

9 Pessimistic Tracking Per-object metadata: o.state
last writer/reader thread Per-variable vs per-object

10 Pessimistic Tracking Per-object metadata: o.state
last writer/reader thread At each object access: rd/wr o.f

11 Analysis-specific work
Pessimistic Tracking Per-object metadata: o.state last writer/reader thread At each object access: Check o.state Blue box: Instrumentation Analysis-specific work rd/wr o.f Update o.state

12 Analysis-specific work
Pessimistic Tracking Per-object metadata: o.state last writer/reader thread At each object access: Check o.state atomicity is the key performance issue. Analysis-specific work Atomic rd/wr o.f Update o.state

13 Analysis-specific work
Pessimistic Tracking Per-object metadata: o.state last writer/reader thread At each object access: Lock o.state Check o.state Create small critical section around each access Used by most existing work Pessimistically assumes accesses are involved in cross-thread dependences Analysis-specific work rd/wr o.f Update o.state Unlock o.state

14 Analysis-specific work
Pessimistic Tracking Per-object metadata: o.state last writer/reader thread At each object access: Lock o.state Check o.state Special locked state bit pattern Analysis-specific work NOT program locks rd/wr o.f Update o.state Unlock o.state

15 Pessimistic Tracking Lock o.state … wr o.f Unlock o.state Lock o.state
By using lock and unlock, analyses can soundly track cross-thread deps rd o.f Unlock o.state

16 Synchronization and write metadata
Pessimistic Tracking Synchronization and write metadata Lock o.state wr o.f Unlock o.state Lock o.state Remote cache misses Serialize concurrent read rd o.f Unlock o.state

17 Performance of Pessimistic Tracking Alone
340% Describe experiment methodology later Low overhead for some programs

18 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation

19 Optimistic Tracking Biased reader-writer lock for o.state
Most accesses are non-conflicting and has no state change.

20 Optimistic Tracking Biased reader-writer lock for o.state
Avoid synchronization for non-conflicting accesses Most accesses are non-conflicting and has no state change.

21 Optimistic Tracking Biased reader-writer lock for o.state
Avoid synchronization for non-conflicting accesses Heavyweight coordination for conflicting accesses Define conflicting access. Most accesses are non-conflicting and has no state change – A worthwhile tradeoff. Introduce coordination – inter-thread communication/handshake protocol

22 Optimistic Tracking T1 T2 write check wr o.f o.state: WrExT1
Suppose we are trying to build a dependence recorder.

23 Optimistic Tracking T1 T2 write check wr o.f read check
o.state: WrExT1 read check o.state: WrExT1 Suppose we are trying to build a dependence recorder.

24 Analysis-specific work
Optimistic Tracking T1 T2 read check Analysis-specific work wr o.f write check o.state: WrExT1 o.state: WrExT1 Request Suppose we are trying to build a dependence recorder.

25 Analysis-specific work
Optimistic Tracking T1 T2 safe point read check Analysis-specific work change state wr o.f write check rd o.f o.state: WrExT1 o.state: WrExT1 Request Suppose we are trying to build a dependence recorder. Response o.state ← RdExT2

26 Performance of Optimistic Tracking Alone

27 Performance of Optimistic Tracking Alone
0.5% conflicting access, >100% overhead For most, < 0.1% accesses conflicting.

28 Cost of Different Tracking
Pessimistic Optimistic Same state Coordination 150 47 9200 PT’s cost is similar for both conflicting and non-conflicting access In CPU cycles Averaged across all programs

29 Optimistic tracking performs best if there are few conflicting accesses.

30 Pessimistic tracking is cheaper for conflicting accesses.

31 Drink from both glasses?
Goal Optimistic tracking for most non-conflicting accesses Pessimistic tracking for most conflicting accesses Sweet spot between PT and OT.

32 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation

33 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation Hybrid State Model Adaptive Policy Soundly and efficiently combines PT and OT. Online profiling to decide which tracking should be used for each object.

34 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation Challenging! Hybrid State Model Adaptive Policy Challenging to build sound and efficient model that is suitable to build other analyses.

35 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation Hybrid State Model Deferred Unlocking Adaptive Policy

36 Pessimistic-Optimistic Mismatch
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f

37 Pessimistic-Optimistic Mismatch (#1)
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f Each access could use pess or opt tracking Seems need conditional unlock OT seems always locked, transfer access privilege on demand No unlock!

38 Pessimistic-Optimistic Mismatch (#1)
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f Each access could use pess or opt tracking Seems need conditional unlock OT seems always locked, transfer access privilege on demand Conditional unlock? No unlock!

39 Pessimistic-Optimistic Mismatch (#2)
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f Atomic Atomic Atomicity could be interrupted at points that are statically unpredictable For T2, atomicity till next safe point

40 Pessimistic-Optimistic Mismatch (#2)
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f Atomic Atomic Atomicity granularity *does* affect analyses, but just not going to explain it on this slide. Suppose dep recorder, need to maintain T.counter for mistic tracking, killing the benefit of optimistic tracking Atomicity granularity may affect specific analysis

41 Pessimistic-Optimistic Mismatch (#2)
Pessimistic Tracking Optimistic Tracking Lock o.state rd/wr o.f Unlock o.state safe point read check change state wr o.f write check rd o.f Atomic Atomic Unable to improve OT’s performance with conditional unlock. Instrumentation overhead. Significant modification to specific analysis, which adds more overhead and kills the performance benefit from hybridization. Atomicity granularity may affect specific analysis Conditional unlock

42 Key Insights Coarsening atomicity granularity for pessimistic tracking

43 Key Insights Coarsening atomicity granularity for pessimistic tracking
Program synchronization may hint at cross-thread dependences

44 Addressing Pessimistic-Optimistic Mismatch
Defer unlocking of pessimistic state Till program synchronization release operation (PSRO) Lock release, thread fork

45 Addressing Pessimistic-Optimistic Mismatch
Defer unlocking of pessimistic state Till program synchronization release operation (PSRO) Reader-writer locking Lock release, thread fork

46 Addressing Pessimistic-Optimistic Mismatch
Defer unlocking of pessimistic state Till program synchronization release operation (PSRO) Reader-writer locking Fall back to coordination on contention when locking a state Lock release, thread fork

47 Deferred Unlocking Example 1
synchronized (m) { } Lock o.state wr o.f Unlock all states synchronized (m) { Lock o.state rd o.f

48 Deferred Unlocking Example 2
synchronized (m) { } Lock o.state wr o.f Lock o.state safe point Avoids deadlock and ensure responsiveness With deferred unlocking, we build hybrid state model. Unlock all states rd o.f

49 Hybrid State Model Pessimistic locked and unlocked, optimistic
Deferred unlocking Transition between pessimistic and optimistic Introduce adaptive policy

50 Hybrid State Model Pessimistic locked and unlocked, optimistic
Deferred unlocking Transition between pessimistic and optimistic Introduce adaptive policy

51 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation Hybrid State Model Deferred Unlocking Adaptive Policy Motivates combining PT and OT to build hybrid tracking

52 Adaptive Policy Decide when to transition between pessimistic and optimistic states Formalize the problem, approximate the model. Aggregate profiling, limit study shows per-object profiling gets nearly all possible benefits

53 Adaptive Policy Decide when to transition between pessimistic and optimistic states Cost—benefit model Formalize the problem, approximate the model. Aggregate profiling, limit study shows per-object profiling gets nearly all possible benefits

54 Adaptive Policy Decide when to transition between pessimistic and optimistic states Cost—benefit model Boil down to counting state transitions Formalize the problem, approximate the model. Aggregate profiling, limit study shows per-object profiling gets nearly all possible benefits

55 Adaptive Policy Decide when to transition between pessimistic and optimistic states Cost—benefit model Boil down to counting state transitions Online profiling Per-object Simple yet effective Formalize the problem, approximate the model. Aggregate profiling, limit study shows per-object profiling gets nearly all possible benefits. More advanced adaptive policy is orthogonal.

56 Application of Hybrid Tracking
Two dynamic analyses Hybrid dependence recorder and replayer (detect) Hybrid region serializability (RS) enforcer (control) Only recorder uses hybrid tracking

57 Application of Hybrid Tracking
Two dynamic analyses Hybrid dependence recorder and replayer (detect) Hybrid region serializability (RS) enforcer (control) Deferred unlocking helps overcome key challenges! Due to mismatch

58 Outline Dynamic Analyses and Cross-thread Dependences
Pessimistic Tracking Optimistic Tracking Our approach Hybrid Tracking Evaluation

59 Implementation Jikes RVM 3.1.3

60 Implementation Jikes RVM 3.1.3
Pessimistic tracking, optimistic tracking [Octet, Bond et al. OOPSLA’13] Optimistic recorder and replayer [Replay, Bond et al. PPPJ’15] Optimistic RS enforcer [EnfoRSer, Sengupta et al, ASPLOS’15]

61 Implementation Jikes RVM 3.1.3
Pessimistic tracking, optimistic tracking [Octet, Bond et al. OOPSLA’13] Optimistic recorder and replayer [Replay, Bond et al. PPPJ’15] Optimistic RS enforcer [EnfoRSer, Sengupta et al, ASPLOS’15] Hybrid tracking, hybrid recorder and replayer, hybrid RS enforcer publicly available DaCapo Benchmarks 2006 & 2009 SPEC JBB 2000 & 2005 32 cores (Intel Xeon E5 4620)

62 Performance of Tracking
28->22% DaCapo Benchmarks 2006 & 2009 SPEC JBB 2000 & 2005 32 cores (Intel Xeon E5 4620) 22%

63 Performance of Tracking
Adds low-overhead on top of optimistic tracking for low-conflict programs

64 Performance of Tracking
Most programs are low-conflict 22%

65 Performance of Recorders and RS enforcers
46->41% Reduction is less significant because still need to recorder similar amount of cross-thread dependences Recorders RS enforcers

66 Additional Materials Please check the paper
Stress Tests Complete State Transition Table Run-time Characteristics

67 Related work Analyses that use pessimistic tracking
[FastTrack, Flanagan & Freund, PLDI’09] [Velodrome, Flanagan et al., PLDI’08] [Chimera, Lee et al., PLDI’12] [Lightweight Transactions, Harris & Fraser, OOPSLA’03] [DMP, Devietti et al., ASPLOS’09] Analyses that use optimistic tracking [Shasta, Scales et al. ASPLOS’96] [Object Race Detection, von Praun & Gross, OOPSLA’01] [DoubleChecker, Biswas et al. PLDI’14] [LarkTM, Zhang et al, PPoPP’15] Adaptive Mechanisms [Adaptive Locks, Usui et al. PACT’09] [Strong Atomicity TM, Abadi et al. PPoPP’09] [Adaptive Lock Elision, Dice et al, SPAA’14] [Concurrency Control, Ziv et al, PLDI’15] Existing adaptive approaches do not address the problems with dep tracking.

68 Contributions Hybrid tracking combines pessimistic tracking and optimistic tracking effectively and efficiently Hybrid tracking achieves better overall performance Never significantly degrades performance Sometimes improves performance substantially Suitable for workload of diverse communication patterns


Download ppt "Man Cao Minjia Zhang Aritra Sengupta Michael D. Bond"

Similar presentations


Ads by Google