Context-Aware Fault Tolerance in Migratory Services Oriana Riva +, Josiane Nzouonta *, and Cristian Borcea * + ETH Zurich * New Jersey Institute of Technology.


1 Context-Aware Fault Tolerance in Migratory Services
Oriana Riva+, Josiane Nzouonta*, and Cristian Borcea*
+ ETH Zurich, * New Jersey Institute of Technology

2 Ad Hoc Networks as Data Carriers
Traditionally, ad hoc networks were used to
–Connect mobile systems (e.g., laptop, PDA) to the Internet
–Transfer files between mobile systems
[Figure labels: Internet; Read email, browse the web; File transfers]

3 Ad Hoc Networks as People-Centric Mobile Sensor Networks
Typical devices: smart phones and vehicular systems
Run distributed services
–Acquire, process, and disseminate real-time information from the proximity of regions, entities, or activities of interest
–Have context-aware execution
–Often interact for longer periods of time with clients
Example services: entity tracking, parking spot finder, traffic jam predictor

4 Problems with Traditional Client-Server Model in Ad Hoc Networks
When service stops satisfying context requirements, client must discover new service
–Overhead due to service discovery
–State of the old service is lost
–Not always possible to find new service

5 Migratory Services Model
[Figure: client C on node n1, a virtual service end-point, and the Migratory Service (MS) with its state shown on nodes n2 and n3]
Context change (e.g., n2 moves out of the region of interest): MS cannot accomplish its task on n2 any longer
Service migration: the Migratory Service moves together with its state
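The context-triggered migration on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `Node`, `Region`, and `MigratoryService` classes, the one-dimensional road model, and the `check_context` method are hypothetical names and are not taken from the authors' framework.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    x: float  # position along a one-dimensional road, in meters (assumed model)

@dataclass
class Region:
    start: float
    end: float

    def contains(self, node):
        return self.start <= node.x <= self.end

@dataclass
class MigratoryService:
    region: Region
    host: Node
    state: dict = field(default_factory=dict)  # execution state travels with the service

    def check_context(self, candidates):
        """Migrate if the host left the region; return True if a migration happened."""
        if self.region.contains(self.host):
            return False
        for node in candidates:
            if self.region.contains(node):
                self.host = node  # only the host changes, the state is kept
                return True
        raise RuntimeError("no node in the region can host the service")

# n2 has moved out of the region of interest, n3 is still inside it.
n2 = Node("n2", x=150.0)
n3 = Node("n3", x=40.0)
ms = MigratoryService(region=Region(0.0, 100.0), host=n2, state={"samples": 12})
migrated = ms.check_context([n2, n3])
```

The client is not involved in this step, which mirrors the virtual service end-point idea: the service changes host while its accumulated state stays available.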

6 One-to-One Mapping between Clients and Migratory Services
[Figure: clients C1 and C2, a meta-service that creates a Migratory Service for each client (MS1 and MS2, each with its own state), and nodes n1 to n5]

7 Node Architecture for Migratory Services

8 The Fault Tolerance Problem
How to provide service fault tolerance in mobile ad hoc networks?
Problem has not been tackled so far!
Communication, software, and hardware faults are the norm rather than the exception

9 Outline
Introduction
Migratory Services Overview
Context-Aware Fault Tolerance Mechanism
Prototype Implementation and Experimental Results
Simulation Results
Conclusions

10 Types of Faults
Response failure: the response is semantically incorrect
–Solved implicitly through context-aware service migration
Crash failure: the service fails to produce responses after several client requests
–The service is considered failed
–Requires fault tolerance mechanism

11 Primary Backup Approach
Secondary created when primary starts execution
–Primary checkpoints its state and updates secondary periodically
When primary fails, secondary takes over
[Figure: message sequence among Client, Primary Service, and Secondary Service with Request, Response, Update, and Ack messages]
On which node to place the secondary?
How to detect failures and recover?
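As a rough illustration of the primary-backup exchange, the sketch below has a primary push a copy of its state to a secondary, which acknowledges it and can later take over. Everything here is assumed for illustration: the class and method names are hypothetical, and checkpointing is driven by a request count rather than by the time-based period the slide implies.

```python
import copy

class Secondary:
    """Holds the latest checkpoint and can take over as primary."""
    def __init__(self):
        self.state = None
        self.active = False

    def update(self, snapshot):
        self.state = snapshot
        return "ack"

    def take_over(self):
        self.active = True
        return self.state

class Primary:
    def __init__(self, secondary, checkpoint_every):
        self.secondary = secondary
        self.checkpoint_every = checkpoint_every  # assumed: checkpoint every N requests
        self.state = {"handled": 0}

    def handle(self, request):
        self.state["handled"] += 1
        if self.state["handled"] % self.checkpoint_every == 0:
            # Send a deep copy so later changes do not leak into the checkpoint.
            self.secondary.update(copy.deepcopy(self.state))
        return "response to request %d" % request

secondary = Secondary()
primary = Primary(secondary, checkpoint_every=2)
for request in range(5):
    primary.handle(request)

# The primary is assumed to crash here; the secondary resumes from its last checkpoint.
recovered = secondary.take_over()
```

In this run the secondary resumes from the checkpoint taken after the fourth request, so the work done for the fifth request is lost. That gap is the trade-off controlled by the checkpointing frequency.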

12 Context-Aware Secondary Placement (1)
Depends on physical distance
–Secondary close to the client improves the recovery performance
–Secondary close to the primary reduces checkpointing overhead
Symbols used in the placement formula (the formula itself is not reproduced in this transcript):
–f = checkpointing frequency
–s = service state size
–d_PC = distance between primary and client
–α = tuning parameter
–min, max = values set by framework at primary

13 Context-Aware Secondary Placement (2)
Depends on relative mobility and node resources
Primary broadcasts secondary discovery message in the region
For each car in the region, a score is computed (the formula itself is not reproduced in this transcript) from:
–mob_i = mobility trace of node i
–c_j(p_i,j) = condition on resource p_j at node i
–β = tuning parameter
[Figure label: d_S]

14 Pull Recovery
Client detects failure after timeout
–Doesn't receive periodic responses
Client contacts secondary
Secondary migrates to suitable node
–Creates new secondary
–Starts sending responses
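The client side of pull recovery amounts to a silence detector. The sketch below is a hypothetical illustration, not the authors' code: `PullClient`, its `tick` method, and the `contact_secondary` callback are invented names, and timestamps are passed in explicitly so the behavior is deterministic.

```python
class PullClient:
    def __init__(self, timeout_s, contact_secondary):
        self.timeout_s = timeout_s
        self.contact_secondary = contact_secondary  # hypothetical callback
        self.last_response_at = 0.0
        self.recovering = False

    def on_response(self, now):
        """Record a periodic response from the service."""
        self.last_response_at = now
        self.recovering = False

    def tick(self, now):
        """Call periodically; triggers recovery once per detected failure."""
        if not self.recovering and now - self.last_response_at > self.timeout_s:
            self.recovering = True
            self.contact_secondary()
        return self.recovering

calls = []
client = PullClient(timeout_s=3.0, contact_secondary=lambda: calls.append("recover"))
client.on_response(now=1.0)
client.tick(now=3.5)  # 2.5 s of silence: still within the timeout
client.tick(now=4.5)  # 3.5 s of silence: failure detected, secondary contacted
client.tick(now=5.0)  # already recovering: no second request is sent
```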

15 Push Recovery
Secondary detects failure after timeout
–Doesn't receive state updates
–Migrates to suitable node
–Starts sending responses
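Push recovery moves the detection to the secondary, which watches for missing state updates instead of waiting for the client. A minimal sketch, assuming a hypothetical `PushSecondary` class and a `pick_node` callback that stands in for the choice of a suitable node:

```python
class PushSecondary:
    def __init__(self, timeout_s, pick_node):
        self.timeout_s = timeout_s
        self.pick_node = pick_node  # hypothetical placement callback
        self.snapshot = None
        self.last_update_at = 0.0
        self.role = "secondary"
        self.host = None

    def on_update(self, snapshot, now):
        """Store the checkpoint sent by the primary."""
        self.snapshot = snapshot
        self.last_update_at = now

    def tick(self, now):
        """Promote itself when state updates have stopped for too long."""
        if self.role == "secondary" and now - self.last_update_at > self.timeout_s:
            self.host = self.pick_node()  # migrate to a suitable node
            self.role = "primary"
        return self.role

    def respond(self):
        """Only a promoted secondary sends responses to the client."""
        if self.role != "primary":
            return None
        return {"from": self.host, "state": self.snapshot}

sec = PushSecondary(timeout_s=2.0, pick_node=lambda: "n4")
sec.on_update({"handled": 4}, now=1.0)
role_before = sec.tick(now=2.5)  # 1.5 s without updates: still secondary
role_after = sec.tick(now=3.5)   # 2.5 s without updates: takes over
response = sec.respond()
```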

16 Transient Connectivity Failures
Table columns: Client-Primary, Client-Secondary, Primary-Secondary, Action. Each row below gives its X marks (affected links) followed by the action.
XX: Pull/Push recovery. Primary receives cancel message from client/new primary if it reappears.
X: Same as above. Secondary cancels the primary when it takes over.
XX: Client discovers new service and cancels the old primary if it reappears (old primary cancels old secondary).
XXX: Same as above.
XX: Primary creates new secondary and cancels the old one if it reappears.
X: Same as above.
X: Nothing.
Timeouts adapted dynamically to cope better with transient disconnections
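The slide does not say how timeouts are adapted. One common way to do it, borrowed here from TCP's retransmission timer rather than from the authors' mechanism, is to track a smoothed mean and deviation of the gaps between messages and set the timeout a few deviations above the mean. The class name, the gain values, and the factor `k` below are all assumptions.

```python
class AdaptiveTimeout:
    def __init__(self, initial_s, gain_mean=0.125, gain_dev=0.25, k=4.0):
        self.mean = initial_s      # smoothed gap between messages
        self.dev = initial_s / 2   # smoothed deviation of the gap
        self.gain_mean = gain_mean
        self.gain_dev = gain_dev
        self.k = k

    def observe(self, gap_s):
        """Fold one observed inter-message gap into the estimates."""
        self.dev = (1 - self.gain_dev) * self.dev + self.gain_dev * abs(self.mean - gap_s)
        self.mean = (1 - self.gain_mean) * self.mean + self.gain_mean * gap_s

    def timeout(self):
        return self.mean + self.k * self.dev

timer = AdaptiveTimeout(initial_s=1.0)
initial = timer.timeout()
for _ in range(20):
    timer.observe(1.0)   # steady responses tighten the timeout
steady = timer.timeout()
timer.observe(5.0)       # one transient disconnection widens it again
after_gap = timer.timeout()
```

With steady traffic the timeout shrinks toward the observed gap, so real crashes are detected quickly. A single long gap inflates the deviation term, which makes the next transient disconnection less likely to be mistaken for a crash.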

17 Outline
Introduction
Migratory Services Overview
Context-Aware Fault Tolerance Mechanism
Prototype Implementation and Experimental Results
Simulation Results
Conclusions

18 Implementation
Implemented in Java (J2ME) with CDC
–Used portable SM platform: migration state captured using bytecode instrumentation
Nokia 9500 smart phones
–Symbian 7.0, 150 MHz ARM processor, 76 MB RAM
Experiments in ad hoc network with 3 smart phones

19 Checkpointing and Recovery Latency
Checkpointing latency
Recovery latency
–Failover: 256 msec
–Secondary Service discovery (1 hop): 1850 msec
–Backup SM migration, execution resumption, acknowledgment

20 Outline
Introduction
Migratory Services Overview
Context-Aware Fault Tolerance Mechanism
Prototype Implementation and Experimental Results
Simulation Results
Conclusions

21 Simulation Setup
NS-2 simulator, WiFi communication
300 nodes in 6000 m × 1000 m urban grid: 10% standing, 20% walking, 10% running, 10% cycling, 50% driving
Services
–36 meta-services (do not fail)
–Services monitor a certain region located at a fixed distance from client (e.g., traffic jam predictor for drivers)
Routing
–Service discovery: geographical forwarding to reach region, broadcast to find node in region
–Client/service communication: geographical forwarding
Comparison as a function of secondary placement
–Adaptive: placed using our context-aware method
–Client: placed at the client node
–Neighbor: placed at a random neighbor of primary

22 Checkpointing Overhead
No failures
Expected: Adaptive outperforms Client
Unexpected: Adaptive better than Neighbor
–Due to (mis)matched relative mobility
Adaptive also scales better with number of clients

23 Recovery Ratio and Latency
Worst case scenario: all primaries switched off at once
Expected: Adaptive outperforms Neighbor
Unexpected: Adaptive better than Client
–Recovery requires the secondary to migrate into the desired region, so the service could be lost or take longer to recover over longer paths

24 Conclusions
Migratory Services enable distributed services for people-centric mobile sensing in ad hoc networks
Context-aware fault tolerance mechanism helps survive crash failures
–To the best of our knowledge, first service fault tolerance mechanism for mobile ad hoc networks
–Prototype implementation on smart phones demonstrated feasibility
–Simulation results showed good recovery performance and low overhead

25 Thank you!
http://www.cs.njit.edu/~borcea/
This work is sponsored in part by NSF grants CNS-0520033, CNS-0454081, IIS-0534520, and IIS-0714158

26 Backup slides

27 TJam: Migratory Service Example
Predicts traffic jams in real-time
–The request specifies region of interest
–Service migrates to ensure it stays in this region
–Uses history (service execution state) to improve prediction
TJam utilizes information that every car has:
–Number of one-hop neighboring cars
–Speed of one-hop neighboring cars
Example request: "Inform me when there is a high probability of traffic jam 10 miles ahead"

28 TJam Pseudo-Code
Client:
    monitoredCtx = {location, speed}
    inCtxRule = {, rejectResponse && sendUpdate}
    request = {clientName, region}
    send(TJam, request);
    while (NOT_DONE)
        response = receive(msName)
MigratoryService:
    monitoredCtx = {location, speed, region}
    outCtxRule = {, migrateService}
    while (NOT_DONE)
        tjam_p = computeTJamProbability();
        if (tjam_p>MAX_PROB)
            send(clientName, tjam_p)
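The service loop in the pseudo-code can be made concrete with a small runnable sketch. The slides do not define `computeTJamProbability`, so the heuristic below (many slow one-hop neighbors suggest a jam) is purely an assumption, as are the threshold value, the `free_flow_kmh` and `dense_count` parameters, and the `service_step` helper.

```python
MAX_PROB = 0.7  # hypothetical notification threshold

def compute_tjam_probability(neighbor_speeds_kmh, free_flow_kmh=100.0, dense_count=20):
    """Hypothetical heuristic: many slow one-hop neighbors suggest a jam."""
    if not neighbor_speeds_kmh:
        return 0.0
    # Density: fraction of an assumed "dense" neighbor count, capped at 1.
    density = min(len(neighbor_speeds_kmh) / dense_count, 1.0)
    mean_speed = sum(neighbor_speeds_kmh) / len(neighbor_speeds_kmh)
    # Slowness: how far the mean speed is below the assumed free-flow speed.
    slowness = 1.0 - min(mean_speed / free_flow_kmh, 1.0)
    return density * slowness

def service_step(neighbor_speeds_kmh, send):
    """One iteration of the service loop: notify the client only above the threshold."""
    tjam_p = compute_tjam_probability(neighbor_speeds_kmh)
    if tjam_p > MAX_PROB:
        send(tjam_p)
    return tjam_p

sent = []
free_road = service_step([95.0] * 5, sent.append)   # few fast neighbors: no message
jammed = service_step([10.0] * 25, sent.append)     # many slow neighbors: client notified
```

Only the second call crosses the threshold, so the client receives a single notification.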

29 Memory Consumption
Java on Nokia 9500 has 12.7 MB of RAM
Client: 68.827 KB (0.54%)
[Chart values whose labels did not survive: 0.89%, 0.68%, 2.57%, 2.38%]

