Published by Jasmine O'Connell. Modified over 2 years ago.

1
Evaluating Provider Reliability in Risk-aware Grid Brokering
Iain Gourlay

2
Outline
- AssessGrid background
- Problem Statement
- Basic Reliability
- Analysis of behaviour
- Stationarity Problem
- Weighted Reliability
- Simulations and Results
- What if a provider is unreliable?
- Alternative: Bayesian Inference
- Summary and Conclusions

3
AssessGrid Background
AssessGrid addresses risk management in the Grid. This is a necessity in the drive towards commercialisation of Grid technology.
- The goal is to move beyond best-effort service, using SLAs to specify an agreed-upon level of service.
However:
- For resource providers, offering an SLA with service guarantees and penalties is a business risk!
- For end-users, agreeing to an SLA is a business risk!
A large part of AssessGrid is concerned with giving providers tools and methods to:
- Monitor and collect useful data.
- Assess the risk associated with accepting an SLA request, based on this data.

4
What is risk?
Risk is "hazard, danger, exposure to mischance or peril" (Oxford English Dictionary).
Risk management is a discipline that addresses the possibility that future events may cause adverse effects.
- Economics, Operations Research, Engineering, Gambling, …
In risk management, risk is quantified with two parameters:
Risk = Probability of Occurrence × Impact
In Grid computing, the event is SLA failure!
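As a minimal illustration of the product rule above, the sketch below treats the impact as a monetary penalty attached to an SLA. The function name and the example figures are hypothetical and do not come from the slides.

```python
def risk(probability_of_failure: float, impact: float) -> float:
    """Return risk as probability of occurrence multiplied by impact."""
    if not 0.0 <= probability_of_failure <= 1.0:
        raise ValueError("probability_of_failure must lie in [0, 1]")
    return probability_of_failure * impact


# Hypothetical SLA: 5% chance of failure, penalty of 200 currency units.
example_risk = risk(0.05, 200.0)
```

Read this way, the risk of an SLA is the expected penalty the provider takes on by accepting it.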

5
Scenario

6
Role of the Broker
Key role: finding and negotiating with providers on behalf of end-users.
The broker can also act as an independent party:
- Providers may have motivation to lie!
- Providers may have unidentified problems in their infrastructure.
Here, we assume the broker is independent and honest.
- The broker can give a second opinion on risk assessments.
- The broker can agree its own SLAs (virtual provider).

7
Problem statement: What do we mean by reliability?
A provider makes an SLA offer:
- It includes an estimate of the Probability of Failure (PoF).
Each time an offer is accepted, the details are stored in a database, including:
- Final status (Success/Fail)
- Offered PoF
The problem is: given a provider's past data, can their risk assessments be considered reliable?

8
What is reliable?
Considering only systematic errors!
Assume s SLAs in the database for the same provider.
- Offered PoFs: [symbols not preserved]
- Assume number of fails ~ [distribution not preserved]
We define a reliable provider as one that does not systematically underestimate or overestimate the PoF, so that: [equation not preserved]

9
Is it normal?

10
Is it normal? (2)

11
Basic Reliability: Identifying Systematic Errors
Using the provider's offered PoFs: [equation not preserved]
The evaluation is based on the following measure: [equation not preserved]
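The measure itself is not preserved in this transcript. The sketch below shows one way a broker might test for systematic error; it is an assumption, not the author's measure. It supposes that each SLA fails independently with its offered PoF, so that the failure count has mean Σp_i and variance Σp_i(1 − p_i), and it reports the standardised difference between observed and expected failures. The function name is hypothetical.

```python
import math
from typing import Sequence


def standardised_failure_excess(offered_pofs: Sequence[float],
                                failed: Sequence[bool]) -> float:
    """Standardised gap between observed failures and those implied by the offers.

    ASSUMPTION: independent failures, each with probability equal to the
    offered PoF. This is an illustrative stand-in for the slide's measure.
    """
    if len(offered_pofs) != len(failed) or not offered_pofs:
        raise ValueError("need one final status per offered PoF")
    expected = sum(offered_pofs)
    variance = sum(p * (1.0 - p) for p in offered_pofs)
    if variance == 0.0:
        raise ValueError("offered PoFs of exactly 0 or 1 give zero variance")
    observed = sum(1 for status in failed if status)
    return (observed - expected) / math.sqrt(variance)
```

Under these assumptions a value near zero is consistent with a reliable provider, a large positive value suggests the provider underestimates its PoF, and a large negative value suggests it overestimates.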

12
Basic Reliability: Identifying Systematic Errors (2)

13
Basic Reliability: Identifying Systematic Errors (3)
We note that [equation not preserved] and recall the condition [equation not preserved], leading to [equation not preserved].

14
Analysis: How does the measure behave?
Simple example:
- m SLAs in the database.
- The offered PoF is constant, p.
- There is a systematic overestimation/underestimation of the PoF, such that: [equation not preserved]

15
Analysis (2)

16
Stationarity Problem
Conditions are not static!
- Example: a bag holds 60 red balls and 40 blue balls.
- You try to estimate the number of red balls by taking a ball out and replacing it, repeating this 50 times.
- Someone is secretly removing a red ball and replacing it with a blue one after every sample.
- E(red) ≈ 17.5 red draws out of 50
- Number of reds actually left = 10!
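The figures in the ball example can be checked with a short calculation. The sketch below assumes independent draws with replacement and a bag of fixed size in which one red ball becomes blue at each step; the function name and the `swap_before_draw` flag are illustrative. Drawing first and swapping afterwards gives exactly 17.75 expected red draws, while swapping first gives 17.25, so the quoted 17.5 sits midway between the two readings.

```python
from fractions import Fraction
from typing import Tuple


def expected_red_draws(reds: int, total: int, draws: int,
                       swap_before_draw: bool = False) -> Tuple[Fraction, int]:
    """Return (expected number of red draws, reds left in the bag at the end).

    One red ball is swapped for a blue ball at every step, either just
    before or just after the draw, depending on swap_before_draw.
    """
    expected = Fraction(0)
    for _ in range(draws):
        if swap_before_draw and reds > 0:
            reds -= 1
        expected += Fraction(reds, total)  # chance that this draw is red
        if not swap_before_draw and reds > 0:
            reds -= 1
    return expected, reds


expected, remaining = expected_red_draws(60, 100, 50)
```

Either way only 10 red balls remain at the end, whereas about 17.5 red draws out of 50 would suggest roughly 35 reds in a bag of 100. The sample describes the bag's past, not its present.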

17
Stationarity Problem (2)
A provider's behaviour could change as a consequence of a variety of factors, e.g.:
- A provider's infrastructure is updated.
- A provider's risk assessment methodology or model parameterisation may change.
- A provider's policy may change, for example due to economic considerations.

18
Weighted Reliability
Use a weighted average, ensuring more recent SLAs have a larger influence.
A total of mk SLAs is split into k categories of m SLAs each, with the k-th consisting of the most recent SLAs.
[equation not preserved]
Here, R_i is the basic measure R over the i-th category.
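The weights used on the slide are not preserved. As a sketch of the idea only, the function below combines per-category measures with weights that grow linearly from the oldest category to the newest; the linear weighting and the function name are assumptions.

```python
from typing import Optional, Sequence


def weighted_reliability(category_measures: Sequence[float],
                         weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of per-category measures, ordered oldest first.

    ASSUMPTION: if no weights are given, category i (1-based) gets weight i,
    so the most recent category has the largest influence.
    """
    k = len(category_measures)
    if k == 0:
        raise ValueError("need at least one category")
    if weights is None:
        weights = [float(i + 1) for i in range(k)]
    if len(weights) != k or sum(weights) <= 0:
        raise ValueError("need one weight per category and a positive total")
    weighted_sum = sum(w * r for w, r in zip(weights, category_measures))
    return weighted_sum / sum(weights)
```

With three categories, a deviation of 3.0 in the newest category yields 1.5, while the same deviation in the oldest category yields only 0.5, which is the recency effect the slide asks for.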

19
Simulations
A database of SLAs is generated:
- Each SLA object has an offered PoF, a true PoF and a final status.
The reliability is computed.
The process is repeated [count not preserved] times for each scenario.
Simple case considered here:
- The offered PoF is fixed and the true PoF is fixed.
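The simple case can be sketched as follows. The class and function names, the seed handling and the database size are illustrative assumptions; the slides only state that each SLA carries an offered PoF, a true PoF and a final status.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SLA:
    offered_pof: float  # PoF quoted by the provider
    true_pof: float     # PoF actually governing the outcome
    failed: bool        # final status: True means the SLA failed


def generate_database(n: int, offered_pof: float, true_pof: float,
                      rng: random.Random) -> List[SLA]:
    """Generate n SLAs with fixed offered and true PoFs.

    Each final status is drawn independently using the true PoF.
    """
    return [SLA(offered_pof, true_pof, rng.random() < true_pof)
            for _ in range(n)]


# Hypothetical scenario: the provider quotes 10% but really fails 20% of the time.
database = generate_database(500, offered_pof=0.10, true_pof=0.20,
                             rng=random.Random(42))
observed_failure_rate = sum(sla.failed for sla in database) / len(database)
```

Repeating the generation with different seeds and computing the reliability measure each time gives the distribution of the measure for a scenario.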

20
Results

21
Results (2)

22
Results (3)

23
Results (4)

24
Results (5)

25
What if the provider is unreliable?
Discrete approximation: when an SLA offer is received with an offered PoF of p, estimate the PoF by looking at the failure rate for all SLAs with an offered PoF of ~p. Then:
If (|reliability measure| < threshold)
    Believe provider.
Else
    PoF estimate = numFails(PoF~p) / numSLAs(PoF~p)
Use all SLAs with an offered PoF within x% of the offered PoF in the current SLA.
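The decision rule above can be sketched in Python. The function name, the representation of history as (offered PoF, failed) pairs, the reading of "within x%" as a relative tolerance, and the `None` result when no comparable SLAs exist are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def broker_pof_estimate(offered_pof: float,
                        history: Sequence[Tuple[float, bool]],
                        reliability_measure: float,
                        threshold: float,
                        tolerance: float) -> Optional[float]:
    """Return the PoF the broker should use for a new SLA offer.

    history holds (offered PoF, failed) pairs from the provider's past SLAs.
    tolerance is the relative width x, e.g. 0.2 for "within 20%".
    """
    if abs(reliability_measure) < threshold:
        return offered_pof  # believe the provider
    low = offered_pof * (1.0 - tolerance)
    high = offered_pof * (1.0 + tolerance)
    similar = [failed for pof, failed in history if low <= pof <= high]
    if not similar:
        return None  # no comparable SLAs; the slides do not cover this case
    return sum(similar) / len(similar)  # observed failure rate
```

Widening the tolerance gives more SLAs to average over but mixes in offers that are less like the current one.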

26
Weighted Average risk assessment
Split km SLAs into k categories.
Compute the estimated PoF for each category, i = 0, …, k-1.

27
Never Trust Doctors
- You are tested for a disease, which 2% of the population has.
- The test never gives a false negative.
- If you are clear, there is still a 5% chance of a false positive.
- You test positive. What is the probability you have the disease?
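The question can be answered with Bayes' theorem. In the working below, D denotes having the disease and + a positive test; the notation is introduced here and is not from the slides.

```latex
\begin{aligned}
P(D \mid +) &= \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \\
            &= \frac{1 \times 0.02}{1 \times 0.02 + 0.05 \times 0.98}
             = \frac{0.02}{0.069} \approx 0.29
\end{aligned}
```

Even with a test that never misses the disease, a positive result leaves only about a 29% chance of having it, because healthy people greatly outnumber sick ones.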

28
Alternative Approach: Bayesian Inference
The provider offers a linguistic risk assessment, e.g. the failure probability is:
- extremely low: <1%
- very low: 1-5%
- low: 5-10%
- medium: 10-20%
- high: 20-30%
- very high: 30-50%
- extremely high: >50%
If the broker/end-user requests the exact PoF value, this can be provided.
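A sketch of the mapping from a numeric PoF to these categories follows. The slides leave the shared boundaries ambiguous (5% appears in two ranges), so assigning each shared boundary to the higher category, while keeping exactly 50% in "very high", is an assumption, as is the function name.

```python
# Upper bounds (exclusive) for all but the last two categories.
_UPPER_BOUNDS = [
    (0.01, "extremely low"),
    (0.05, "very low"),
    (0.10, "low"),
    (0.20, "medium"),
    (0.30, "high"),
]


def linguistic_category(pof: float) -> str:
    """Map a probability of failure in [0, 1] to its linguistic category."""
    if not 0.0 <= pof <= 1.0:
        raise ValueError("pof must lie in [0, 1]")
    for upper, label in _UPPER_BOUNDS:
        if pof < upper:
            return label
    # "extremely high" is strictly above 50%, so exactly 50% stays "very high".
    return "very high" if pof <= 0.50 else "extremely high"
```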

29
Alternative Approach: Bayesian Inference (2)
The broker does not consider the provider's reliability directly. Instead it takes the following approach:
- Having received a linguistic risk assessment for a new SLA, the broker first computes a prior distribution for the PoF, given the linguistic category, by considering data across all other providers.
- The broker computes a posterior distribution, based on the failure rate observed in past SLAs from the same provider with the same linguistic risk assessment.
- The broker returns an object which contains: (PoF_broker, confidence)
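The slides do not state which distributions are used or how the confidence is defined. One common way to realise the prior-to-posterior step, shown here purely as an assumed sketch, is a Beta prior updated with the provider's observed failures and successes. The posterior mean stands in for PoF_broker and the posterior standard deviation for the uncertainty behind the confidence value; the function and parameter names are hypothetical.

```python
import math
from typing import Tuple


def posterior_pof(prior_alpha: float, prior_beta: float,
                  failures: int, successes: int) -> Tuple[float, float]:
    """Update a Beta(prior_alpha, prior_beta) prior with observed outcomes.

    ASSUMPTION: Beta-binomial model; the prior parameters would be fitted to
    data from other providers in the same linguistic category.
    Returns (posterior mean, posterior standard deviation).
    """
    if prior_alpha <= 0 or prior_beta <= 0 or failures < 0 or successes < 0:
        raise ValueError("prior parameters must be positive, counts non-negative")
    alpha = prior_alpha + failures
    beta = prior_beta + successes
    total = alpha + beta
    mean = alpha / total
    std = math.sqrt(alpha * beta / (total ** 2 * (total + 1.0)))
    return mean, std
```

As more SLAs from the same provider accumulate, the posterior narrows and the provider's own record outweighs the prior drawn from other providers.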

30
Alternative Approach: Bayesian Inference (3)

31
Summary/Conclusions
A detailed analysis has been carried out for a method to identify providers who are systematically unreliable.
The stationarity problem has been addressed.
- Weighted Average
- Results indicate good performance relative to the basic measure and a moving average.
This can be extended to other measures for non-systematic errors.
A Bayesian approach has been considered and is also promising.
