
1 Utku ÖZBEK 2006703363

2 Outline
Introduction
Group Testing For Reliability
Application of this Reliability Model
  Weather services application
  Results of the application
ASTRAR Group Testing
Application of this Group Testing Model
  A real-time stock-buy-sell web service application
  Results of the application
Conclusion
References
Question & Answer

3 Introduction
Software development is shifting from the product-oriented paradigm to the service-oriented paradigm.
Web Services (WS): services that are offered through Web and Internet technology.
Examples: a tax return service, a stock ranking service, and an equation-solving service can each be offered by many service providers, based on the same theories but different implementations.
This raises a trustworthiness and dependability problem.

4 Introduction
Web Services (WS):
Under SOA and WS, a system consists of a collection of loosely coupled services.
These services can make use of each other's services to achieve their own desired goals and end results.
Simple services can cooperate in this way to form a complex or composite service, dynamically and at runtime.

5 Introduction
History of WS testing:
In phase one, WS are essentially tested like ordinary software.
In phase two (2003-2005), the following are included in testing:
  the publishing, finding, and binding capabilities of an SOA (Service-Oriented Architecture)
  the asynchronous capabilities of WS
  the SOAP (Simple Object Access Protocol) intermediary capability
  the quality of services
In phase three (2004 and beyond), the following are included in testing: dynamic runtime capabilities, WS versioning, and WS orchestration testing, which invokes remote WS in a specific order to test their interoperability.

6 Introduction
History of WS testing:
Both clients and service providers must be involved in WS testing.
Issues that must be addressed during WS development include:
  Security
  Interoperability
  UDDI (Universal Description, Discovery, and Integration) registration
  Performance considerations

7 Introduction
This presentation proposes a Service-Oriented software Reliability Model (SORM).
The model evaluates the reliability of WS in two steps:
  1. Use highly efficient group testing to evaluate the reliability of atomic services.
  2. Evaluate the reliability of a composite service based on the reliabilities of its component services.
It also presents a technique to test a large number of WS simultaneously, to determine the oracle and the correctness of the WS under test by majority voting, and to provide quality rankings of WS and test cases.

8 Group Testing For Reliability
WebStrar:
  Web Services Testing
  Reliability Assessment
  Ranking services
  Directory services

9 Group Testing For Reliability
WebStrar can take registrations from service providers and various kinds of service brokers.
It considers the registered services as atomic services and uses them to compose composite services.
An atomic service is a service agent submitted by a service provider that does not call other WS; it is therefore treated as a unit that is not to be broken apart, like an atom.
A composite service is a service agent submitted by a service provider that uses (calls) other WS.
Both atomic and composite services can be provided by WebStrar directly to clients.

10 Group Testing For Reliability
The group testing technique was originally developed for testing large samples of blood.
Here it is used to test complex composite WS at runtime.
It tests the contamination of an entire group of services by applying one test.

11 Group Testing For Reliability
Assume CSn is a composite service consisting of n services S1, S2, ..., Sn, where each Si can be an atomic service.
Assume services S11, S12, ..., S1m are functionally equivalent to the service S1 in CSn.
We can forward (broadcast) the input intended for S1 to S11, S12, ..., S1m.
The results from all services, including that from S1, are voted on by a voting service.
The voting is weighted based on the current reliabilities of the services under test: the voting service can set the initial weight of each incoming service to zero, while the existing service S1's weight is its reliability R(S1).
The voting service detects faults by comparing the output of each service with the weighted-majority output. A disagreement indicates a fault.
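
As an illustration, here is a minimal sketch of weighted majority voting over functionally equivalent services. The function name and tie-breaking details are assumptions for illustration; the slides do not spell out the exact voting algorithm.

```python
from collections import defaultdict

def weighted_majority_vote(outputs, weights):
    """Pick the output value with the largest total weight.

    outputs: dict mapping service id -> output value
    weights: dict mapping service id -> current reliability (weight)
    Returns (majority_output, disagreeing_service_ids).
    """
    # Accumulate the weight behind each distinct output value.
    score = defaultdict(float)
    for sid, out in outputs.items():
        score[out] += weights.get(sid, 0.0)

    # The weighted majority is the value with the highest total weight.
    majority = max(score, key=score.get)

    # Any service whose output disagrees with the majority is flagged as faulty.
    disagreeing = [sid for sid, out in outputs.items() if out != majority]
    return majority, disagreeing

# Example: S1 is the existing service; S11..S13 are incoming candidates
# whose initial weights are zero, as the slide describes.
outputs = {"S1": 42, "S11": 42, "S12": 41, "S13": 42}
weights = {"S1": 0.95, "S11": 0.0, "S12": 0.0, "S13": 0.0}
print(weighted_majority_vote(outputs, weights))  # (42, ['S12'])
```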

12 Group Testing For Reliability
The reliability of the services is calculated using the formula:

R(S, t + ∆t) = (M · R(S, t) + (k − f)) / (M + k)

Where:
  R(S, t) is the reliability of service S at time point t
  in the next ∆t time, k runs are executed and f disagreements have been detected
  M is the total number of tests that the service has been through so far
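
A small sketch of this update rule, under the assumption that the formula above (a running success ratio reconstructed from the definitions on this slide) is the intended one:

```python
def update_reliability(r_prev: float, M: int, k: int, f: int) -> float:
    """Update the reliability of a service after a new test window.

    r_prev: reliability R(S, t) before the window
    M: total number of tests the service has been through so far
    k: number of runs executed in the window ∆t
    f: number of disagreements (failures) detected in the window
    """
    # Weighted average of history and the new window: M * r_prev past
    # successes (in expectation) plus (k - f) new successes, out of M + k runs.
    return (M * r_prev + (k - f)) / (M + k)

# Example: a service at reliability 0.90 over 100 runs, then 10 new runs
# with 1 disagreement, moves to (100*0.90 + 9) / 110 = 0.9.
print(update_reliability(0.90, M=100, k=10, f=1))  # 0.9
```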

13 Group Testing For Reliability
The advantages of the model include:
  One of the toughest problems in software testing is constructing an oracle that can determine whether a fault has occurred. In this model, the voting service serves as the oracle, according to the majority principle.
  The model estimates the reliability of each incoming service while performing normal operation. In other words, the incoming services are tested in the real operational environment at no extra time, if sufficient computing power is available.
  The model is dynamic, i.e., the data are collected and computed at runtime in real time. The reliability of each service involved in group testing is updated after each run or after a given period of time.

14 One situation in which SORM would not work well is when there are no alternative services available.
In this case, the SOA is essentially degraded to the traditional software architecture: the service is tested only by the service provider during its development cycle.
However, this is an unlikely situation, because SOA is an open platform that allows and encourages cooperation and competition among service providers to create increasingly improved services.

15 Application of Reliability Models
Examples to illustrate the application of the proposed service-oriented reliability model:
Assume a space agency plans to launch a satellite on a specific date and from a specific location.
The launch depends heavily on the weather conditions at the launch location, including rain, wind, and temperature.
They designed 10 independent weather services, each of which offers three component services: RainForecast, TempForecast, and WindForecast.
The forecasts are given as probabilities.

16 Evaluation of Component Services
To build trust in the reliability of the component services, the space agency puts them in a group testing framework and sets their initial reliability to zero.
After a period of group testing, the space agency has a reliability estimate for each service. Table 2 shows a set of sample results obtained in their experiments.

17 Evaluation of Component Services
The first column of the table lists the component services under test.
The second column shows the highest reliability of the service in the given test period.
Column 3 shows the forecast probabilities of heavy rain, extreme temperature, and strong wind, respectively, from the component services.
Column 4 shows the adjusted forecast probabilities, obtained by taking the reliability of the service into account; these are the final evaluation values for the component services.
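
The slides do not spell out the adjustment. A plausible reading is to weight each forecast probability by the reliability of the service that produced it; a minimal sketch under that assumption:

```python
def adjusted_forecast(forecast_prob: float, reliability: float) -> float:
    """Weight a forecast probability by the reliability of the service
    that produced it. This multiplicative rule is an assumption; the
    slides only say the reliability is 'taken into account'."""
    return forecast_prob * reliability

# Example: a 20% heavy-rain forecast from a service with reliability 0.95.
print(adjusted_forecast(0.20, 0.95))  # 0.19
```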

18 Evaluation of Composite Services
To base the decision on whether to change the launch date on the most accurate weather forecasting information, the space agency then constructed a composite service, as shown in Figure 3.
The decision is based on these two factors:
  The numbers in the diamond boxes are the reliabilities of the best component services.
  The numbers on the branches are the probabilities forecast by the best services.
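
As a simple illustration of how component reliabilities can be combined, here is a sketch that treats the composite as a serial composition, so its reliability is the product of the component reliabilities. This combination rule is an assumption for illustration; the exact formula behind Figure 3 is not shown in the slides.

```python
from math import prod

def serial_composite_reliability(component_reliabilities):
    """Reliability of a composite that needs every component to succeed.

    Assumes independent components used in series: the composite works
    only if all components work, so reliabilities multiply.
    """
    return prod(component_reliabilities)

# Example: the best RainForecast, TempForecast, and WindForecast services,
# using the high factor levels from the DOE slide below.
print(serial_composite_reliability([0.90, 0.99, 0.95]))  # ≈ 0.846
```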

19 Evaluation of Composite Services

20 Assume that the launch plan is made a year before the launch date. The composite service is up and running from day one.
At the beginning, the space agency has little data about the reliability of each service, and a weather forecast made a year before the launch date won't be accurate either.
However, by the time of the launch, say a month or a week before it, there are already sufficient data about the reliability of the services.
These reliability data will be used in future applications too: when the agency plans its next launch, or another event that needs weather forecasts, it already has the reliability data.

21 Results of the application
Design Of Experiments (DOE) is an engineering technique that can be used to determine the extent of the impact of the parameters (factors) of a model on the final results.
The study applies DOE to analyze the impact of the reliability of the component services on the reliability of the composite service.
There are three factors in the example, the reliabilities of:
  RainForecast
  TempForecast
  WindForecast
They use a 2-level DOE technique, i.e., high and low values of each factor: RainForecast (70%, 90%), TempForecast (90%, 99%), and WindForecast (85%, 95%). The 3-factor, 2-level design generated an ANOVA (ANalysis Of VAriance) table; a sketch of the design follows.
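
A minimal sketch of enumerating the 2-level, 3-factor full-factorial design (the 8 runs an ANOVA would be fit to). The composite-reliability response shown here is the assumed serial product from the earlier sketch, not the actual response used in the experiments:

```python
from itertools import product

# Low and high levels for each factor (from the slide).
factors = {
    "RainForecast": (0.70, 0.90),
    "TempForecast": (0.90, 0.99),
    "WindForecast": (0.85, 0.95),
}

# Enumerate all 2^3 = 8 factor-level combinations of the full-factorial design.
names = list(factors)
for levels in product(*(factors[n] for n in names)):
    # Assumed response: serial composite reliability (product of the levels).
    response = levels[0] * levels[1] * levels[2]
    run = ", ".join(f"{n}={v:.2f}" for n, v in zip(names, levels))
    print(f"{run} -> composite ≈ {response:.3f}")
```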

22 Results of the application
The F-Value represents the significance of the impact of a model and its components.
In general, if a component generates a significance value "Prob > F-Value" of less than 0.05, the impact of the component is significant.

23 Results of the application
The experiment results in Table 3 also show that the significance values (Prob > F-Value) of RainForecast, TempForecast, and WindForecast are all less than 0.0001, and thus they are all significant model components.

24 Results of the application
The higher the component reliability, the higher the overall reliability.
The impact of the RainForecast service is much more significant than that of the others.
The space agency should therefore pay more attention to the quality of the rain-forecast service provider.

25 Results of the application
The evaluation process is dynamic and happens at runtime.
The vast number of WS available online makes it necessary to perform group testing, which, in turn, makes it possible to identify the correct service output without having to design an oracle.

26 ASTRAR Group Testing
ASTRAR is a technique to test a large number of WS simultaneously, to determine the oracle and the correctness of the WS under test by majority voting, and to provide quality rankings of WS and test cases.
It can be used by WS providers, brokers, and clients.
A WS provider or client can use the technique to find the best WS for composing new services or applications. For example, a WS provider can compose a digital imaging service using a Fast Fourier Transformation service as a component service.
A WS broker can use the technique to evaluate the quality of WS applying for registration, to make sure that only WS with reasonable quality are offered to the public.

27 ASTRAR Group Testing
These techniques are used to rank different WS implementations based on the same specification, the same business logic, and the same input and internal states.
In other words, the WS under group testing should produce the same or close results if the same inputs are applied; e.g., various Fast Fourier Transformation WS should produce the same or close results for the same input.

28 ASTRAR Group Testing
The technique proposed here has the following advantages:
  It can test a large number of WS rapidly and rank them according to the test results.
  It can automatically create the oracle of test cases, i.e., the expected outputs for the given inputs.
  It can rank the effectiveness of test cases and thus apply the most effective test cases first, to eliminate unacceptable WS quickly.
  Most of the steps in the process can be completely automated, which makes the process attractive for commercial applications.

29 ASTRAR Group Testing
A group testing technique, originally developed for testing a large number of blood samples and later used for software regression testing, is an attractive solution here:
The Service-Oriented Architecture (SOA) based WS broker allows WS developers and providers to freely register WS and to compose complex WS from other WS dynamically.
As a result, for each WS specification, many alternative implementations may be available.

30 ASTRAR Group Testing
ASTRAR can test a large number of WS at both the unit and integration levels. At each level, the testing process has two phases:
  Phase 1: Training Phase
  Phase 2: Volume Testing Phase

31 Phase 1: Training Phase
The process assumes that a reasonably large number of test inputs or test cases are available to test the concerned WS before the start of this phase. (A sketch of steps 1-4 follows this list.)
1) Select a subset of WS randomly from the set of all WS to be tested. The size of the subset is decided experimentally.
2) Group testing: Apply each test case in the given set of test cases to test all the WS in the selected subset.
3) Voting: For each test input, the outputs from the WS under test are voted on by a stochastic voting mechanism based on majority and deviation voting principles.
4) Failure detection and reliability computation: Compare the majority output with each individual output. A disagreement indicates a component failure. A dynamic reliability model is used to compute the reliability of each WS based on the failure rate and other factors.
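
A minimal sketch of one training-phase round over a random subset. The subset size and the simple unweighted vote are illustrative assumptions (the slides call for a stochastic, deviation-aware vote):

```python
import random
from collections import Counter

def training_round(services, test_cases, subset_size, failures, runs):
    """One training-phase pass: sample WS, group-test, vote, count failures.

    services: dict mapping service id -> callable implementing the WS
    test_cases: list of test inputs
    failures/runs: Counters accumulating per-service failure and run counts
    Returns the per-test-case oracle built from the majority outputs.
    """
    subset = random.sample(list(services), subset_size)   # step 1
    oracle = {}
    for tc in test_cases:                                 # step 2: group test
        outputs = {sid: services[sid](tc) for sid in subset}
        majority = Counter(outputs.values()).most_common(1)[0][0]  # step 3
        oracle[tc] = majority
        for sid, out in outputs.items():                  # step 4
            runs[sid] += 1
            if out != majority:
                failures[sid] += 1
    return oracle

# Toy example: three "WS" implementing doubling, one of them buggy.
services = {"A": lambda x: 2 * x, "B": lambda x: 2 * x, "C": lambda x: 2 * x + 1}
failures, runs = Counter(), Counter()
oracle = training_round(services, [1, 2, 3], 3, failures, runs)
print(oracle)    # {1: 2, 2: 4, 3: 6}
print(failures)  # Counter({'C': 3})
```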

32 Phase 1: Training Phase
5) Oracle establishment: If a clear majority output is found, that output is used to form the oracle of the test case that generated it. A confidence level is defined based on the extent of the majority. The confidence level will also be dynamically adjusted in Phase 2.
6) Test case ranking: Test cases are ranked according to their fault detection capacity, which is proportional to the number of failures they detect. In Phase 2, the higher-ranked test cases will be applied first, to eliminate the WS that fail the test. (A sketch of steps 5 and 6 follows this list.)
7) WS ranking: The stochastic voting mechanism not only finds a majority output, but also ranks the WS under group testing according to their average deviation from the majority output.
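
A sketch of oracle establishment with a confidence level, and of test-case ranking by failures detected. The particular confidence measure (the majority's share of the votes) is an assumption for illustration:

```python
from collections import Counter

def establish_oracle(outputs):
    """Form an oracle entry from one test case's group outputs (step 5).

    Returns (majority_output, confidence), where confidence is the assumed
    measure: the fraction of services agreeing with the majority.
    """
    counts = Counter(outputs.values())
    majority, votes = counts.most_common(1)[0]
    return majority, votes / len(outputs)

def rank_test_cases(failures_by_test_case):
    """Rank test cases by the number of failures they detected (step 6)."""
    return sorted(failures_by_test_case, key=failures_by_test_case.get, reverse=True)

# Example: one test case where 3 of 4 services agree.
print(establish_oracle({"A": 6, "B": 6, "C": 7, "D": 6}))  # (6, 0.75)
print(rank_test_cases({"t1": 5, "t2": 9, "t3": 1}))        # ['t2', 't1', 't3']
```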

33 Phase 1: Training Phase
By the end of the training phase:
  the selected sample WS have been tested;
  the test cases are ranked by their capability, so far, in detecting failures;
  the oracles for the test cases are established, together with their confidence levels;
  the sample WS are ranked.

34 Phase 2: Volume Testing Phase
This phase continues to test the remaining WS and any newly arrived WS, based on the profiles and history (test case effectiveness, oracles, and WS ranking) obtained in the training phase. Phase 2 continues to rank the WS, rank the test cases, and update the oracles.
1) Test cases have been ranked by their capability in detecting failures/faults in Phase 1. Now they are divided into layers, with layer one having the highest capability.
2) Select the layer-one test cases and apply them in the next step.
3) For each layer of test cases, group-test all the WS.

35 Phase 2: Volume Testing Phase
4) If an oracle with an acceptable confidence level (e.g., greater than 50%) exists, no voting is necessary. Use the oracle to detect failures: determine whether each WS has produced a correct answer, and then compute the failure rate, and possibly the reliability, of each WS using the given reliability model.
5) If no oracle with an acceptable confidence level exists, use the voting mechanism to detect failures, as described in Phase 1.
6) Update the confidence level of the oracles: an agreement between the oracle and the current test output increases the confidence level; a disagreement decreases it accordingly. (A sketch of steps 4-6 follows this list.)
7) Update the ranking of test cases by including the newly detected failures.
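
A sketch of the oracle-driven check and confidence update in steps 4-6. The additive step size for the confidence adjustment is an assumption, since the slides only say it is increased or decreased "accordingly":

```python
def check_against_oracle(output, oracle_value, confidence, threshold=0.5, step=0.05):
    """Phase 2 failure detection against an established oracle.

    Returns (failed, new_confidence). If the oracle's confidence is at or
    below the threshold, the caller should fall back to voting (step 5).
    """
    if confidence <= threshold:
        raise ValueError("oracle not confident enough; fall back to voting")

    failed = output != oracle_value
    # Step 6: agreement raises confidence, disagreement lowers it.
    # The fixed step size is an illustrative assumption.
    new_confidence = max(0.0, confidence - step) if failed else min(1.0, confidence + step)
    return failed, new_confidence

# Example: a WS answers 41 where the oracle (confidence 0.8) says 42.
print(check_against_oracle(41, 42, confidence=0.8))  # (True, 0.75)
```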

36 Phase 2: Volume Testing Phase
8) Update the ranking of the WS and eliminate the WS that have an unacceptable failure rate or reliability. The elimination of unnecessary testing in this step saves testing time.
9) Select the next layer of test cases and return to step 3.
By the end of Phase 2 group testing:
  all the available WS have been tested and a short list of WS has been ranked;
  the test cases have been updated and ranked;
  the oracles and their confidence levels have been updated.
The same process can be applied at the integration testing level: if a composite WS consists of n different unit WS, the ASTRAR group testing technique can be applied by considering each composite as an individual WS in the group testing.

37 Application of this Group Testing Model
A real-time stock-buy-sell WS is used as an example to illustrate the application of the ASTRAR technique.
The WS under development consist of a server WS and multiple client WS, residing in different locations. A client can send requests to the server, and the server responds to the requests.
All WS under group testing implement the same specification. The WS server offers two functions, and the client WS can access these two functions.
The database consists of objects of stock information, defined in the class Stock.

38 Application of this Group Testing Model

39 Each stock object is set to an initial value at a certain time point.
The evaluation engine then uses randomly generated purchase and sale information, or replayed data from a past stock dump, to decide the price dynamically once every minute.
Once the price changes, the other members of each stock object (the percentages of change in a minute, a day, a month, and a year) are computed and updated.
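
A minimal sketch of what the Stock class and the per-minute update might look like. The field names and the random price model are assumptions; the slides only describe the class's role and its percentage-change members:

```python
import random
from dataclasses import dataclass

@dataclass
class Stock:
    """Stock record as described in the slide: a price plus the percentage
    changes over a minute, a day, a month, and a year."""
    symbol: str
    price: float
    pct_minute: float = 0.0
    pct_day: float = 0.0
    pct_month: float = 0.0
    pct_year: float = 0.0

    def tick(self, new_price: float):
        """Per-minute update: set the new price, then recompute the minute
        change. The longer windows would be updated the same way from their
        own reference prices, omitted here for brevity."""
        self.pct_minute = 100 * (new_price - self.price) / self.price
        self.price = new_price

# Example: the evaluation engine decides a new price once a minute; here a
# random walk stands in for the simulated buy/sell pressure.
s = Stock("ACME", 100.0)
s.tick(100.0 * (1 + random.uniform(-0.01, 0.01)))
print(round(s.price, 2), round(s.pct_minute, 3))
```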

40 Results of the application
The size of the subset (the training size) is critical.
The smaller the size, the cheaper (fewer test runs) the testing and ranking process will be.
However, the smaller the size, the higher the probability that the training phase fails to find the correct oracle.
An incorrect oracle will lead to an incorrect ranking of the WS under test, while an incorrect ranking of test cases may result in more test runs in Phase 2 of the ASTRAR process.
Another factor that affects the testing cost is the target size, the number of WS to be ranked. For a given large number of WS to be tested, only a short list of the best WS needs to be ranked.

41 Results of the application
The authors proposed an efficient process to test a large number of web services designed based on the same specification.
The process is divided into two phases. In Phase 1 (the training phase), a selected number of WS are tested and their results are voted on. The purpose of the first phase is to establish the oracle and identify the most powerful test cases.
In Phase 2, no voting is applied; the oracle created in Phase 1 is used to judge the correctness of the WS under test. Furthermore, the powerful test cases are applied first, so that incorrect WS can be eliminated in a few tests.
The experiment results reveal that the smaller the training size, the lower the cost. However, a small training size can lead to an incorrect oracle, leading to an incorrect WS ranking. A small training size can also lead to an incorrect test case ranking, resulting in a higher test cost in Phase 2. Therefore, it is critical to select a reasonably sized training set in WS group testing.
As future work, they will explore the impact of the age of the test cases.

42 Conclusion
This presentation proposed:
  a Service-Oriented software Reliability Model (SORM), which generates voted information on the fly without using an oracle;
  a technique to test a large number of WS simultaneously, which uses an oracle to test the correctness of new web services.

43 References
[1] W. T. Tsai, D. Zhang, Y. Chen, H. Huang, R. Paul, N. Liao, "A Software Reliability Model for Web Services," 8th IASTED International Conference on Software Engineering and Applications, Cambridge, MA, November 2004, pp. 144-149.
[2] W. T. Tsai, X. Wei, Y. Chen, B. Xiao, R. Paul, and H. Huang, "Developing and Assuring Trustworthy Web Services," 7th IEEE International Symposium on Autonomous Decentralized Systems (ISADS), April 2005, pp. 43-50.
[3] W. T. Tsai, X. Wei, Y. Chen, B. Xiao, R. Paul, and H. Huang, "Adaptive Testing, Oracle Generation, and Test Case Ranking for Web Services," 29th Annual International Computer Software and Applications Conference (COMPSAC'05), 2005.
[4] W. T. Tsai, Y. Chen, R. Paul, N. Liao, and H. Huang, "Cooperative and Group Testing in Verification of Dynamic Composite Web Services," Workshop on Quality Assurance and Testing of Web-Based Applications, September 2004, pp. 170-173.
[5] W. T. Tsai, Y. Chen, R. Paul, "Specification-Based Verification and Validation of Web Services and Service-Oriented Operating Systems," Proc. of IEEE WORDS, Sedona, February 2005.

44 Question & Answer

