IV&V Facility 1 Using Fractal Analysis to Monitor and Model Software Aging Mark Shereshevsky, Bojan Cukic, Jonathan Crowell, Vijai Gandikota West Virginia.

Presentation transcript:

IV&V Facility 1 Using Fractal Analysis to Monitor and Model Software Aging Mark Shereshevsky, Bojan Cukic, Jonathan Crowell, Vijai Gandikota West Virginia University (WVU UI: Fractal Study of Resource Dynamics in Real Time Operating Systems)

IV&V Facility 2 Overview: Introduction and motivation; Fractality of resource utilization measures in operating systems; Modeling software aging; Experimental results; Summary

IV&V Facility 3 Introduction The “software aging” phenomenon implies that the state of a software system degrades with time. The degradation manifests itself in performance decline (excessive paging and swapping activity, etc.), possibly leading to crash or hang failures. Degradation is caused, in particular, by the exhaustion of operating system resources, such as the number of unused memory pages, the number of disk blocks available for page swapping, etc.

IV&V Facility 4 Earlier Studies of Resource Exhaustion Vaidyanathan and Trivedi describe the behavior of operating system resources as a function of time. The slope (trend) depends on the workload state of the system. Workload dynamics are modeled as a semi-Markov process. In many workload states the resource dynamics show very high variance, resulting in very broad confidence intervals. The highly irregular and oscillatory behavior of the data makes most trend models insufficient.

IV&V Facility 5 Our Research Objectives Investigate correlation between fractal properties of the resource data and the system’s workload. Develop fractal-based model of the resource exhaustion process. Apply it to real-time operating systems. Investigate possibility of using such model for predicting system outages and for preventive maintenance planning.

IV&V Facility 6 Goal of the Study Can resource exhaustion be predicted? –Interested in monitoring approaches, suitable for NASA deep space probes. Can fractal theory help? –Does system usage dynamics display fractal behavior over time? –Analyze patterns of fractality in OS resources and establish connection with the resource exhaustion.

IV&V Facility 7 Initial Data Collection: Memory Resources
sml_mem - mem reserved for small requests
lg_mem - mem reserved for large requests
sml_alloc - mem allocated for small requests
lg_alloc - mem allocated for large requests
freemem - pages of free memory
freeswap - swap space on disk
First data collected from a department’s Sun server, Sept Sept 22, 2001

IV&V Facility 8 Fractality of Memory Resources Can this be used to predict a system crash?

IV&V Facility 9 Hölder Exponent of a Function The Hölder exponent (HE) characterizes the degree of local “burstiness” (fractality) of the function. The lower (closer to 0) the HE, the “wilder” the local oscillations. For a smooth function HE = 1 (or higher).
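The slides do not say how the Hölder exponent is estimated from sampled data. One common choice is the oscillation method, sketched below under that assumption: the oscillation (max minus min) of the signal in a window of radius r around a point scales roughly like r^HE, so the slope of log(oscillation) against log(r) gives an estimate. The function name `local_holder_exponent` and the default `radii` are illustrative, not taken from the presentation.

```python
import math

def local_holder_exponent(signal, index, radii=(1, 2, 4, 8, 16)):
    """Estimate the Hölder exponent of `signal` at `index` (oscillation method).

    Assumption: the exponent is the least-squares slope of
    log(oscillation in a window of radius r) versus log(r).
    Returns None when the signal is locally constant, since no slope exists.
    """
    log_r, log_osc = [], []
    for r in radii:
        lo = max(0, index - r)
        hi = min(len(signal), index + r + 1)
        window = signal[lo:hi]
        osc = max(window) - min(window)
        if osc > 0:  # log(0) is undefined; skip flat windows
            log_r.append(math.log(r))
            log_osc.append(math.log(osc))
    if len(log_r) < 2:
        return None
    mean_x = sum(log_r) / len(log_r)
    mean_y = sum(log_osc) / len(log_osc)
    sxx = sum((x - mean_x) ** 2 for x in log_r)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_r, log_osc))
    return sxy / sxx
```

On a straight line the estimate is 1, matching the smooth case above, and on a square-root cusp it is 0.5. Evaluating the function at every sample index turns a resource trace into an HE time series.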

IV&V Facility 10 Plots of Data With Hölder Exponent realMemoryFree data from SUN server (high workload); Hölder exponents for the data sets.

IV&V Facility 11 Hölder Exponent Histogram: An Example The histogram of Hölder exponent for realMemoryFree (high workload).

IV&V Facility 12 Recent Data Collection Windows 2000 system stress tool used. Two computers were networked together: –One barraged the other with workload. The stress load was increased until a crash occurred.

IV&V Facility 13 Selecting Parameters for Monitoring Over a hundred OS parameters monitored. We selected the three which: –Do not have smooth or locally constant behavior; –Do not represent “per-unit-of-time” quantity (such as system_calls_per_sec ); –Do not have very high (over 0.9) mutual correlations. Selected parameters (resources): –Available_bytes; –Pool_paged-allocs; –System_cache_resident_bytes. We combine the parameters into a 3-dimensional “resource vector” and monitor its fractal dynamics.
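The three selection criteria can be read as a filtering pass over the recorded counters. The sketch below is one possible interpretation, not the authors' procedure: `select_counters`, `rate_counters`, and `min_change_fraction` are hypothetical names, and "locally constant" is approximated by the fraction of consecutive samples that differ. Only the 0.9 correlation cutoff comes from the slide.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def select_counters(series, rate_counters=(), max_corr=0.9, min_change_fraction=0.5):
    """Greedily keep counters that pass the three criteria from the slide.

    series: dict mapping counter name -> list of samples (insertion order matters).
    rate_counters: names of per-unit-of-time counters to exclude (assumed known).
    """
    selected = []
    for name, values in series.items():
        if name in rate_counters:
            continue  # criterion 2: skip "per-unit-of-time" quantities
        changes = sum(1 for x, y in zip(values, values[1:]) if x != y)
        if changes < min_change_fraction * (len(values) - 1):
            continue  # criterion 1: skip smooth or locally constant counters
        if all(abs(pearson(values, series[s])) <= max_corr for s in selected):
            selected.append(name)  # criterion 3: not highly correlated with kept ones
    return selected
```

Because the pass is greedy, the result depends on the order in which counters are listed; a different ordering can keep a different member of a highly correlated group.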

IV&V Facility 14 Recent Experiments: Some Plots Available Bytes, Pool Paged Allocs, Sys Cache Resident Bytes, and Multi-dimensional Hölder exponent

IV&V Facility 15 Observations and Hypotheses As the stress increases, the Hölder exponent decreases (fractality increases). The decrease of the Hölder exponent may be viewed as a quantitative measure of resource exhaustion. Fractality tends to change in jumps. –Most of our experiments show two noticeable drops in the Hölder exponent before a crash occurs.

IV&V Facility 16 Multidimensional Hölder Exponents

IV&V Facility 17 Can Crashes Be Anticipated? Conjecture: the second “fractal jump” observed during the system’s operation signals a dangerous level of resource exhaustion which may lead to a crash. However, there is still enough time for a graceful shutdown of the system. Problems: Detection of “jumps” in a noisy HE signal. What is the optimal shutdown timing strategy (shut it down immediately? Let the system run? For how long?).

IV&V Facility 18 Fractal Jump Detection The problem belongs to the field of change detection methods. It amounts to detecting a sharp change (decrease) in the mean value of a noisy time series. We view the HE time series as a piecewise constant signal plus Gaussian noise. We utilized the classical Shewhart control chart algorithm modified to fit our situation (e.g., unlike the classical case, we estimate the mean of the signal in real time).
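The slides give no details of the modification, so the following is only a minimal sketch of a Shewhart-style detector with a mean estimated online. Its assumptions go beyond the source: the mean and standard deviation are tracked with Welford's running update, a jump is declared after `run_length` consecutive samples fall more than `k` standard deviations below the running mean, and the estimates are reset after each detection so the next level is learned afresh. The function name and all parameter defaults are hypothetical.

```python
import math

def detect_downward_jumps(samples, warmup=20, k=3.0, run_length=5):
    """Return the indices at which a downward shift in the mean is detected.

    The signal is assumed piecewise constant plus noise. Samples that
    violate the lower control limit are not fed into the running estimates,
    so a new level does not contaminate the old baseline.
    """
    jumps = []
    n, mean, m2 = 0, 0.0, 0.0  # Welford accumulators for the current level
    below = 0                  # consecutive samples under the lower limit
    for i, x in enumerate(samples):
        if n >= warmup:
            sigma = math.sqrt(m2 / (n - 1))
            if x < mean - k * sigma:
                below += 1
                if below >= run_length:
                    jumps.append(i - run_length + 1)  # start of the run
                    n, mean, m2, below = 0, 0.0, 0.0, 0  # relearn the new level
                continue
            below = 0
        # In-control sample: update the running mean and variance.
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return jumps
```

Under the conjecture about the second “fractal jump”, a monitor built on such a detector would raise its shutdown alarm once the returned list reaches length two. The choice of `k` and `run_length` trades detection delay against false alarms and would need tuning on real HE traces.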

IV&V Facility 19 Automatic Detection of “Fractal Jumps” The HE plots with pink lines indicating “fractal jumps”.

IV&V Facility 20 Summary Is the “theory of the 2nd fractal jump” viable? –How long does the system have to live after the 2nd jump? –Develop a strategy for automatic preventive shut-down of the system based on the “fractal jumps” detection. Collect more and “better” data. –Allow load increases and decreases. Explore the possibility to incorporate other parameters into the analysis framework. Port the analysis into a real-time environment. –NASA simulated testbeds, ARTS II processor (ISR).