Computational Resiliency. Steve J. Chapin, Susan Older (Center for Systems Assurance, Syracuse University); Gregg Irvin (Mobium Enterprises). 24 July 2001. Not for Public Release.

Presentation transcript:

Computational Resiliency. Steve J. Chapin, Susan Older (Center for Systems Assurance, Syracuse University); Gregg Irvin (Mobium Enterprises). 24 July 2001. Not for Public Release.

Computational Resiliency – CSA
Recap: What is Computational Resiliency?
The ability to sustain application operation and dynamically restore the level of assurance during an attack.
Application-centric self-defense, built on replication, migration, functionality mutation, and camouflage.

[Diagram: Computational Resiliency. A mission-critical application is attacked. The result of the attack is a degraded application trying to perform its mission-critical function. Computational resiliency techniques are applied to correct the situation, yielding a degraded application sufficiently improved by resiliency to perform the mission-critical function.]

Multi-Faceted Approach
- Theoretical framework
  - reason about conformance to policy
- Computational resiliency library
  - dynamic application management
- System software support
  - scheduling/policy frameworks

Computational Resiliency Library
- Dynamic multithreading
- Migration
- Replication
- Camouflage
- Functionality reconfiguration
- Policy-based management

Example of CRLib
[Diagram: two regions. The “Safe Zone” has OASIS protection; “The Wild” has limited protection.]

The Benign State: Dudley’s job (low priority), Bullwinkle’s job, Rocky’s job.

The Attacks: Snidely attacks and is blocked at the firewall. Dudley does nothing.

The Attacks: Natasha attacks Rocky; the attack is caught by the IDS.

The Attacks: Rocky’s job migrates back into the safe zone; Dudley must give up resources.

The Attacks: Boris attacks Bullwinkle’s job. Some attacks succeed.

The Attacks: Bullwinkle’s job employs camouflage, decoys, and migration.

Groups and Replication (diagram legend: Group, Processor)
- One group per computational task
- User selects replication level, other policies
- Group mapped across processors
- Periodic liveness checks
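The group model above can be illustrated with a small sketch. This is not CRLib's API: the names `Group`, `map_group`, `liveness_check`, and `restore` are hypothetical, and round-robin placement is an assumption, since the slides do not say how groups are mapped across processors.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One group per computational task (hypothetical type, not CRLib's)."""
    task: str
    replication_level: int
    # replica index -> name of the processor hosting that replica
    placement: dict[int, str] = field(default_factory=dict)


def map_group(group: Group, processors: list[str]) -> None:
    """Spread the group's replicas across processors round-robin (assumed policy)."""
    for i in range(group.replication_level):
        group.placement[i] = processors[i % len(processors)]


def liveness_check(group: Group, alive: set[str]) -> list[int]:
    """Return the indices of replicas whose processor is no longer alive."""
    return [i for i, proc in group.placement.items() if proc not in alive]


def restore(group: Group, alive: set[str]) -> None:
    """Re-map dead replicas onto live processors to restore the replication level."""
    live = sorted(alive)
    if not live:
        raise RuntimeError("no live processors left")
    for n, i in enumerate(liveness_check(group, alive)):
        group.placement[i] = live[n % len(live)]


group = Group(task="render", replication_level=3)
map_group(group, ["P1", "P2", "P3"])
initial = dict(group.placement)

# A periodic check discovers that P2 has failed, then the group is repaired.
dead = liveness_check(group, alive={"P1", "P3"})
restore(group, alive={"P1", "P3"})
```

In a real system the liveness check would run on a timer and rely on heartbeats or probes; here the set of live processors is simply passed in.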

Theory Framework: Goals
- Understand the interplay among core aspects of CRLib
  - Groups, locations, resources, schedules, …
- Reason about effects of configuration and policy choices
- Reason about applications’ conformance to desired behavior

Framework Basics
- Build on existing mobile calculi
  - π-Calculus, Mobile Ambients, Join-Calculus
- Capture essential features of CRLib
  - Replication
  - Migration
  - Reconfiguration
  - Camouflage

A π-Calculus Primer
- Collection of names
  - Represent information: values, communication links (channels), code
  - Have scope
- Message-based communication
  - receipt of a value on x
  - transmission of y along x
- Information mobility: information can be passed beyond original scope
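The formulas that accompanied "receipt" and "transmission" did not survive in the transcript. In the standard π-calculus notation, which may differ in detail from what the slide showed, the primitives and the basic communication step are usually written as follows (P, Q, and z are placeholder symbols introduced here):

```latex
\begin{align*}
x(z).P \quad & \text{receive a value on } x,\ \text{bind it to } z,\ \text{then continue as } P \\
\bar{x}\langle y \rangle.P \quad & \text{transmit } y \text{ along } x,\ \text{then continue as } P \\
(\nu x)\,P \quad & \text{restrict the scope of the name } x \text{ to } P \\
\bar{x}\langle y \rangle.P \mid x(z).Q \;&\longrightarrow\; P \mid Q\{y/z\}
\end{align*}
```

Information mobility arises when the transmitted name y is itself restricted: sending it along x extends its scope to the receiver.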

Finding a Service Provider
Client wants to find a service provider:
1. Query the Service Directory, include a SASE (self-addressed stamped envelope: a return channel for the reply).
2. Wait for response.
3. Upon receipt, submit request.

Handling Service Requests
- Service Directory repeatedly responds to queries, arbitrarily choosing provider.
- Service providers wait for requests.
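The client, directory, and provider roles can be simulated with queues standing in for channels. This is a minimal sketch under stated assumptions: the names `query`, `providers`, `directory`, `provider`, and `client` are illustrative, the provider's reply to the client is added only to make the result observable, and the `None` shutdown message exists only so the simulation terminates.

```python
import queue
import random
import threading

# Channels are modelled as queues; putting a queue into a message is name passing.
query: queue.Queue = queue.Queue()  # public channel to the Service Directory
providers = {name: queue.Queue() for name in ("a", "b", "c")}


def directory(n_queries: int) -> None:
    """Answer each query with the address of an arbitrarily chosen provider."""
    for _ in range(n_queries):
        reply_to = query.get()  # the client's SASE
        name = random.choice(sorted(providers))
        reply_to.put((name, providers[name]))


def provider(name: str, address: queue.Queue) -> None:
    """Wait for requests; None is a shutdown signal (simulation only)."""
    while True:
        msg = address.get()
        if msg is None:
            return
        request, reply_to = msg
        reply_to.put(f"{name} served {request}")


def client(request: str) -> str:
    sase: queue.Queue = queue.Queue()  # fresh private reply channel
    query.put(sase)                    # 1. query the directory, include a SASE
    _name, address = sase.get()        # 2. wait for the response
    address.put((request, sase))       # 3. upon receipt, submit the request
    return sase.get()


threads = [threading.Thread(target=directory, args=(1,))]
threads += [threading.Thread(target=provider, args=item) for item in providers.items()]
for t in threads:
    t.start()

result = client("job-1")

for address in providers.values():
    address.put(None)
for t in threads:
    t.join()
```

The private `sase` queue plays the role of a scoped name: only the client knows it until the client sends it to the directory, which is the information mobility the primer describes.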

[Diagram sequence, four frames: service providers a, b, and c. A query is sent; an address (addr) chosen from a, b, and c is returned; the remaining frames appear to show b being selected.]

Initial Questions
- What are the primary entities, as well as the relationships among them?
  - Groups, locations, failures
  - External events: DEFCON changes
  - Scheduling policies
  - Application policies
- What is the most appropriate way to integrate those components?
  - And at what abstraction level?

In Progress: Two Calculi
- Higher-level calculus that incorporates the CRLib API
  - Captures groups, policies, etc.
- Lower-level calculus that provides semantics for higher-level calculus
  - Captures abstract implementation details.
Soundness of the translation will provide validation.

A Thought Experiment
Suppose there are two tasks, A and B, working in parallel:
- A’s replication level: 4
- B’s replication level: 2
- Three processors: P1, P2, P3
Resulting behavior (modulo robustness) should be similar to a system with single copies of A and B.
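One way to make the thought experiment concrete is to place the six replicas on the three processors and ask which tasks survive a given set of processor failures. The sketch below assumes a round-robin placement that continues across tasks. The slides do not specify a placement policy, and `place` and `survivors` are hypothetical helper names.

```python
from collections import Counter
from itertools import cycle


def place(replication: dict[str, int], processors: list[str]) -> dict[str, list[str]]:
    """Assign replicas round-robin, continuing the rotation from task to task."""
    slots = cycle(processors)
    return {task: [next(slots) for _ in range(level)]
            for task, level in replication.items()}


def survivors(placement: dict[str, list[str]], failed: set[str]) -> dict[str, int]:
    """Count the replicas of each task that remain after the given processors fail."""
    return {task: sum(proc not in failed for proc in procs)
            for task, procs in placement.items()}


placement = place({"A": 4, "B": 2}, ["P1", "P2", "P3"])
load = Counter(proc for procs in placement.values() for proc in procs)

after_p1 = survivors(placement, {"P1"})
after_p2_p3 = survivors(placement, {"P2", "P3"})
```

With four replicas on three processors, at least one processor must host two copies of A, so replication level alone does not determine how many independent failures a task can absorb. Under this particular placement, losing P2 and P3 together leaves A with two replicas and B with none.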

Open Questions
- How do we define “similar”, much less prove it?
  - Correctness
  - Performance
  - Robustness
- What are sufficiently high-level yet informative performance measures?
- How to model camouflage?

Back to CRLib: Status
- Multiple platforms
  - Windows NT/2000, Linux, SGI IRIX, Solaris
- Heterogeneous resource management methods
  - Load-balancing across heterogeneous networks
  - Performance improvement by factor of 3
- Demo this evening

In Progress
- Adding support for Byzantine failures
  - User-level option for authenticated messages
  - Based on Lamport-Shostak-Pease algorithms
  - Greater resiliency needed for nonauthenticated messages
- Evaluating cost of replication
  - Compare to standard checkpointing
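The remark that nonauthenticated messages need greater resiliency matches the classic bounds from the Lamport-Shostak-Pease work on Byzantine agreement: with unsigned ("oral") messages, tolerating m faulty participants requires at least 3m + 1 participants, while with signed messages the bound disappears and the problem only needs at least two loyal participants to be meaningful. The sketch below encodes just those bounds. The function names are hypothetical, and how CRLib maps replicas onto participants is not stated in the slides.

```python
def min_nodes_oral(faulty: int) -> int:
    """Unsigned messages: agreement needs more than 3m participants."""
    return 3 * faulty + 1


def min_nodes_signed(faulty: int) -> int:
    """Signed messages: m faulty plus two loyal participants suffice."""
    return faulty + 2


def max_faults_tolerated(nodes: int, authenticated: bool) -> int:
    """Largest number of faulty participants a group of this size can tolerate."""
    if authenticated:
        return max(nodes - 2, 0)
    return max((nodes - 1) // 3, 0)


# Replicas needed to mask one or two Byzantine replicas under each option.
table = {m: (min_nodes_oral(m), min_nodes_signed(m)) for m in (1, 2)}
```

For example, a group of four replicas tolerates one faulty replica without authentication and two with it, which gives a first-order sense of what the user-level authentication option buys.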

Next Steps for Project
- Tool for user policy expression
  - Choices for replication/recovery methods, agreement protocols, message-passing schemes
  - State-dependent policy specified via “Chinese menu” approach
- Scheduling framework
  - Schedulers that understand CR policies, resulting resource demands, user/process priorities
  - Build on previous MESSIAHS and Legion work
- Finalize core CR calculi; turn to analysis techniques
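A "Chinese menu" policy could look like picking one option from each column (recovery method, agreement protocol, message-passing scheme) for each system state. The sketch below is purely illustrative: the option names, the state names, and the `Policy` type are invented here, since the slides describe the tool only as a next step.

```python
from dataclasses import dataclass
from enum import Enum


class Recovery(Enum):
    REPLICATION = "replication"
    CHECKPOINT = "checkpoint"


class Agreement(Enum):
    NONE = "none"
    BYZANTINE = "byzantine"


class Messaging(Enum):
    PLAIN = "plain"
    AUTHENTICATED = "authenticated"


@dataclass(frozen=True)
class Policy:
    """One choice per menu column, plus a replication level (hypothetical)."""
    recovery: Recovery
    agreement: Agreement
    messaging: Messaging
    replication_level: int = 1


# State-dependent policy: each (hypothetical) system state selects its own menu choices.
POLICIES = {
    "benign": Policy(Recovery.CHECKPOINT, Agreement.NONE, Messaging.PLAIN),
    "under_attack": Policy(Recovery.REPLICATION, Agreement.BYZANTINE,
                           Messaging.AUTHENTICATED, replication_level=4),
}


def policy_for(state: str) -> Policy:
    """Look up the policy for a state, falling back to the most defensive one."""
    return POLICIES.get(state, POLICIES["under_attack"])


current = policy_for("benign")
fallback = policy_for("unknown-state")
```

Falling back to the most defensive policy for an unrecognized state is a design choice made for this sketch, not something the slides prescribe.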

Open Issues
- Cost/benefit analysis of CR
  - How much protection do we provide if the attacker knows what we’re trying to do?
  - How much is performance affected by message load, active replication, etc.?
- Potential integration with other OASIS projects