1. Mars and Beyond: NASA’s Software Challenges in the 21st Century
Dr. Michael R. Lowry, NASA Ames Research Center
December 5, 2003

2. Outline
NASA’s Mission
Role of Software within NASA’s Mission
The Challenge: Enable Dependable SW-Based Systems
Technical Challenges
– Scaling!
– The system-software barrier
– Software is opaque and brittle in the large
Reasons for Optimism

3. NASA’s Vision
To improve life here
To extend life to there
To find life beyond
NASA’s Mission
To understand and protect our home planet
To explore the universe and search for life
To inspire the next generation of explorers ...as only NASA can
5 Strategic Enterprises, One NASA: Space Science, Earth Science, Biological & Physical Research, HEDS (Human Exploration and Development of Space), Aerospace Technology

4. Software Growth in Aerospace Missions: Software Enables NASA’s Missions
[Chart: flight software size (instructions, in K equivalent memory locations, log scale from 1 to 10,000) versus year, 1960-1995, for unpiloted systems (e.g., Mercury, Surveyor, Mariner, Viking, Voyager, Galileo, Titan, Pershing, Poseidon, Trident) and piloted systems (e.g., Mercury 3, Gemini, Apollo 7, Shuttle, C-5A, F-111, B-1, F-16, F-22 projected). Software size has been doubling every 3 or 4 years. Source: AF Software Technology Support Center.]

5. The Challenge: Software Risk Factors

6. Mars Climate Orbiter
Launched: 11 Dec 1998
Mission: interplanetary weather satellite; communications relay for Mars Polar Lander
Fate: arrived 23 Sept 1999; no signal received after initial orbit insertion
Cause: faulty navigation data caused by failure to convert imperial to metric units

7. MCO Events
Locus of error:
– A ground software file called “Small Forces” gives thruster performance data
– This data is used to process telemetry from the spacecraft: the spacecraft signals each Angular Momentum Desaturation (AMD) maneuver, and the Small Forces data is used to compute its effect on the trajectory
– The software underestimated the effect by a factor of 4.45
Cause of error:
– The Small Forces data was given in pound-force seconds (lbf-s)
– The specification called for newton-seconds (N-s) (see the sketch after this list)
Result of error:
– As the spacecraft approached orbit insertion, the trajectory was corrected, aiming for a periapse of 226 km on the first orbit
– Estimates were adjusted as the spacecraft approached orbit insertion: one week prior, the first periapse was estimated at 150-170 km; one hour prior, this was down to 110 km; the minimum periapse considered survivable is 80 km
– MCO entered Mars occultation 49 seconds earlier than predicted; the signal was never regained after the predicted 21-minute occultation
– Subsequent analysis estimates a first periapse of 57 km
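To make the unit mismatch concrete, here is a minimal, purely illustrative sketch (not the MCO ground or flight software; the impulse value is hypothetical). Interpreting an impulse reported in pound-force seconds as if it were already newton-seconds understates the modeled thruster effect by a factor of about 4.45, since 1 lbf-s = 4.44822 N-s.

// Illustrative only: shows the ~4.45x understatement from a lbf-s vs N-s mix-up.
public class ImpulseUnitsSketch {
    static final double NEWTONS_PER_POUND_FORCE = 4.44822;

    public static void main(String[] args) {
        double reportedImpulse = 1.5; // hypothetical AMD impulse reported by Small Forces, in lbf-s

        // What the trajectory software should have used, per the interface spec: N-s.
        double correctImpulseNs = reportedImpulse * NEWTONS_PER_POUND_FORCE;

        // What effectively happened: the lbf-s number was consumed as if it were N-s.
        double misreadImpulseNs = reportedImpulse;

        System.out.printf("correct: %.3f N-s, as misread: %.3f N-s, understated by %.2fx%n",
                correctImpulseNs, misreadImpulseNs, correctImpulseNs / misreadImpulseNs);
    }
}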

8. Contributing Factors
For the first 4 months, AMD data was unusable due to file format errors:
– Navigators calculated the data by hand
– The file format was fixed by April 1999
– Anomalies in the computed trajectory became apparent almost immediately
Limited ability to investigate the anomalies:
– Thrust effects were measured along the Earth-spacecraft line of sight using doppler shift
– AMD thrusts are mainly perpendicular to that line of sight
Failure to communicate between teams:
– E.g., the issue tracking system was not properly used by the navigation team, so anomalies were not properly investigated
Inadequate staffing:
– The operations team was monitoring three missions simultaneously (MGS, MCO, and MPL)
Operations navigation team unfamiliar with the spacecraft:
– A different team from the development and test team
– This team did not fully understand the significance of the anomalies
– Assumed familiarity with the previous mission (Global Surveyor) was sufficient: they did not understand why AMD was performed 10-14 times more often (MCO had asymmetric solar panels, whereas MGS had symmetric panels)
Inadequate testing:
– The Software Interface Specification was not used during unit testing of the small forces software
– The end-to-end test of the ground software was never completed
– The ground software was not considered “mission critical”, so it did not have independent V&V
Inadequate reviews:
– Key personnel were missing from critical design reviews

9. Analysis
Software size S is increasing exponentially (doubling every three or four years).
Errors, cost overruns, and schedule slip are due primarily to non-local dependencies during integration: errors grow as S^N, with N < 2 (best calibration: N = 1.2).
[Chart: errors versus software size.]
Source: Professor Barry Boehm, author of software cost models.

10. Predicted Errors as LOC Grows: Current SW Practices/Technology
Errors = e * S^N, where S is the number of modules (LOC/M) and the per-module error rate e = 1/10,000.
[Chart: predicted errors versus lines of code, with Cassini and MPL marked.]
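As a back-of-the-envelope illustration of this growth model, the sketch below evaluates Errors = e * S^N with the slide’s calibration (e = 1/10,000, N = 1.2). The assumed module size of 100 LOC is not from the slide; it only fixes how LOC is converted into a module count S.

// Sketch of the slide's error-growth model; module size is an assumption, not from the slide.
public class ErrorGrowthSketch {
    public static void main(String[] args) {
        double e = 1.0 / 10_000;   // per-module error rate from the slide
        double n = 1.2;            // best-calibration exponent (N < 2) from the slide
        int locPerModule = 100;    // assumed module size M

        for (int kloc : new int[] {10, 100, 1_000, 10_000}) {
            double modules = (kloc * 1_000.0) / locPerModule;  // S = LOC / M
            double predictedErrors = e * Math.pow(modules, n);
            System.out.printf("%,8d KLoc -> S = %,8.0f modules -> ~%.1f predicted errors%n",
                    kloc, modules, predictedErrors);
        }
    }
}

The superlinear exponent is the point: multiplying the code size by ten multiplies the predicted residual errors by roughly 10^1.2, i.e., about sixteen-fold rather than ten-fold.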

11. Future Mars Exploration: MSL and MSR

12. Beyond Mars: JIMO and TPF

13. Technical Challenges and Opportunities
The system-software barrier (verification is easy, validation is hard)
Software is transparent and malleable in the small, but opaque and brittle in the large
General-purpose software dependability tools work well in the small, but fail to scale to systems in the large
But there is reason for optimism:
– Align software architectures with system analysis
– Success of formal methods in the related field of digital hardware
– Scaling through specialization
– Divide and conquer: compositional reasoning
– Beyond correctness: exploiting the lattice between true and false for software understanding
– Providing the research community with realistic experimental testbeds at scale

14. Scaling through Specialization: Practical Static Analysis
[Chart: static analyzers plotted by scalability (50 KLoc, 500 KLoc, 1 MLoc) and precision (80% to 95%). General-purpose analyzers: PolySpace C-Verifier, DAEDALUS, Coverity. Specialized analyzer: C Global Surveyor (NASA Ames).]

15. Explaining the Cause of an Error
Code with a transient error:

void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { tail = (tail+1)%size; return buffer[tail]; }

Hard to show the error: testing cannot reliably make the error appear, since doing so may require specific environment actions (inputs) or scheduling (for concurrency errors).
Hard to find the cause of the error: once we know a way to make the error appear, it is still difficult to localize its root cause.

The software model checker JPF can automatically find a trace that makes the error appear, producing an error trace. From that error trace and the original program, an explanation for the error can then be found automatically: the algorithm uses model checking to find similar traces that also cause the error (negatives) and traces that do not cause the error (positives).
– Set of negatives: traces that show different versions of the error
– Set of positives: traces that don’t show the error

Analysis:
1. Source code similarities explain control errors: code that appears only in negatives, code that appears in all negatives, and code that appears only and in all negatives (causal)
2. Data invariants explain errors in data
3. Minimal transformations that create a negative from a positive show the essence of an error
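One plausible reading of the transient error above is a lost update when two threads call add concurrently with no synchronization. The following sketch (illustrative only, not produced by JPF) replays one such interleaving deterministically by hand-sequencing the statements; this is exactly the kind of schedule-dependent behavior that plain testing rarely exposes.

// Deterministic replay of one bad interleaving of two concurrent add() calls.
public class LostUpdateSketch {
    static final int size = 4;
    static Object[] buffer = new Object[size];
    static int head = 0;

    public static void main(String[] args) {
        // Threads A and B each intend to add one element; in this interleaving,
        // both read head before either writes the incremented value back.
        int headSeenByA = head;            // A reads head == 0
        int headSeenByB = head;            // B reads head == 0 (same slot!)
        buffer[headSeenByA] = "from A";    // A writes buffer[0]
        buffer[headSeenByB] = "from B";    // B overwrites buffer[0]; A's element is lost
        head = (headSeenByA + 1) % size;   // A sets head = 1
        head = (headSeenByB + 1) % size;   // B also sets head = 1 (not 2)

        System.out.println("head = " + head + " (expected 2), buffer[0] = " + buffer[0]
                + ", buffer[1] = " + buffer[1]);
    }
}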

16. Generalized Symbolic Execution for Model Checking and Testing

Future mission software:
– is concurrent
– uses complex, dynamically allocated data structures (e.g., lists or trees)
– is highly interactive, with complex inputs and a large environment
– should be extremely reliable

Example: a Rover Executive that executes an input plan against the environment/rover status (complex input structure, large environment data, concurrency, dynamic data such as lists and trees):

void Executive::startExecutive() { runThreads(); ... }
void Executive::executePlan(...) { while (!empty) executeCurrentPlanNode(); }

Current practice in checking complex software:
– Testing: requires manual input; typically done for a few nominal input cases; not good at finding concurrency bugs; not good at dealing with complex data structures
– Model checking: automatic and good at finding concurrency bugs, but not good at dealing with complex data structures, and feasible only with a small environment and a small set of input values

Our novel symbolic execution framework:
– extends model checking to programs that have complex inputs with unbounded (very large) data
– automates test input generation

Framework: the input program and a correctness specification are instrumented; model checking explores the instrumented program, calling decision procedures over the heap, numeric constraints, and thread scheduling to decide whether to continue or backtrack, and emitting counterexamples / a test suite. The architecture is modular: different model checkers and decision procedures can be used.

Code analyzed with the framework:

class Node {
    int elem;
    Node next;
    Node deleteFirst() {
        if (elem < 10) return next;
        else if (elem < 0) assert(false);
        ...
    }
}

Analysis of deleteFirst with the framework: “simulate” the code using symbolic values instead of program data, and enumerate the input structures lazily (fields start out marked “unknown yet”). Precondition: acyclic list. Structural constraints are handled by lazy initialization, which enumerates all structures (e.g., next is null, or points to a fresh node e1); numeric constraints are handled by decision procedures. The path e0 < 10 is feasible (return next), while the path e0 >= 10 /\ e0 < 0 is infeasible (FALSE), so the assert is unreachable.
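To make the path enumeration concrete, here is a toy sketch (ours, not the actual framework or JPF) that tracks a path condition over the single symbolic value e0 as an interval and checks feasibility of the two branches of deleteFirst. A real symbolic executor would use lazy initialization for the heap and an off-the-shelf decision procedure instead of this interval check.

import java.util.ArrayList;
import java.util.List;

// Toy symbolic execution of deleteFirst's two numeric paths.
public class SymbolicPathsSketch {

    /** Path condition over one symbolic int e0, tracked as an interval plus readable constraints. */
    static class PathCondition {
        long lo = Integer.MIN_VALUE, hi = Integer.MAX_VALUE;
        List<String> constraints = new ArrayList<>();

        boolean addLessThan(int bound) {      // add constraint e0 < bound
            hi = Math.min(hi, bound - 1L);
            constraints.add("e0 < " + bound);
            return lo <= hi;                  // feasible iff the interval is non-empty
        }
        boolean addAtLeast(int bound) {       // add constraint e0 >= bound
            lo = Math.max(lo, bound);
            constraints.add("e0 >= " + bound);
            return lo <= hi;
        }
    }

    public static void main(String[] args) {
        // Path 1: elem < 10, so deleteFirst returns next.
        PathCondition p1 = new PathCondition();
        report(p1.addLessThan(10), p1, "return next");

        // Path 2: elem >= 10 and then elem < 0, reaching assert(false).
        PathCondition p2 = new PathCondition();
        boolean feasible = p2.addAtLeast(10) && p2.addLessThan(0);
        report(feasible, p2, "assert(false)");
    }

    static void report(boolean feasible, PathCondition pc, String outcome) {
        System.out.println((feasible ? "feasible  " : "INFEASIBLE") + " " + pc.constraints + " -> " + outcome);
    }
}

Running it prints one feasible path (e0 < 10) and one infeasible path (e0 >= 10 and e0 < 0), matching the slide’s conclusion that the assert is unreachable.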

17. System-Level Verification
– Check (system-level) integration properties based on module specifications
– The module hierarchy and interfaces are used for incremental abstraction; architectural patterns are potentially reusable
– Generate module/environment assumptions
– Check implementation modules against their design specifications
– Monitor properties that cannot be verified; monitor environment assumptions (see the sketch below)
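The last bullet amounts to runtime verification. As a hedged illustration (not MDS or any specific NASA monitor), here is a tiny state-machine monitor for a hypothetical environment assumption that every request is acknowledged before the next request arrives.

// Minimal runtime monitor for a hypothetical request/ack environment assumption.
public class AssumptionMonitorSketch {
    enum State { IDLE, AWAITING_ACK }

    private State state = State.IDLE;

    /** Feed one observed environment event; returns false on the first violation. */
    public boolean observe(String event) {
        switch (state) {
            case IDLE:
                if (event.equals("request")) state = State.AWAITING_ACK;
                return true;
            case AWAITING_ACK:
                if (event.equals("ack")) { state = State.IDLE; return true; }
                if (event.equals("request")) return false; // second request before ack: assumption violated
                return true;
        }
        return true;
    }

    public static void main(String[] args) {
        AssumptionMonitorSketch monitor = new AssumptionMonitorSketch();
        for (String event : new String[] {"request", "ack", "request", "request"}) {
            System.out.println(event + " -> " + (monitor.observe(event) ? "ok" : "ASSUMPTION VIOLATED"));
        }
    }
}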

18. Module Verification
Modules may require context information to satisfy a property:
Assumption || Module |= Property (assume-guarantee reasoning)
[Diagram: an environment and a module interacting over actions a, b, c, with the property to check and the assumption about the environment.]
How are assumptions obtained?
– The developer encodes them
– As abstractions of the environment, if known
– Automatically generate the exact assumption A: for any environment E, (E || Module |= Property) iff E |= A
Demonstrated on a rover example (Automated Software Engineering 2002).
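For reference, the compositional step this slide relies on is the standard non-circular assume-guarantee rule; this is background, not a formula from the slide itself. Here ⟨A⟩ M ⟨P⟩ means module M satisfies property P in any environment that satisfies assumption A:

\[
\frac{\langle A \rangle\, M_1\, \langle P \rangle \qquad \langle \mathit{true} \rangle\, M_2\, \langle A \rangle}
     {\langle \mathit{true} \rangle\, M_1 \parallel M_2\, \langle P \rangle}
\]

Read bottom-up: to show that the composed system M1 || M2 satisfies P, it suffices to find an assumption A such that M1 satisfies P under A and M2 discharges A. The slide’s automatically generated exact assumption, characterized by (E || Module |= Property) iff E |= A, is precisely such an A, and the weakest one.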

19. Mission Manager Viewpoint: Asking the Right Questions
– When can we stop testing?
– What process should we use?
– What is the value of formal methods?
Qualitative correlative model:
– Peer review superior to testing for incorrect specs
– Model checking for uncertain environments
Quantitative predictive model:
– Mission trade studies: how much cost for acceptable risk
– Development: optimize use of assurance technologies
– Mission: increase use of CPU cycles for software monitoring

20. HDCP Goals
The overall mission of the HDCP project is to increase NASA’s ability to engineer highly dependable software systems.
Method:
– Science of dependability: develop better ways to measure and predict software dependability
  – What are the potential measurables for the various attributes?
  – How can we move past the present surrogates and approach the artifact more directly?
– Empirical evaluation: of NASA and NASA-contractor dependability problems, and of technologies and engineering principles to address those problems
– Testbeds: development of realistic testbeds for empirical evaluation of technologies and attributes
– Intervention technologies

21. Active MDS Testbed Projects
Golden Gate project:
– Demonstrate that RT-Java is suitable for mission systems; drive an MDS/RTSJ rover at JavaOne
– Collaborators: Champlin, Giovannoni
SCRover project:
– Develop a rover testbed; collect defect and process data for the experience base
– Collaborators: Boehm, Madachy, Medvidovic, Port
Dependability cases:
– Develop dependability cases for time management and software architectures
– Collaborators: Goodenough, Weinstock, Maxion, Hudak
Analysis of the MDS architectural style:
– Analysis based on MDS’s use of architectural component types
– Collaborators: Garlan
Process improvement:
– Data collection from mainline MDS and SCRover development efforts
– Collaborators: Johnson, Port

22. MDS in 1 Minute
Approach: product-line practice to exploit commonalities across missions:
1. An information and control architecture to which missions/products conform
2. A systems engineering process that is analytical, disciplined, and methodical
3. Reusable and adaptable framework software
Problem domain: mission information, control, and operations of physical systems
– Developed for unmanned space science missions
– Scope includes flight, ground, and simulation/test
– Applicable to robots that operate autonomously to achieve goals specified by humans
– Architecturally suited for complex systems where “everything affects everything”
MDS products: unified flight, ground, and test architecture; orderly systems engineering methodology; frameworks (C++ and Java); processes, tools, and documentation; examples; reusable software

23. Managing Interactions
Complex interactions make software difficult:
– Elements that work separately often fail to work together
– The combinatorics of interaction is staggering, so it’s not easy to get right
– This is a major source of unreliability
“A unified approach to managing interactions is essential.” There are two approaches to this in MDS:
– Component-based architecture: handles interactions among elements of the system software; inward looking; addresses software engineering issues
– State-based architecture: handles interactions among elements of the system under control; outward looking; addresses systems engineering issues

24. MDS is... a State-Based Architecture
– State variables hold state values, including degree of uncertainty
– Estimators interpret measurement and command evidence to estimate state
– Controllers issue commands, striving to achieve goals
– Hardware proxies provide access to hardware busses, devices, and instruments
– Models express mission-specific relations among states, commands, and measurements
– A goal is a constraint on the value of a state variable over a time interval
Key features:
1. Systems analysis/design organized around states and models
2. State control architecturally separated from state determination
3. System operated via specifications of intent: goals on state
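As a rough illustration of how these roles fit together, here is a minimal interface sketch in Java. This is not the MDS framework API; all names and signatures below are purely illustrative of the slide’s vocabulary (state variables with uncertainty, estimators, controllers, hardware proxies, and goals as constraints over time intervals).

import java.time.Instant;

// Illustrative-only interfaces mirroring the state-based vocabulary above.
public class StateArchitectureSketch {

    /** A goal is a constraint on the value of a state variable over a time interval. */
    interface Goal<V> {
        boolean isSatisfiedBy(V value);
        Instant start();
        Instant end();
    }

    /** State variables hold state values, including a degree of uncertainty. */
    interface StateVariable<V> {
        V currentEstimate();
        double uncertainty();                       // e.g., a variance or confidence measure
        void update(V estimate, double uncertainty);
    }

    /** Estimators interpret measurement and command evidence to estimate state. */
    interface Estimator<V> {
        void ingestMeasurement(Object measurementEvidence);
        void ingestCommand(Object commandEvidence);
        void updateEstimate(StateVariable<V> stateVariable);
    }

    /** Controllers issue commands, striving to achieve goals on state variables. */
    interface Controller<V> {
        void pursue(Goal<V> goal, StateVariable<V> stateVariable);
    }

    /** Hardware proxies provide access to hardware busses, devices, and instruments. */
    interface HardwareProxy {
        void issueCommand(Object command);
        Object readMeasurement();
    }
}

The separation of Estimator from Controller reflects the slide’s key feature that state determination is architecturally separated from state control, with models and goals mediating between them.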

25. From Theory to Flight... the JPL Transition Path
Mars Smart Lander (MSL) technology infusion:
– Scheduled launch: 2009
– MSL has baselined MDS technology (system engineering, software frameworks)
– MSL technology gates: PMSR, August 2004; integrated demo, June 2005; PDR, February 2006
MSL sample technology categories:
– Software architecture with infused technologies
– Verification and validation tools and methodologies
– Processes and supporting tools
– Cost modeling for system engineering, software adaptation, and autonomy validation
MDS-compatible technologies are directly relevant to MSL.

26. Conclusions
The system-software barrier (verification is easy, validation is hard)
Software is transparent and malleable in the small, but opaque and brittle in the large
General-purpose software dependability tools work well in the small, but fail to scale to systems in the large
But there is reason for optimism:
– Align software architectures with system analysis
– Success of formal methods in the related field of digital hardware
– Scaling through specialization
– Divide and conquer: compositional reasoning
– Beyond correctness: exploiting the lattice between true and false for software understanding
– Providing the research community with realistic experimental testbeds at scale

