Systems Architecture for Statistical Applications: Introduction and Overview. Andrew Westlake, Survey & Statistical Computing. Wednesday 25th January 2006.


1 Systems Architecture for Statistical Applications: Introduction and Overview
Andrew Westlake
Survey & Statistical Computing
Wednesday 25th January 2006

2 Introduction
25/1/2006 RSS/ASC Systems Architecture: Introduction and Overview
Systems Architecture for Statistical Applications
  - Not Features or Usability
Long-term issues that affect Statistical Systems
  - Ease of maintenance and enhancement
  - Responsiveness to developments in operating environments
  - Portability between computing environments
  - Interoperability with other related systems
  - Extensions by Users
Programme
  - Papers from developers of statistical systems
  - Describing different approaches
  - Discussing problems and solutions

3 Some Issues
Statistical software has a small market
  - Limited development budgets
  - Early design and implementation decisions can be critical
  - Re-engineering is a major step
Statistical Software is different
  - Provides functionality for solving (a class of) problems
  - Not automation of tasks
  - More generalised than traditional application design
Need to exploit ideas and developments
  - Objects, components, standards, services, … *
  - Open source, Windows, Linux, Internet
  - Data warehouses & OLAP, Data mining, …
Levels of Abstraction/Generalisation
  - Different levels needed at different times in design and discussion
  - Confusion often due to discussion at the wrong (or different) levels

4 Object-Oriented Design
Alternative way of thinking about software structure
  - An abstract model of programming
  - Developed in the 1960s and 1970s
Greater Reliability, Ease of Maintenance
  - Objects have behaviour and own data
  - Avoidance of 'side-effects'
Compiler and Run-time system support
  - C++, Java, VB(?) …
Big influence on design of S
Academic and Commercial input
  - Ideas and concepts from abstract work by academics
  - Developed, extended and realised by commercial developers

5 The Object Paradigm
Objects are Instances of Classes
  - Classes define shared structure (attributes) and behaviour (methods)
  - Objects have Identity, Information and State (attribute values)
  - Created and destroyed dynamically at run time, can be persistent
Encapsulation
  - Objects receive Messages invoking Behaviour
    > Includes changing and returning attribute values
  - Can only access the attributes of an object through its public methods
Inheritance
  - New classes can be defined as specialisations of others
  - Inherit structure and methods, but can alter and extend
Polymorphic Methods
  - Methods behave differently for different classes, so response depends on type of object receiving message
    > E.g. an object knows how to Display itself
  - Object sending message does not need to worry (much)
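The four ideas on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the class names (Variable, NumericVariable, CategoricalVariable) and the display_all helper are hypothetical, chosen to give the example a statistical flavour.

```python
from collections import Counter
from statistics import mean


class Variable:
    """A class: shared structure (name, values) and behaviour (methods)."""

    def __init__(self, name, values):
        # Leading underscores mark the attributes as private by convention;
        # callers are expected to go through the public methods below.
        self._name = name
        self._values = list(values)

    def count(self):
        return len(self._values)

    def display(self):
        return f"{self._name}: {self.count()} values"


class NumericVariable(Variable):
    """Inheritance: reuses Variable's structure, specialises display()."""

    def display(self):
        return f"{self._name}: mean {mean(self._values):.2f} over {self.count()} values"


class CategoricalVariable(Variable):
    """Another specialisation with its own idea of how to display itself."""

    def display(self):
        category, freq = Counter(self._values).most_common(1)[0]
        return f"{self._name}: mode {category!r} ({freq} of {self.count()})"


def display_all(variables):
    # Polymorphism: the sender issues the same message to every object
    # and does not need to know which class receives it.
    return [v.display() for v in variables]
```

Calling display_all on a list holding one NumericVariable and one CategoricalVariable returns one summary string per object, each produced by that object's own display method.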

6 System Modelling Methodologies: UML
Need recognised for systematic design and development methods
  - Management of complexity
  - Identification and control of requirements
  - Ease of maintenance
  - Feedback and validation from Users
Various conflicting systems proposed
Task force of Object Management Group: OMG
Produced the Unified Modelling Language: UML
  - Rumbaugh, Jacobson and Booch
  - Supports design from User Requirements to Code Production
Development Methodologies built around UML
  - Agile-, Extreme-, Feature-Driven-, Iterative-, Unified-, … Development

7 UML Features
Formal specification of Language and Semantics for design of systems (now version 2.0)
Includes formalised diagram types and elements
  - Activity, Class, Component, Deployment, Sequence, State, Use Case, … Diagrams
  - Aggregation, Generalisation, Cardinality, Classification, Concurrency, Constraints, Dependency, Interfaces, Synchronicity, Visibility, … elements, attributes, facets
Various packages support complete development from design to code generation
  - Poseidon, Rational (IBM), Together (Borland), Visual Studio, …
  - Essentially independent of implementation language*
Can be used informally for early design stages (e.g. Visio)
Difficult to learn thoroughly
  - Good overview in UML Distilled, Martin Fowler (Addison-Wesley, 2004)
  - Not perfect – some areas under-developed, some omissions
  - No alternative is as well established or supported

8 A UML Class Diagram

9 Interfaces and Components
Formal definition of Interfaces is an aspect of Encapsulation
  - Straightforward within a single system
  - Improves robustness of the system
Idea extended to distributed components and systems
  - Independent components on the same system, e.g. COM objects, ActiveX
  - Servers and Clients on same or different systems, e.g. database servers (ODBC), web servers (HTML), distributed data archives (RDF)
  - Distributed processing on specialised servers, e.g. DCOM, Web Services, Grid
Difficult issues for management of communication channels
  - Message language, message structure and protocol, service discovery
  - All being resolved through industry collaboration building on academic ideas
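Within a single system, a formally defined interface might look like the following Python sketch. The DataSource interface, its two implementations and the column_mean client are all hypothetical names invented for illustration; the point is only that the client depends on the interface and not on any particular implementation.

```python
import csv
import io
from abc import ABC, abstractmethod


class DataSource(ABC):
    """The interface: anything that can yield rows as dictionaries."""

    @abstractmethod
    def rows(self):
        """Return an iterable of {column name: numeric value} dictionaries."""


class InMemorySource(DataSource):
    """Implementation backed by a list already held in memory."""

    def __init__(self, records):
        self._records = list(records)

    def rows(self):
        return iter(self._records)


class CsvTextSource(DataSource):
    """Implementation that parses CSV text with a header line."""

    def __init__(self, text):
        self._text = text

    def rows(self):
        for record in csv.DictReader(io.StringIO(self._text)):
            yield {key: float(value) for key, value in record.items()}


def column_mean(source, column):
    # The client only uses the interface method rows(); it works unchanged
    # with any current or future DataSource implementation.
    values = [row[column] for row in source.rows()]
    return sum(values) / len(values)
```

Swapping InMemorySource for CsvTextSource requires no change to column_mean, which is the robustness benefit the slide refers to.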

10 Distributed Architecture
Construct system from components that communicate through messages
  - May be remote – message security and transport handled by Internet (for example)
Use the best components for the job, only develop the bits no one else does
  - For example, use SQL Server for data store, with access control, Apache to deliver displays to users, R for statistical calculations and charts, …
  - Can distribute almost anything: processing power, algorithms, data, knowledge, metadata, …
Benefits
  - Cheaper – you only have to build your bits
  - Better – get the best products for the other bits
Problems
  - Overheads in communication – can be avoided with clever design
  - Have to agree on message mechanisms – or follow a standard
  - Cost of other components – but many are effectively free
The future of Computing Systems
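As a rough sketch of components that communicate only through messages, the Python below passes JSON text between a client and a statistics component. Everything here is assumed for illustration: the message fields ("operation", "values", "ok", "result", "error"), the function names, and the use of JSON itself. The transport is a plain function call; in a real distributed system it would be replaced by a network call.

```python
import json
from statistics import mean, median

# Operations this hypothetical statistics component agrees to provide.
OPERATIONS = {"mean": mean, "median": median}


def stats_component(request_text):
    """Receive a request message as text and return a reply message as text."""
    request = json.loads(request_text)
    operation = OPERATIONS.get(request["operation"])
    if operation is None:
        return json.dumps({"ok": False, "error": "unknown operation"})
    return json.dumps({"ok": True, "result": operation(request["values"])})


def ask(operation, values, transport=stats_component):
    """Client side: build a message, send it, and unpack the reply.

    The client knows the agreed message format, not how the component
    is implemented or where it runs.
    """
    reply = json.loads(transport(json.dumps({"operation": operation, "values": values})))
    if not reply["ok"]:
        raise ValueError(reply["error"])
    return reply["result"]
```

Because both sides see only text messages, either component could be rewritten in another language or moved to another machine, provided the message format stays as agreed.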

11 XML – eXtensible Markup Language
Markup Language
  - Text with Tags (<Field>field contents</Field>)
    > Identifies an Element of type Field with content "field contents"
  - Content of an element can be simple or complex
    > Numbers, strings, etc., or combinations of other elements
  - Nested Tags (elements) => multiple hierarchies
Generic syntax for languages
  - Tags not defined, only the language structure
  - XML instance document contains complex structure of information as linear text – ideal for messages and other interchange
XML is a Standard from W3C (based on SGML)
  - Generic tools to read and write XML in programs
  - Schema (XSD) for defining rules about Tag names and structure
  - Style sheets (XSL/T) for transforming XML to some other text form
    > For example, HTML for display, text script to drive a program, a different (equivalent) XML structure for another context
Can use UML to design the logical structure and specify the semantics
  - Can generate XML schema (XSD)
  - For example, hyperModel workbench, by David Carlson, www.xmlmodeling.com
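A small Python sketch of the "generic tools" point, using the standard library's xml.etree.ElementTree. The element and attribute names (Relationship, type, Name, Output, Input) are hypothetical and are not the schema used in the talk. The transformation to HTML is written by hand here to stand in for what an XSLT style sheet would do; ElementTree itself does not run XSLT.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML instance document: nested elements carried as linear text.
DOCUMENT = """\
<Relationship type="Stochastic">
  <Name>Estimate 1 distribution</Name>
  <Output>FlowEstimate1</Output>
  <Input>Flow</Input>
</Relationship>
"""


def to_html_row(xml_text):
    """Read the XML with a generic parser and emit one HTML table row."""
    root = ET.fromstring(xml_text)
    cells = [
        root.findtext("Name"),    # content of a simple child element
        root.get("type"),         # value of an attribute on the root element
        root.findtext("Output"),
        root.findtext("Input"),
    ]
    row = ET.Element("tr")
    for text in cells:
        ET.SubElement(row, "td").text = text
    return ET.tostring(row, encoding="unicode")
```

The same parsed document could equally be turned into a text script or a different XML structure, which is the role the slide assigns to style sheets.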

12 XML Fragment – Metadata for a model
[XML markup not preserved; surviving element text:]
Factors associated with flow within Zones (so Destination is the same as Origin).
Poisson distribution for observations in first estimate set, based on common rates.

13 XML processed to HTML
Relationships (columns: Name & Type, Output, Input, Form):
Estimate 1 distribution (Stochastic)
  Output: FlowEstimate1
  Input: Flow
  Form: Distribution: Poisson, Rate = Flow
  Poisson distribution for observations in first estimate set, based on common rates.
Estimate 2 distribution (Stochastic)
  Output: FlowEstimate2
  Input: Flow
  Form: Distribution: Poisson, Rate = Flow
  Poisson distribution for observations in second estimate set, based on common rates.
Derived
  Output: Flow
  Input: FlowLog
  Form: Derivation: exp(FlowLog)
  Poisson rates are derived as exponential of (linear) flow function.
Derived
  Output: FlowLog
  Input: FlowAverage, OriginFlowFactor, DestFlowFactor, FlowWithin
  Form: Derivation: for i ∈ OriginZones, j ∈ DestZones:
          if i ≠ j: FlowAverage + OriginFlowFactor[i] + DestFlowFactor[j]
          if i = j: FlowWithin[i]
  Linear function for log of flow rates. Inter-zone flow is modelled as an average flow adjusted by origin and destination factors (with no interaction). Intra-zone flows are modelled separately.
Constraint: OriginFlowFactor
  Form: Derivation: ∑ OriginZoneFactor = 0
  Origin factors sum to zero, so product of rate components is one.
Constraint: DestFlowFactor
  Form: Derivation: ∑ DestZoneFactor = 0
  Destination factors sum to zero, so product of rate components is one.
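The relationships above describe a log-linear Poisson model, which can be sketched in Python as follows. This is an illustrative reading of the slide, not the author's implementation: the function names are hypothetical, zones are indexed 0..n-1, and origins and destinations are assumed to share the same set of zones (which the i = j case implies).

```python
import math


def flow_rates(flow_average, origin_factor, dest_factor, flow_within):
    """Return the matrix Flow[i][j] = exp(FlowLog[i][j]).

    FlowLog[i][j] is FlowAverage + OriginFlowFactor[i] + DestFlowFactor[j]
    for i != j, and FlowWithin[i] for i == j.
    """
    n = len(origin_factor)
    if not (len(dest_factor) == n and len(flow_within) == n):
        raise ValueError("all factor lists must have one entry per zone")
    # Constraints from the slide: each set of factors sums to zero.
    for factors in (origin_factor, dest_factor):
        if not math.isclose(sum(factors), 0.0, abs_tol=1e-9):
            raise ValueError("factors must sum to zero")
    rates = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                flow_log = flow_within[i]
            else:
                flow_log = flow_average + origin_factor[i] + dest_factor[j]
            row.append(math.exp(flow_log))
        rates.append(row)
    return rates


def poisson_log_likelihood(counts, rates):
    """Log-likelihood of observed counts under independent Poisson rates."""
    total = 0.0
    for count_row, rate_row in zip(counts, rates):
        for k, rate in zip(count_row, rate_row):
            total += k * math.log(rate) - rate - math.lgamma(k + 1)
    return total
```

Each estimate set (FlowEstimate1, FlowEstimate2) would be scored against the same rate matrix, since both are described as Poisson with common rates.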

14 Programming Languages
Is Fortran dead?
Not according to Microsoft
  - Have rediscovered the idea of language-independent intermediate code (runtime – LIR)
  - Ideal for UML modelling approach
System functionality provided at runtime level, so the same for all languages
  - New compilers only have to do language translation
Requires a common programming model
  - Or at least a subset of the runtime model
Allows closely coupled components to be written in different languages
  - May be the answer for legacy systems

