COMP28112 Lecture 2 A few words about parallel computing


COMP28112 Lecture 2 (5-Feb-19)
A few words about parallel computing. Challenges (and goals) when developing distributed systems.

Why would you need 100 (or, for that matter, 1000) networked computers? What would you do with them? Common characteristic: chop the work into pieces!

Increased Performance
There are many applications (especially in science) where good response time is needed. How can we decrease execution time? By using more computers working in parallel. This is not as trivial as it sounds, and there are limits (we'll come back to this). This is parallel computing. Examples: weather prediction; long-running simulations; ...

How do we speed things up?
Many scientific operations are inherently parallel:

for(i=1;i<100;i++) A[i]=B[i]+C[i];

for(i=1;i<100;i++) for(j=1;j<100;j++) A[i][j]=B[i][j]+C[i][j];

Different machines can operate on different data. Ideally, this may speed up execution by a factor equal to the number of processors (speedup = ts / tp = sequential_time / parallel_time).
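A loop like the one above can be chopped into independent chunks, one per processor. A minimal sketch (the function name and chunking scheme are illustrative, not from the lecture):

```c
#include <stddef.h>

/* One chunk [lo, hi) of the element-wise addition A[i] = B[i] + C[i].
   The iterations are independent of each other, so in a parallel run
   each chunk could be handed to a different processor. */
void add_chunk(double *A, const double *B, const double *C,
               size_t lo, size_t hi)
{
    for (size_t i = lo; i < hi; i++)
        A[i] = B[i] + C[i];
}
```

With p processors, each would get roughly 100/p iterations; no chunk reads what another chunk writes, which is exactly what makes the loop parallelisable.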

Parallel Computing vs Distributed Computing
Parallel computing: multiple CPUs in the same computer. Distributed computing: computers connected by a network. But don't treat this distinction as a universal rule: in practice, it is not clear exactly where the line between the two is drawn. Some of the problems are common, but with a different flavour...

An example: IBM Deep Blue
A 30-node parallel computer. In May 1997 the machine won a 6-game chess match against the then world champion, Garry Kasparov. The machine could evaluate about 200 million positions per second, which allowed it to look up to 40 turns ahead. In parallel computing, we need to deal with: communication (parallel computers may need to send data to each other) and synchronisation (think about the problems with process synchronisation from COMP25111: Operating Systems).

Not all applications can be parallelised!

  Parallelisable:    Not parallelisable:
  x=3                x=3
  y=4                y=x+4
  z=180*224          z=y+x
  w=x+y+z            w=x+y+z

What can you parallelise? In the left column the first three assignments are independent of each other and can run at the same time; in the right column each statement depends on the result of the previous one, so the code is inherently sequential. If the proportion of an application that you can parallelise is x (0<x<1), then the maximum speedup you can expect from parallel execution is 1/(1-x). The running time tp of a program on p CPUs, which runs in time ts on one machine, is tp = to + ts*(1 - x + x/p), where to is the overhead added by parallelisation (e.g., communication, synchronisation, etc.). (Read about Amdahl's law.)
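The bound above can be written as a one-line function. A sketch, assuming the overhead to = 0 so that speedup = ts/tp = 1/((1-x) + x/p); note how it approaches the 1/(1-x) ceiling as p grows:

```c
/* Amdahl's law: speedup on p CPUs when a fraction x (0<x<1) of the
   work is parallelisable and parallelisation overhead is ignored. */
double amdahl_speedup(double x, int p)
{
    return 1.0 / ((1.0 - x) + x / (double)p);
}
```

For x = 0.9, ten CPUs already give a speedup of about 5.3, and no number of CPUs can ever beat 1/(1-0.9) = 10.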

How to parallelise an application
Try to identify instructions that can be executed in parallel: these should be independent of each other (the result of one should not depend on another). Try to identify instructions that are executed on multiple data, so that different machines can operate on different data. These can be represented schematically – see the graph on the slide.

Challenges / Goals with distributed systems
Heterogeneity, openness, security, scalability, failure handling, concurrency, transparency.

Challenge: Heterogeneity
Heterogeneity (= the property of consisting of different parts) applies to networks, computers, operating systems, languages, ... Data types, such as integers, may be represented differently; application program interfaces may differ. Middleware is a software layer that provides a programming abstraction to mask the heterogeneity of the underlying platforms (networks, languages, hardware, ...), e.g., CORBA, Java RMI. It may hide the fact that messages travel over a network in order to send a remote method invocation request... which is not necessarily a good thing!

Challenge: Openness
Openness is the property of a distributed system by which each subsystem is continually open to interaction with other systems; as a result, the key software interfaces of the system's components are made publicly available. E.g., Web Services (specified by the W3C) support interoperable machine-to-machine interaction over a network. Open distributed systems have the following properties: once something is published, it cannot be taken back; there is no central arbiter of truth – different subsystems have their own; unbounded non-determinism, a property of concurrency by which the delay in servicing a request can become unbounded while the request is still guaranteed to eventually be serviced (interesting in the development of theories of concurrency).

Challenge: Security
Three aspects: confidentiality (protection against disclosure to unauthorised individuals), integrity (protection against alteration or corruption), and availability (protection against interference with the means to access the resources). Encryption techniques (cryptography) answer some of these challenges. Still open problems include: denial-of-service attacks, where a service is bombarded by a large number of pointless requests, and the security of mobile code.

Challenge: Scalability
A system is described as scalable if it remains effective, without disproportionate performance loss, when there is an increase in the number of resources, the number of users, or the amount of input data. Scalability in terms of: load, geographical distribution, number of different organisations... Challenges: control the cost of physical resources and performance loss; prevent software resources from running out (e.g., 32-bit vs 128-bit Internet addresses); avoid performance bottlenecks, using caching, replication, ...

Challenge: Failure (Fault) Handling
Some components may fail while others continue executing. We need to: detect failures (e.g., checksums may be used to detect corrupted data); mask failures (e.g., retransmit a message when it fails to arrive); tolerate failures (do not keep trying to contact a web server if there is no response); recover from failures (make sure that the state of permanent data can be recovered, or rolled back, after a server has crashed); and build in redundancy (data may be replicated to ensure continuing access). If the chance of failure for a single database is p, then the chance of two databases failing at the same time is p^2, of all three databases p^3, and so on. Even if p=0.3, p^2=0.09 and p^3=0.027.
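The replication arithmetic above is just multiplication of independent failure probabilities. A small sketch (the function name is ours, not from the lecture):

```c
/* Probability that all n replicas fail at the same time, when each
   fails independently with probability p: p raised to the power n.
   A plain loop avoids pulling in math.h's pow(). */
double all_replicas_fail(double p, int n)
{
    double q = 1.0;
    for (int i = 0; i < n; i++)
        q *= p;
    return q;
}
```

With p = 0.3, two replicas fail together with probability 0.09 and three with 0.027, matching the slide. Note the model assumes failures are independent; correlated failures (e.g., replicas sharing a power supply) break this.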

Challenge: Concurrency
Several clients may attempt to access a shared resource at the same time, e.g., eBay bids. This requires proper synchronisation to make sure that data remains consistent. We have seen the principles of this! (Refer to COMP25111 and process/thread synchronisation.)
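The bidding example can be sketched with the mutual-exclusion primitives from COMP25111. The auction struct and function names here are illustrative, not part of any real auction API:

```c
#include <pthread.h>

/* Shared auction state: the highest bid must be read and updated
   atomically, or two concurrent bidders could both compare against
   a stale value and both believe they placed the winning bid. */
typedef struct {
    pthread_mutex_t lock;
    int highest_bid;
} auction_t;

/* Returns 1 if the bid was accepted, 0 if it was too low. */
int place_bid(auction_t *a, int amount)
{
    int accepted = 0;
    pthread_mutex_lock(&a->lock);    /* enter critical section */
    if (amount > a->highest_bid) {
        a->highest_bid = amount;
        accepted = 1;
    }
    pthread_mutex_unlock(&a->lock);  /* leave critical section */
    return accepted;
}
```

The lock makes the read-compare-update step atomic; without it, the check and the update could interleave between two clients and leave the data inconsistent.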

Challenge: Transparency
Transparency means that a distributed system should hide its distributed nature from its users, appearing and functioning as a normal centralised system. There are many types of transparency:
Access transparency: resources are accessed in a single, uniform way (regardless of whether they are local or remote).
Location transparency: users should not be aware of where a resource is physically located.
Concurrency transparency: multiple users may compete for and share a single resource; this should not be apparent to them.
Replication transparency: even if a resource is replicated, it should appear to the user as a single resource (without knowledge of the replicas).
Failure transparency: always try to hide any faults.
...and more: mobility (migration/relocation), performance, scaling, persistence, security (see: http://en.wikipedia.org/wiki/Transparency_(computing) )

In Summary...
Several challenges / goals: heterogeneity, openness, security, scalability, failure handling, concurrency, transparency. Parallel computing: not really a topic for this module; some of the problems are common, but the priorities are different.
Reading: Coulouris, Sections 1.4 and 1.5 (all of Chapter 1 is covered); Tanenbaum, Section 1.2 (all of Chapter 1 is covered).
Next time: Architectures and Models.