ICS362 – Distributed Systems Dr. Ken Cosh Week 1.

Course Description This course provides an introduction to the basic issues in the design and implementation of distributed systems. Topics include communication, processes, naming, synchronisation, consistency and replication, fault tolerance and security.

Course Objectives On completion of this course students will be able to: – 3.1 Discuss key elements to consider when managing Distributed Systems, such as security, fault tolerance, consistency and replication. – 3.2 Compare differences between different Object Based Systems, File Systems, Web Based Systems and Co-ordination Based Systems.

References 1) (Compulsory) Distributed Systems, Principles and Paradigms, 2nd Edition, Andrew S. Tanenbaum & Maarten Van Steen, 2007. 2) Distributed Systems, Concepts and Design, 4th Edition, George Coulouris, Jean Dollimore, Tim Kindberg, 2005.

Topics Introduction Architectures Processes Communication Naming Synchronisation Consistency / Replication Fault Tolerance Security Example Systems

Assessment 1. Quizzes and Presentations: 30% 2. Midterm exam: 30% 3. Final exam: 40%

Course Info. Mon / Wed 12:30-14:00 Room PC319 Office Hours: By Appointment NOTE: Plagiarism = 0.

What is a Distributed System? “A distributed system is a collection of independent computers that appears to its users as a single coherent system.” (Tanenbaum) “Hardware or software components located at networked computers communicate and coordinate their actions only by passing messages” (Coulouris)

Key Features Components that are autonomous Users think they are dealing with a single system This requires some collaboration Note: The challenges involved are independent of the type of computers used.

Characteristics of DS How it works is hidden from user. Interaction is consistent & uniform Scalability Continuously available, even if some parts are out of order

Layered Architecture Commonly implemented through layers & middleware. [Figure: Applications A, B and C sit on top of a Distributed System Layer (Middleware), which spans Local OS 1 to Local OS 4 on separate computers connected by a Network.]

Goals Make Resources Available Hide the fact that resources are distributed – Distribution Transparency Be Open Be Scalable

Make Resources Available E.g. Printers, storage facilities, data, files, webpages, networks etc. – For economic reasons – For collaboration reasons – To create virtual organisations This produces challenges – Security – Privacy

Distribution Transparency An important goal of distributed systems is to hide the fact that processes / resources are physically distributed Enabling users to use the system without worrying about where the resources are. Access Transparency Location Transparency Migration Transparency Relocation Transparency Replication Transparency Concurrency Transparency Failure Transparency

Access Transparency Different resources may represent data in different formats, but this shouldn’t be an issue for the user. – A user on an Intel workstation sending data to a Sun SPARC machine shouldn’t be concerned that Intel uses little-endian byte order (low-order byte first) while SPARC uses big-endian byte order (high-order byte first). Different file naming conventions, such as ‘/’ or ‘\’ as the path separator, should also not be of concern to the user.
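The byte-order difference can be made concrete with a short Python sketch. It uses the standard `struct` module; the helpers `send_u32` and `recv_u32` are hypothetical names introduced here to show one common way of hiding the difference, namely agreeing on a single wire format (network byte order, which is big-endian).

```python
import struct

value = 0x12345678  # a 32-bit integer to exchange between machines

# Little-endian layout (e.g. Intel x86): least significant byte first.
little = struct.pack("<I", value)
# Big-endian layout (e.g. SPARC): most significant byte first.
big = struct.pack(">I", value)


def send_u32(n: int) -> bytes:
    """Hypothetical helper: always encode in network (big-endian) order."""
    return struct.pack("!I", n)


def recv_u32(data: bytes) -> int:
    """Hypothetical helper: decode from network order on any host."""
    return struct.unpack("!I", data)[0]


# The same bytes decode to the same value regardless of the host's native order.
round_trip = recv_u32(send_u32(value))
```

Because both sides convert to and from the agreed format, neither the sender nor the receiver needs to know the other's native representation.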

Location Transparency Location transparency means that the physical position of a resource is hidden from the user. This is normally achieved through naming, where only logical names are used; – http://cis.payap.ac.th/index.php Where is it (physically)? Has it always been there?

Migration / Relocation Transparency In the previous web address, you have no idea whether index.php has always been on the cis.payap.ac.th server, or when it might have moved there. If resources can be moved without affecting the way the resource is accessed then migration transparency is provided. If that movement occurs while the resource is being accessed, then relocation transparency is provided. Consider moving around using a wireless laptop.
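A minimal sketch of how logical names support location and migration transparency. The `NameService` class, the name "cis-web" and the addresses (taken from the reserved documentation range 192.0.2.0/24) are all assumptions made for illustration; a real system would use something like DNS.

```python
class NameService:
    """Hypothetical registry mapping logical names to physical addresses."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        # Registering an existing name again models moving the resource.
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


names = NameService()
names.register("cis-web", ("192.0.2.10", 80))


def fetch(name):
    """Clients only ever use the logical name, never the address."""
    host, port = names.resolve(name)
    return f"connecting to {host}:{port}"


before = fetch("cis-web")
names.register("cis-web", ("192.0.2.99", 80))  # the resource migrates
after = fetch("cis-web")  # same call, same name, new location
```

The client code is identical before and after the move; only the name service's table changed.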

Replication Transparency The efficiency of distributed systems can be improved greatly by locating replicas (copies) of a resource physically closer to a user. Replication transparency enables the system to do this, without the user knowing they are using a replica.

Concurrency Transparency A goal of distributed systems is often sharing of resources between users. These users may wish to access or even update the same data at the same time (concurrently). An important challenge when designing distributed systems is how to deal with concurrent accesses. – How to maintain consistency when different users use the same resource in different ways.
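As a single-machine analogy, the sketch below lets several threads (standing in for concurrent users) update one shared value. The `SharedCounter` class is hypothetical; the lock serialises updates so that no increment is lost, which is one simple way of keeping a shared resource consistent. Distributed systems need more elaborate mechanisms, but the goal is the same.

```python
import threading


class SharedCounter:
    """Hypothetical shared resource updated by many users at once."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write step indivisible.
        with self._lock:
            self.value += 1


def user_session(counter, updates):
    for _ in range(updates):
        counter.increment()


counter = SharedCounter()
threads = [
    threading.Thread(target=user_session, args=(counter, 1000))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock in place, all 8 * 1000 updates are counted.
```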

Failure Transparency “You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done!” Failure transparency tries to mask failures such as this. It is difficult to distinguish between a resource that has failed and a resource which is performing badly (slowly). – Consider opening a webpage - is it dead or painfully slow, how long should the browser wait?
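The "dead or just slow?" problem can be sketched with a timeout. Here `slow_server` is a hypothetical stand-in for a remote resource and `request` is an illustrative helper; the point is that when the timeout expires, the client only knows that no reply arrived in time, not why.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_server():
    """Hypothetical remote resource that takes 0.3 s to answer."""
    time.sleep(0.3)
    return "page"


def request(server, timeout):
    """Return the reply, or None if nothing arrived within `timeout` seconds."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(server)
        try:
            return future.result(timeout=timeout)
        except TimeoutError:
            # From the client's side a crashed server and a slow one look the same.
            return None


impatient = request(slow_server, timeout=0.05)  # gives up too early
patient = request(slow_server, timeout=1.0)     # waits long enough
```

Choosing the timeout is a trade-off: too short and a healthy but slow server is treated as failed, too long and the user waits on a server that will never answer.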

Complete Transparency? Complete Transparency isn’t always completely necessary. – E.g. daily newspaper arriving at 7am regardless of location in the world. Nor is it always possible. – Physics behind signal transmission.

Openness A further goal of distributed systems is openness - that any resource conforms to a set of open standards. Doing so enables different parts of the system to make use of required services. This is normally achieved through modules which offer services which are specified through interfaces, using a standard IDL (Interface Definition Language). The IDL specifies the syntax of the services (names, parameters, return values); harder to specify are the semantics of what the services actually do.
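Python has no IDL, so the sketch below uses an abstract base class as a rough stand-in for an interface definition. `PrintService`, `LaserPrinter` and `PdfWriter` are hypothetical names. The interface fixes the syntax (method name, parameter, return type), while the behaviour is only described informally in a docstring.

```python
from abc import ABC, abstractmethod


class PrintService(ABC):
    """Stand-in for an IDL interface: it fixes the syntax only."""

    @abstractmethod
    def submit(self, document: str) -> int:
        """Queue a document and return a job id."""


class LaserPrinter(PrintService):
    def __init__(self):
        self._jobs = []

    def submit(self, document: str) -> int:
        self._jobs.append(document)
        return len(self._jobs)  # job ids count up from 1


class PdfWriter(PrintService):
    def submit(self, document: str) -> int:
        return 0  # syntactically valid, but every job gets the same id


def client(service: PrintService) -> int:
    """Works with any implementation of the interface."""
    return service.submit("report")


laser_id = client(LaserPrinter())
pdf_id = client(PdfWriter())
```

Both classes satisfy the interface, yet they disagree on whether job ids are unique. That is a question of semantics, which the signature alone cannot express.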

Openness Interface specifications should be complete and neutral, which makes systems interoperable and portable; – Interoperability refers to how well 2 different systems (possibly from different manufacturers) can co-exist, making use of each other’s services. – Portability refers to whether an application written for system A can run, without modification, on a system B that implements the same interfaces.

Openness Another feature of open systems is flexibility. Systems should be flexible to enable users to specialise their interactions without affecting other users or components. Flexibility is often achieved through designing systems as a collection of small, replaceable or adaptable components.

Scalability A further goal of Distributed Systems is that they should be scalable - that is that they can grow; – Scalable by size; more users or resources can be added to the system. – Scalable by location; resources and users may be physically distant. – Scalable by administration; the system remains easy to manage as it grows.

Scalability One problem often encountered when dealing with scalability is dealing with centralisation. – Centralised services – Centralised data – Centralised algorithms Imagine how the internet would work if there was only one single DNS table, and every address resolution request had to be directed through that computer.

Scalability Another problem affecting scalability is whether synchronous communication is actually practical. – Many existing systems were designed for synchronous communication, leaving a ‘client’ blocked until a reply is sent back. – The laws of physics (including the speed of light) limit the speed of communication between physically distant resources.
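A back-of-the-envelope calculation shows why distance matters. The sketch assumes signals travel at the speed of light in a vacuum, which is an optimistic bound (real links are slower), and the 10,000 km distance is a hypothetical example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in a vacuum; real networks are slower


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on the round-trip time between two sites, in milliseconds."""
    one_way_s = (distance_km * 1000) / SPEED_OF_LIGHT_M_PER_S
    return 2 * one_way_s * 1000


# Hypothetical client and server 10,000 km apart: about 67 ms at best.
rtt_ms = min_round_trip_ms(10_000)
```

A client that blocks on every request therefore cannot complete more than roughly 15 such exchanges per second over this distance, however fast the computers at either end are.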

Scalability & Administration What happens when a system needs to scale across multiple, independent administrative domains? – Conflicting policies on resource usage, management and security

Solving Scalability (briefly & currently) Hiding Communication Latencies – Essentially asynchronous communication. Instead of waiting for a reply, a special handler (thread) completes the previously issued request when the reply arrives. Distribution – Splitting a component into smaller parts – e.g. DNS splits the name space into zones such as .com, .th, .edu etc. Replication – For example caching. A copy of the data closer to the request.
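Hiding communication latency can be sketched with futures from Python's standard library. `remote_call` is a hypothetical stand-in that sleeps to simulate network delay; the client issues all of its requests up front, carries on with local work, and only collects the replies afterwards.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(n):
    """Hypothetical remote service: 0.1 s of simulated network latency."""
    time.sleep(0.1)
    return n * n


with ThreadPoolExecutor(max_workers=5) as pool:
    # Issue every request without waiting for any reply.
    futures = [pool.submit(remote_call, n) for n in range(5)]

    # The client is free to do useful local work in the meantime.
    local_work = sum(range(1000))

    # Collect replies once they are actually needed.
    results = [f.result() for f in futures]
```

Issued one after another in blocking style, the five calls would take about five times the latency of one; issued together, the waits overlap.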

Replication & Scalability Replication can have a negative effect on Scalability – Consistency Problems: when one copy is updated, the other copies become out of date – How big a problem is this?
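The consistency problem can be seen in a toy cache. `Primary` and `CachingReplica` are hypothetical classes; the replica answers reads from its local copy, so it keeps serving the old value after the primary is updated until it is told to invalidate.

```python
class Primary:
    """Hypothetical authoritative copy of the data."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data[key]


class CachingReplica:
    """Hypothetical replica near the user; caches what it has read."""

    def __init__(self, primary):
        self._primary = primary
        self._cache = {}

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._primary.read(key)
        return self._cache[key]

    def invalidate(self, key):
        self._cache.pop(key, None)


primary = Primary()
primary.write("price", 100)
replica = CachingReplica(primary)

first = replica.read("price")   # fetched from the primary, then cached
primary.write("price", 120)     # the primary changes
stale = replica.read("price")   # the replica still returns the old value
replica.invalidate("price")
fresh = replica.read("price")   # refetched after invalidation
```

Keeping replicas consistent means sending invalidations or updates to every copy, and that extra traffic is exactly what limits how far replication scales.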

Complexity Clearly designing a DS is a complex task. Some common false assumptions adding to complexity: – The network is reliable – The network is secure – The network is homogeneous – The topology doesn’t change – Latency is zero – Bandwidth is infinite – Transport cost is zero – There is one administrator

Examples of DS Distributed Computing Systems – Cluster Computing – Grid Computing Distributed Information Systems – Transaction Processing Systems – Enterprise Application Integration Distributed Pervasive Systems

Distributed Computing Systems For high performance computing tasks. When the price/performance ratio of PCs and workstations improved, it became financially & technically attractive to build supercomputers by hooking up a collection of simple computers on a high speed network.

Cluster Computing Homogeneous hardware Master node handles allocation of tasks and user interface E.g. Beowulf Linux clusters

Grid Computing Heterogeneous Hardware – No assumptions about hardware, OS, Networks, Administrative domains, security policies Resources from different organisations are brought together to allow collaboration – essentially realising a virtual organisation. – Towards Service Oriented Architectures

Distributed Information Systems When Business Information Systems moved into a networked environment. – Sharing data between functional units – Sharing functionality both internally and externally

Transaction Processing Systems Consider a transaction as an operation on a database. – Handled through Remote Procedure Calls (RPCs) Each transaction should have 4 characteristics (ACID) – Atomic – Consistent – Isolated – Durable

ACID Atomic – Either the whole transaction happens, or none of it. Consistent – Certain invariants must remain true – e.g. the total amount of money in a bank must remain the same before and after internal transfers (even if momentarily during the transaction this isn’t true). Isolated – Two concurrently running transactions should not interfere with each other. Durable – Once a transaction commits, there is no going back.
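Atomicity and the bank-transfer invariant can be tried out with Python's built-in `sqlite3` module. This is a single-node sketch rather than a distributed transaction; the `account` table, the names and the `transfer` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def balance_of(name):
    row = conn.execute("SELECT balance FROM account WHERE name = ?", (name,))
    return row.fetchone()[0]


def total():
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(src, dst, amount):
    """Move money atomically; return False if the transfer was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if balance_of(src) < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


ok = transfer("alice", "bob", 30)         # succeeds: alice 70, bob 80
rejected = transfer("alice", "bob", 500)  # fails part-way and is undone
```

The rejected transfer had already debited the source account when it failed, but the rollback undoes that step, so the total across accounts is unchanged.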

Enterprise Application Integration Applications are built on top of databases – separated from the databases. – So these applications may need to communicate with each other. Which leads to different communication middleware – RPC – Remote Method Invocations (RMI)

Distributed Pervasive Systems Thus far systems have been ‘stable’, i.e. relatively permanent fixed nodes with high quality connections. – Pervasive systems integrate mobile / embedded computing devices. Small, battery-powered, mobile, wirelessly connected nodes which blend into their environment. – Nodes should be able to discover local services and react accordingly E.g. Home Systems, Electronic Health Care Systems, Sensor Networks
