Distributed Systems Introduction and background Mohan Kumar CSCI652.002 Spring 2014 B&K1.


1 Distributed Systems Introduction and background Mohan Kumar CSCI652.002 Spring 2014 B&K1

2 Course information: http://www.cs.rit.edu/~hpb/Lectures/20135/652/index.html

3 Requirements: CSCI-352 Operating Systems or equivalent, and CSCI-603 Advanced C++ and Program Design or equivalent.

4 Course Content: Issues and challenges in distributed systems, including communication, distributed processes, naming and name services, synchronization, consistency and replication, transactions, fault tolerance and recovery, security, distributed objects, and distributed file systems.

5 Outcomes
Build a solid foundation in distributed systems.
– Understand fundamental concepts of distributed computing systems.
– Understand modern distributed systems: P2P, mobile, pervasive, sensor, etc.
– Recognize the importance of addressing challenges in modern systems to facilitate distributed computing.
– Develop distributed programs on real systems.
– More (you tell us at the end of the semester).

6 Attendance
Class participation: active participation will prepare students for the midterms. Students are expected to interact actively during lectures, solve homework problems, and engage in class discussions.

7 Course material
– Slides by Coulouris et al.: www.cdk4.net
– PowerPoint slides and whiteboard notes prepared by the professors
  – Students are expected to read the corresponding textbook chapters prior to each class (please see the tentative schedule).
  – PPT slides prepared by the professors may or may not be available before class, but they will be made available after class.
– Reference books and articles

8 Course organization
The course has two main themes.
– Distributed algorithms: distributed processes/objects, interprocess communication, remote procedure call, coordination, file systems, clocks and global states, security, concurrency, shared memory, transactions and replication.
– Systems: operating systems, distributed file systems, name services, case studies, implementations, P2P, security, and the Plan 9 system.

9 Textbook and References
Textbook
– Distributed Systems: Concepts and Design, George Coulouris, Jean Dollimore and Tim Kindberg, Addison-Wesley, 4th Edition, 5th Edition (an e-version of the 5th edition is available on Kindle).
References
– Distributed Systems: Principles and Paradigms, A. S. Tanenbaum and M. van Steen, Pearson Publishers, 2nd Edition.
– Distributed Operating Systems & Algorithms, R. Chow and T. Johnson, Addison-Wesley, 1997.
– Related articles: details will be provided during the course.

10 Grading
The structure of quizzes will be discussed in class at least one week prior to the quiz.
– Midterm 1: 15%
– Midterm 2: 15%
– Final Exam: 30%
– Group Work (project, presentation, report and class participation): 40%
Group Presentations: will be scheduled during the last week of the semester.
Group Work Reports: due at 9 am on May 10, 2014.
Groups: each group will have 3 members; groups are to be formed before February 15.
Group Work: problems will be assigned by February 25, and the expected date of completion is May 10.

11 What is a distributed system?
– Concurrent components: independent; use message passing to communicate and coordinate
– Lack of a global clock: asynchronous
– Independent failures of components: good for fault tolerance

12 “A distributed system is a collection of independent computers that appears to its users as a single coherent system” (Tanenbaum and Van Steen, Distributed Systems, 2007).
Application developers can focus on developing applications rather than on system issues.
The distributed system should be:
– Easy to expand or scale
– Available all the time
– Accessible uniformly
– Fault-tolerant

13 Layered representation (figure)
Layers, in the order listed on the slide: Applications and services; Middleware; Operating System; Communications Network; Hardware.
Figure labels: PLATFORM; mask heterogeneity; provide abstraction, transparency; uniformity.

14 Motivation
– Resource sharing: CPU, disk, software services, databases
– Fault tolerance: redundancy, replication

15 Challenges: heterogeneity; transparency and openness; security and privacy; scalability; failure handling; concurrency of components.

16 Modern Distributed Systems
– Mobility: wireless communications (Wi-Fi, Bluetooth, Zigbee, LTE, WiMAX, cellular)
– Ubiquity: small but multifunctional devices (cell phones, sensors, RFIDs)
– Large scale: components, data, users

17 Enablers
– Computer technology: advanced microprocessors; multi-core architectures; lower costs (CPU, memory, peripheral devices)
– High-speed networks: wired and wireless
– Applications: business, scientific, everything else

18 Examples of Distributed Systems
– The Internet
– Intranets
– Mobile and ubiquitous systems
– Grid computers
– Pervasive systems
– Sensor systems
– P2P networks
Also listed on the slide: airlines, aircraft, car, building, university.

19 Recent Developments
– Wireless ad hoc networking: novel algorithms and schemes developed; cooperation in the absence of infrastructure
– Pervasive computing: context-aware services to users/applications; smart environments
– Distributed resources: mobile devices possess a myriad of resources
– Opportunistic communications: exchange of packets/bundles
– Social networks and computing: exploit the gregarious nature of humans

20 Fading Distinctions
– Servers and clients: distributed systems, P2P systems; cost and time
– Producers and consumers of information: users are producers of information as well (e.g., a user with a cell phone camera)
– Service providers and consumers: resources on user devices can be exploited
– Resourceful and resource-poor entities: servers, desktops, laptops, mobile phones; grid computing; cyber foraging
The challenge is to provide a uniform view.

22 Concurrency: program execution; access to resources; message passing; coordination; resource sharing; coordination of concurrently executing programs.

23 No Global Clock
– Clocks of different components are not synchronized
– Asynchronous
– Concurrent programs coordinate their actions by passing messages

24 Event ordering
Lamport’s logical ordering:
– X sends m1 before Y receives m1.
– Y sends m2 before X receives m2.
– m2 is a reply to m1, and replies are sent only after the message has been received, so Y receives m1 before sending m2.
– Together: X sends m1, Y receives m1, Y sends m2, X receives m2.
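
The ordering argument on this slide can be sketched with a logical clock. The following is a minimal illustration, assuming the usual counter-based rule (increment on local events, take the maximum on receipt); the class and function names are hypothetical and not taken from the lecture.

```python
class LamportClock:
    """Logical clock: a counter that respects the happened-before order."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (including a send): advance the counter.
        self.time += 1
        return self.time

    def receive(self, sent_timestamp):
        # On receipt, move past the sender's timestamp, then count the receive.
        self.time = max(self.time, sent_timestamp) + 1
        return self.time


def exchange():
    """Replay the slide's scenario: X sends m1 to Y, and Y replies with m2."""
    x, y = LamportClock(), LamportClock()
    send_m1 = x.tick()            # X sends m1
    recv_m1 = y.receive(send_m1)  # Y receives m1
    send_m2 = y.tick()            # Y sends m2 (the reply)
    recv_m2 = x.receive(send_m2)  # X receives m2
    return send_m1, recv_m1, send_m2, recv_m2


print(exchange())  # (1, 2, 3, 4)
```

The timestamps increase along the chain send(m1), receive(m1), send(m2), receive(m2), even though X and Y never consult a shared physical clock.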

25 Time services
Global time consensus is needed to:
– Coordinate distributed activities: file backup; expiration time of a received message/data
– Handle event-related activities: when an event occurs or has already occurred; how long it took; which event occurred first

26 Clocks
– Physical clock: an approximation of real time
– Logical clock: preserves the ordering of events

27 Independent Failures
Distributed systems can fail in multiple ways:
– CPU/memory of one or more components
– Network link(s)
– Programs might stop executing (e.g., input/output, synchronization)
– System components may get isolated

28 Resource sharing
Hierarchy:
– Processors, disks
– Shared data
– Shared webpages
– Search engine, weather channel, currency converter

29 Services
– Manage resources
– Present functionalities of resources to users and applications, in a form that is coherent to them
– Examples: file service, mail service, FTP service
– Client-server architectures: the service may access resources remotely; clients connect to servers to utilize services

30 Basic applications
– Remote login: keyboard and display interface; virtual terminal support (telnet, rlogin)
– File transfer: file, file structures, file attributes (e.g., FTP)
– Messaging: send and receive; email, SMTP
– Browsing: information retrieval
– Remote execution: execute a program on a remote server (e.g., MIME, Multipurpose Internet Mail Extensions)

31 System models
– Architectural models: client-server model; peer-to-peer model
– Functional models: interaction model; failure model; security model

32 Architecture
Structural organization of various components; a simple abstraction of components with two main objectives:
– Placements: network topology; data distribution
– Interrelationships: patterns of communications; relationships between data objects; data access patterns, dependencies

33 Peer-to-peer and client/server variations
Peer-to-peer
– No distinction among peers
– Excellent scalability compared to client-server (C-S)
– Resources are utilized across a distributed network, and more efficiently; bottleneck points are minimized
Variations
– Multiple servers: each server specializes in providing a particular service (e.g., web servers, DNS server, authentication)
– Proxy servers: enhance availability; reduce latency
– Caches: objects are cached to reduce latency
– Mobile code and mobile agents
  – Mobile code (e.g., an applet) is downloaded to the client’s site: local interactions, fast response as there are no communication delays
  – Mobile agents include code and data: they move around and execute on different processors

34 Goals
– Efficiency: propagation delays, communications; overlapped computation/communication; efficient distributed processing and load sharing
– Flexibility: user friendly; ability to evolve and migrate; modularity, scalability, portability, and interoperability
– Consistency: predictability and uniformity in system behavior; integrity in concurrency control and failure handling
– Robustness: ability to handle exceptional situations and errors (change in topology, lost message, crashed system, etc.); reliability, protection and access control; secure and privacy preserving

35 Design requirements
Performance
– Responsiveness: access to shared resources
– Communication delays
– Server loads, scheduling, wait periods
– Control switching
– Load balancing
– Combined computation/communication scheduling
– Scalability
– Fault tolerance

36 Transparency
Ability to hide/mask all system details from users and application developers:
– System details are irrelevant to users/developers
– System details are very relevant to system managers
Creates the illusion of the model the system is supposed to present.
This is in contrast to the everyday English meaning of transparency: open, visible, see-through, etc.
(Figure: the layered representation from slide 13.)

37 Basic Processes
– Server: accepts inputs from other processes; performs a service; returns outcomes
– Client: user/application level; makes requests, receives results
– The roles of server and client may change with time
– Peer: all are equal

38 Processes
A process is a program in execution.
– Sequential: a single control block regulates the execution
  – A control block contains state information: program counters, register contents, stack pointers, communication ports, file descriptors, etc.
  – Process control block (PCB)
– Concurrent: simultaneously interacting sequential processes are said to be concurrent
  – Asynchronous
  – Separate address spaces and PCBs
  – Components may interact through communication/synchronization
(Figure: three processes, each with its own PCB.)

39 Threads
A thread is a lightweight process.
– Threads of a process share the same address space but have their own registers
– A thread control block (TCB) is local to a thread
– Typically, threads have their own PC, SP and register set, and share the address space, communication ports and file descriptors
– Multiple threads are spawned by a process
– A PCB is shared among interacting threads
– Context switching among threads is lightweight compared to context switching among processes
(Figure labels: PCBs with their threads and TCBs; thread run-time library support; operating system support.)
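
As a small illustration of threads sharing one address space, the sketch below spawns several threads that update the same in-memory counter. It uses Python's standard `threading` module; the names `counter`, `worker`, and `run` are made up for this example, and the lock is there because unsynchronized updates to shared state can interleave.

```python
import threading

# Shared state: lives in the single address space of the process,
# so every thread sees the same object.
counter = {"value": 0}
lock = threading.Lock()


def worker(increments):
    # Each thread has its own stack and locals (such as `increments`),
    # but updates the shared counter under a lock.
    for _ in range(increments):
        with lock:
            counter["value"] += 1


def run(num_threads=4, increments=1000):
    counter["value"] = 0
    threads = [threading.Thread(target=worker, args=(increments,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every spawned thread to finish
    return counter["value"]


print(run())  # 4000
```

Separate processes would each get their own copy of `counter`; sharing it between processes would require explicit communication or shared-memory support.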

40 Interaction model
Process interactions
– C-S, P2P, message passing, shared space, synchronous, asynchronous
– Single process/thread, multiple threads
Distributed algorithms
– Behavior of multiple processes
– Includes message transmissions
– Each process has its own PCB, which is inaccessible to other processes; is likely to be executing on a different system in the network; and is difficult to coordinate
Two significant factors
– Communication performance
– Maintenance of global state: computer clocks drift, and clock drifts differ from one another

41 Performance of communication channels
– Latency: time taken for a message to arrive at the destination; delay in accessing the network; delay due to processing by OS communication services at both ends
– Bandwidth: frequency; interference; channel sharing
– Jitter: variation in the times taken to deliver different components of a message
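
A rough worked example of how these quantities combine, assuming the common first-order model in which transfer time is latency plus message size divided by bandwidth, and taking jitter as the spread between the slowest and fastest delivery. Both formulas and the function names are simplifying assumptions for illustration, not definitions from the slides.

```python
def transfer_time(latency_s, size_bits, bandwidth_bps):
    """First-order estimate: fixed latency plus time to push the bits out."""
    return latency_s + size_bits / bandwidth_bps


def jitter(delays_s):
    """One simple jitter measure: spread between slowest and fastest delivery."""
    return max(delays_s) - min(delays_s)


# A 1 MB (8,000,000-bit) message over a 100 Mbit/s channel with 5 ms latency.
print(transfer_time(0.005, 8_000_000, 100_000_000))

# Delivery times of three parts of one message, in seconds.
print(jitter([0.010, 0.012, 0.011]))
```

With these inputs the transfer takes about 85 ms, of which only 5 ms is latency, and the jitter is about 2 ms.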

42 Two variants
Synchronous
– Process execution time is bounded
– Message latency over a channel is bounded
– Each process’s local clock drift is bounded
– Though difficult to build, very useful as a model: timeouts can be used to detect failures
Asynchronous
– The bounds (assumptions) listed above do NOT hold
– Most systems are asynchronous
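
The remark that timeouts detect failures in the synchronous model can be sketched as a heartbeat-based detector. This is a minimal illustration with hypothetical names; times are passed in explicitly so the example is deterministic. In an asynchronous system the same code can only suspect a process, since a late heartbeat is indistinguishable from a crash.

```python
class TimeoutFailureDetector:
    """Suspects any process whose last heartbeat is older than the timeout."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heard = {}  # process id -> time of last heartbeat

    def heartbeat(self, process_id, now_s):
        # Record that a message from process_id arrived at time now_s.
        self.last_heard[process_id] = now_s

    def suspected(self, now_s):
        # With bounded latency and execution time, silence longer than
        # the timeout means the process has failed.
        return sorted(p for p, t in self.last_heard.items()
                      if now_s - t > self.timeout_s)


detector = TimeoutFailureDetector(timeout_s=2.0)
detector.heartbeat("p1", 0.0)
detector.heartbeat("p2", 0.0)
detector.heartbeat("p1", 3.0)   # p1 keeps reporting; p2 goes silent
print(detector.suspected(4.0))  # ['p2']
```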

43 Failure model
– Omission failures: processor/process crash; communication failure/message drops
– Arbitrary failures: process setting wrong values in data; data corruption during transmission
– Timing failures: synchronous systems; real-time systems; clock, process, channel
– Masking failures: replication; service to mask failures
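
One way to picture masking failures through replication is a read that queries several replicas and returns the majority answer. The sketch below is an assumption-laden illustration: `masked_read` is a hypothetical helper, a reply of `None` stands for an omission failure, and a differing value stands for an arbitrary failure.

```python
from collections import Counter


def masked_read(replies):
    """Return the value reported by a strict majority of replicas.

    `replies` holds one entry per replica; None means the replica
    did not answer (omission failure).
    """
    answered = [r for r in replies if r is not None]
    if not answered:
        raise RuntimeError("no replica answered")
    value, votes = Counter(answered).most_common(1)[0]
    if votes <= len(replies) // 2:
        raise RuntimeError("no majority among replicas")
    return value


print(masked_read([42, None, 42]))  # 42: one crashed replica is masked
print(masked_read([7, 9, 7]))       # 7: one corrupted reply is outvoted
```

With three replicas this masks a single faulty replica; tolerating more faults needs more replicas.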

44 Security model
Protecting objects
– Who is allowed to access what data: check access rights, verify identity
Securing processes and interactions
– Processes: server, client, peer
– Communication channel: copying/altering of messages; injection of harmful messages; encryption, authentication, time stamping
– Denial of service
– Mobile code, mobile agents

48 Network Background: Slides from Kurose and Ross’s book will be used. Please read the book.

49 Networking review: Please read up on Chapter 4 or a networking book. I will cover only mobile and wireless networking.

50 Mobile IP

51 Mobile IP
– Triangle routing, indirect routing
– Direct routing
– Home agents
– Foreign agents
– Registrations: HA, FA, anchor FA
– Care-of address
– Encapsulation
– Agent discovery
– Registration
Abbreviations: TCP, Transmission Control Protocol; IP, Internet Protocol; BS, base station; MH, mobile host; CH, correspondent host; HA, home agent; FA, foreign agent.

52 TCP
– Transport layer protocol
– Reliable, uses ACKs
– Congestion control: adjusts to network conditions
– Error control: packets are buffered until ACKs are received; buffered packets are resent

53 Desired (in mobile systems)
– No disruption of services as the user moves and changes the point of attachment
– How to ensure this? Autonomous transfer; minimal delays and losses

54 Effects of Mobility
IP and Mobile IP
– IP: packets are routed to their destinations according to IP addresses; IP addresses are associated with a fixed network location.
– Mobile IP: packets may be destined for mobile nodes; seamless roaming for applications and users; mobility effects are shielded from applications and higher-level protocols.
TCP/IP was designed for wired networks, but it has survived in the wireless world, at least until now.

55 Effects of Mobility
TCP congestion control mechanism
– ACKs not received: slow start or other control mechanisms are triggered
– Window size is reduced
TCP congestion control mechanism in mobile environments
– When an MH hands off from one network, it does not receive packets until it registers at another network.
– In the meantime, the TCP mechanism at the sender assumes the packets have been lost and goes into congestion recovery mode.
– The congestion window size is reduced and/or packets are retransmitted.
– Overall effect: performance deterioration.
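
The performance hit described above can be seen in a toy model of the sender's congestion window. This is a heavily simplified sketch, assuming a Tahoe-style sender measured in whole segments: the window doubles per round trip during slow start, grows by one segment per round trip afterwards, and collapses to one segment on a timeout. The function name and event labels are hypothetical.

```python
def simulate_cwnd(events, ssthresh=16):
    """Trace the congestion window (in segments) over a list of events.

    "ack"     : one round trip in which all outstanding segments were ACKed
    "timeout" : ACKs did not arrive in time, e.g. during a handoff
    """
    cwnd = 1
    trace = []
    for event in events:
        if event == "ack":
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
            else:
                cwnd += 1                       # congestion avoidance: linear growth
        elif event == "timeout":
            ssthresh = max(cwnd // 2, 2)        # remember half the window
            cwnd = 1                            # restart from one segment
        else:
            raise ValueError(f"unknown event: {event}")
        trace.append(cwnd)
    return trace


# Five good round trips, a handoff that looks like loss, then three more.
print(simulate_cwnd(["ack"] * 5 + ["timeout"] + ["ack"] * 3))
# [2, 4, 8, 16, 17, 1, 2, 4, 8]
```

The handoff causes no congestion at all, yet the sender drops from 17 segments per round trip to 1 and needs several round trips to recover.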

56 Encapsulation/Tunneling
– Messages originating at the CH carry the home (original) address of the MH.
– The HA encapsulates the message with the address of the FA in the foreign network and forwards the packet to the foreign network.
– The FA peels off the ‘new address’ and forwards the original packet to the MH in the foreign network.
– This process of adding and peeling off care-of addresses is called tunneling or encapsulation.
(Figure labels: Original Address, New Address.)
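
The wrap-and-unwrap steps above can be sketched with a small packet type. This is only an illustration: the `Packet` fields, the helper names, and the address strings are hypothetical, and real Mobile IP encapsulation operates on IP headers rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # application data, or an inner Packet when tunneled


def encapsulate(packet, home_agent_addr, care_of_addr):
    """HA side: wrap the original packet in an outer one addressed to the FA."""
    return Packet(src=home_agent_addr, dst=care_of_addr, payload=packet)


def decapsulate(outer):
    """FA side: peel off the outer header and recover the original packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunneled packet")
    return outer.payload


# The CH addresses the MH by its home address; it knows nothing about the FA.
original = Packet(src="CH", dst="MH-home-address", payload="hello")
tunneled = encapsulate(original, home_agent_addr="HA", care_of_addr="FA")
print(tunneled.dst)                       # FA
print(decapsulate(tunneled) == original)  # True
```

The inner packet travels unchanged, which is why neither the CH nor the applications on the MH need to know that the MH has moved.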

57 Split Connections
– Split at the BS
– Selective ACK of out-of-sequence packets
– Mobile TCP
The TCP/IP connection is split at the BS. The BS ACKs packets, buffers them, and forwards them to the MH.
(Figure labels: CH, core network, BS, MH.)

58 Supervising host
One host acts as a controller in the core network and keeps track of CHs and MHs.
– The supervising host (SH) resides in the wired network and keeps track of all the MHs.
– The SH is contacted for all correspondence related to MHs.
– The SH maintains a directory of MH locations.
– One can envision a set of distributed SHs catering to groups of MHs.
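
The directory kept by the supervising host can be pictured as a simple mapping from each MH to its current point of attachment. The sketch below is a minimal illustration; the class, its methods, and the identifiers are hypothetical.

```python
class SupervisingHost:
    """Keeps a directory of where each mobile host is currently attached."""

    def __init__(self):
        self.directory = {}  # MH id -> id of the BS currently serving it

    def register(self, mh_id, bs_id):
        # Called whenever an MH attaches to a (new) base station.
        self.directory[mh_id] = bs_id

    def locate(self, mh_id):
        # Called for correspondence addressed to an MH.
        if mh_id not in self.directory:
            raise KeyError(f"unknown mobile host: {mh_id}")
        return self.directory[mh_id]


sh = SupervisingHost()
sh.register("mh-1", "bs-A")
sh.register("mh-1", "bs-B")  # mh-1 moves to another cell
print(sh.locate("mh-1"))     # bs-B
```

A single SH is a potential bottleneck and single point of failure, which is one reason to consider the distributed set of SHs mentioned on the slide.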

59 Snoop protocol
Lower-layer solution
– Processing takes place between TCP and IP at the BS
– Packets are snooped (processed)
– The snoop module reads packet addresses to determine which packets have not been ACKed
– Facilitates retransmission at the BS; requires packets to be buffered at the BS
Multicast solution
– The MH uses a multicast address as the care-of address
– All BSs the MH has been (and will be) in contact with are invited to be members of the multicast group
– The BS where the MH currently resides forwards the packets; the remaining BSs discard them
(Figure labels: TCP, Snoop, IP; MH, BS, CH.)
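
The buffering and local retransmission at the BS can be sketched as follows. This is a simplified illustration rather than the full protocol: the `SnoopAgent` class is hypothetical, sequence numbers count whole packets, and an ACK is assumed to carry the highest sequence number received in order (real TCP ACKs carry the next expected byte).

```python
class SnoopAgent:
    """Caches unacknowledged packets at the BS and retransmits them locally."""

    def __init__(self):
        self.cache = {}     # seq -> payload, for packets not yet ACKed by the MH
        self.last_ack = -1  # highest cumulative ACK seen from the MH

    def on_data_from_ch(self, seq, payload):
        # Buffer a copy before forwarding over the wireless link.
        self.cache[seq] = payload
        return ("forward_to_mh", seq)

    def on_ack_from_mh(self, ack):
        if ack > self.last_ack:
            # New ACK: everything up to `ack` has arrived, so free the buffers.
            for seq in [s for s in self.cache if s <= ack]:
                del self.cache[seq]
            self.last_ack = ack
            return ("forward_ack_to_ch", ack)
        # Duplicate ACK: packet ack + 1 was probably lost on the wireless hop.
        missing = ack + 1
        if missing in self.cache:
            # Resend from the BS and hide the duplicate ACK from the CH.
            return ("retransmit_to_mh", missing)
        return ("forward_ack_to_ch", ack)


agent = SnoopAgent()
for seq, data in enumerate(["a", "b", "c"]):
    agent.on_data_from_ch(seq, data)
print(agent.on_ack_from_mh(0))  # ('forward_ack_to_ch', 0)
print(agent.on_ack_from_mh(0))  # ('retransmit_to_mh', 1)
```

Because the duplicate ACK never reaches the CH, the sender's congestion control is not triggered by a loss that happened only on the wireless link.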

