1
Distributed Systems 1. Introduction
Simon Razniewski Faculty of Computer Science Free University of Bozen-Bolzano A.Y. 2016/2017
2
Lecturer Lab Assistant
Simon Razniewski Florian Hofer Faculty of Computer Science FUB – POS 1.04 Piazza Domenicani 3 Office hours: Thursday 13:00-14:00 KRDB Research Center Faculty of Computer Science FUB – POS 2.08 Piazza Domenicani 3 Office hours: Thursday 8:30-10:30
4
Me
Diplom Informatik from TU Dresden, 2010
PhD from FUB, 2014
Since then researcher in the KRDB group
Involved in research on distributed querying and knowledge fusion
Taught this course in 2015 and 2016
Enjoyed it a lot, because DS are ubiquitous, interesting and easy
5
You Survey!
6
Why study DS?
Learn how to make resources accessible
  printers, storage, compute power, web pages
7
Why study DS? (2)
Learn to make information accessible
8
Why study DS? (3)
Learn to enable distant communication
  chat, Skype
9
Why study DS? (4)
Learn to solve tough problems
10
Goals
Understand principles and concepts underlying computer networks and distributed systems
  Layering, communication, coordination
Learn how to design reliable, cooperative (basic) distributed applications
Gain practical skills on the development of simple distributed systems (in Java)
Improve your software engineering expertise
11
Areas
Distributed systems hide the network complexity, but in order to understand and develop them, we need competencies in
  Computer networks
  Concurrency & inter-process communication
[Figure labels: Distributed Systems; Computer Networks; Concurrency & Inter-process Communication]
12
Course Structure
24 lectures (some will be very practical)
12 programming labs in 2 groups
  5 of these are with Raspberry Pis
5 assignments
13
Tentative Schedule Introduction RMI OSI Overview RMI (Lab)
Physical Layer (Guest lecture) Data Link Layer Bobo (Guest lecture) Link Layer (P&P Lab) RPI 2: Distributed Algorithms (Lab) Link Layer Simulation (Lab) P2P Medium Access Control Layer P2P (continued) Network Layer Theory RPI 3: Distributed MD5 Cracker (Lab) Wireshark I (Lab) P2P (Lab) Network Layer on the Internet Cryptography RPI 1: Networking basics (Lab) RPI 4: Distributed MD5 Cracker (Lab) Transport Layer RPI 5: Distributed MD5 Cracker (Lab) Transport Layer/Java Sockets Cryptography (P&P Lab) Java Sockets (Lab) Multiagents+AntMe Application Layer AntMe (Lab) Wireshark II (Lab) Coordination Conceptual Exercises (P&P Lab) AntMe competition Exam preparation
14
Software
Used languages/tools/environments:
  Java
  C# (MS Visual Studio 2012 or later)
  Wireshark
  Raspberry Pis provided by the university
Software is available on the virtual machines provided by the faculty
15
Assignments
1. Hamming Codes
2. Java Socket Chat
3. Java RMI Chat
4. Distributed MD5 Cracker on RPIs
5. Multiagent programming
You can work in teams of 2
First three assignments count 5% each, last two 10% each towards the final grade (total max. 35%)
Only assignments that are better than the exam grade are taken into account (assignments can only improve the grade)
Assignments are optional
16
Assignments are optional… Should I do them?
All but one of the students who submitted fewer than 3 assignments failed
All students who submitted at least 3 assignments passed
17
Plagiarism
Plagiarism: 0 points for all assignments + notification of the study course leader
What is plagiarism?
  Any copying of work of others (other students, other people on the web, yourself from last year)
  Minor copying is OK only if clearly indicated
What is worse than plagiarism?
  Attempting to conceal plagiarism
  Specialized software is used…
18
Exam
Allowed to bring
  One handwritten A4 cheatsheet (both sides)
    Develop it yourself and magic will happen: the things you put there, you won’t need to look up
  Dictionaries
  Pens
  Nothing else (no calculators, no textbooks, …)
Anything covered in the lectures and labs can be on the exam
  No coding questions
  But modelling/design questions are possible
You do not need to know everything (achieving ~90% of the points is enough to get grade 30)
Some exams from previous years are found online
20
Grading
Exam: 65%
Five assignments: up to 35%
  Only the assignments with a grade higher than the exam mark count
  Assignments only improve the overall grade
To pass the course, the exam grade has to be at least 18
Assignments are valid for all three exam sessions
21
Grading (2)
Final grade = (1 - AW) * E + AW * AG * ((E + 100) / 118)
  AW = weight of assignments that are at least as good as the exam grade
  AG = average grade of the assignments that are at least as good as the exam grade
  E = exam grade
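The grading formula can be tried out with a small Java sketch. It assumes that AW is given as a fraction (for example 0.20 for two assignments worth 10% each) and that E and AG are on the same grade scale; the slides do not state either, and the class name, method name and sample values are made up for illustration.

```java
// Sketch of the final-grade formula; scales and example values are assumptions.
final class GradeSketch {

    /**
     * Final grade = (1 - AW) * E + AW * AG * ((E + 100) / 118).
     *
     * @param examGrade        E, the exam grade
     * @param assignmentWeight AW, total weight of the counted assignments,
     *                         assumed to be a fraction between 0.0 and 0.35
     * @param assignmentAvg    AG, average grade of the counted assignments,
     *                         assumed to be on the same scale as E
     */
    static double finalGrade(double examGrade, double assignmentWeight, double assignmentAvg) {
        // The factor equals 1.0 for E = 18 and grows with the exam grade.
        double factor = (examGrade + 100.0) / 118.0;
        return (1.0 - assignmentWeight) * examGrade
                + assignmentWeight * assignmentAvg * factor;
    }

    public static void main(String[] args) {
        // Hypothetical student: exam 24, two counted assignments (10% each), average 28.
        System.out.println(finalGrade(24.0, 0.20, 28.0));
        // With no counted assignments the final grade is just the exam grade.
        System.out.println(finalGrade(24.0, 0.0, 0.0));
    }
}
```

Under these assumptions, a counted assignment can only raise the result, because it replaces part of the exam weight with a grade that is at least as good.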
22
Material
The first half of the course is covered in
  A.S. Tanenbaum. Computer Networks. Prentice Hall (there are also 5 copies in German in the library)
  J.F. Kurose, K.W. Ross. Computer Networking – A Top-Down Approach (5th edition). Pearson Education
For the second half, the best resources are the slides and online resources
Some content regarding RPC, P2P, Cryptography and Coordination is in
  A.S. Tanenbaum, M. van Steen. Distributed Systems: Principles and Paradigms. Prentice Hall.
  G. Coulouris, J. Dollimore, T. Kindberg. Distributed Systems: Concepts and Design. Addison-Wesley.
23
Acknowledgment
Part of the slides are based on:
  Slides by Tanenbaum
  Previous versions of this course by Nutt, Montali and Pirro
24
Questions?
25
Introduction to Distributed Systems
What is a distributed system?
History
Paradigms
26
Name some distributed systems
Collect examples
27
What is a Distributed System?
Ask for proposals for a definition
If you ask three computer scientists, you will get four different opinions
Switch on the RPI and ask whether it is a DS?
28
Distributed System
One definition: A collection of independent computers that appears to its users as a single coherent system
  Several computers
  Independence
  Heterogeneity of architectures
  Provides service
  Transparency (users perceive a single system)
Collaboration!
  Realizing collaboration mechanisms is the goal of distributed systems development
  Interaction is hidden
29
Alternative Definition
A distributed system is one where a machine I've never heard of can cause my program to fail (Leslie Lamport)
Abstraction: From the outside, a black box providing service interfaces
30
Requirements
A collection of independent computers that appears to its users as a single coherent system
1. Make resources/services accessible
2. Hide complexity (Transparency)
3. Openness (wrt. extensions and heterogeneous components)
4. Scalability (wrt. performance)
31
Requirement 1: Make Resources/Services Accessible
Resource: source or supply from which a benefit is produced
  printers, computers, files, pages, networks, project deliverables, notes, …
Services: computations, data enrichment, …
Issue: security
  Balance between sharing and privacy
  Difficult to achieve in “open” networks
32
Requirement 2: Transparency
Distributed system should present itself to users and applications as a single coherent system
Transparency: Description
  Access: Hide differences in data representation and how a resource is accessed (multiple interacting OSs)
  Location: Hide where a resource is located (naming strategies)
  Migration: Hide that a resource may move to another location
  Relocation: Migration while resource is in use
  Replication: Hide that a resource is replicated (requires location transparency)
  Concurrency: Hide that a resource may be shared by several competitive users
  Failure: Hide the failure and recovery of a resource
  Persistence: Hide whether a (software) resource is in memory or on disk
WhatsApp on both Android and iPhone
Concurrency: printer no, web server yes
Replication and concurrency: classical issues for databases
33
Requirement 2: Transparency
Sometimes transparency is impossible
  Performance
    Computation/interaction timings (e.g. Skype)
    Concurrency issues (e.g. Dropbox)
  Location-awareness (embedded/ubiquitous systems)
  Comprehensibility
    Network awareness clarifies “strange behaviors”
Properly tune the degree of transparency
Distributed system should present itself to users and applications as a single coherent system
34
Requirement 3: Openness
Open DS: offers services according to standard rules describing their syntax and semantics
  Interoperability: extent to which two systems/components can work together by just knowing each other’s interfaces
  Portability: to what extent an application can be seamlessly executed in a different environment
  Reconfiguration/Extensibility: to what extent the DS can be re-organized or accommodate new parts
Skype vs Bitcoin/BitTorrent: for BitTorrent you can write your own client!
35
Standardization
Important for:
  Sustainability
  Extendibility
  Security
Standardization efforts:
  ISO = International Organization for Standardization
  ITU-T = International Telecommunication Union, Telecommunication Standardization Sector
  IETF = Internet Engineering Task Force
  IEEE = Institute of Electrical and Electronics Engineers
  W3C = World Wide Web Consortium
36
Importance of Standards
Guarantee large-scale interoperation
Guarantee heterogeneity of products
Network standards: regulate interaction between remote hardware/software entities, connected through a network
Two types of standard
  De jure (top-down): international committees and organizations
  De facto (bottom-up): emerge from practice (UNIX, Java, TCP/IP, …)
Heterogeneity: Android
Draw timing
37
Openness Guidelines
High componentization
  Small replaceable/adaptable components with clear interfaces
Separation of logic and implementation
  Specify only the logic
  Leave implementation to the developers
38
Sample from the HTTP Protocol Specification
Negotiation
An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to maintain a persistent connection unless a Connection header including the connection-token "close" was sent in the request. If the server chooses to close the connection immediately after sending the response, it SHOULD send a Connection header including the connection-token close.
An HTTP/1.1 client MAY expect a connection to remain open, but would decide to keep it open based on whether the response from a server contains a Connection header with the connection-token close. In case the client does not want to maintain a connection for more than that request, it SHOULD send a Connection header including the connection-token close.
If either the client or the server sends the close token in the Connection header, that request becomes the last one for the connection.
Clients and servers SHOULD NOT assume that a persistent connection is maintained for HTTP versions less than 1.1 unless it is explicitly signaled.
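The negotiation rule in this excerpt can be read as a small decision procedure. The following Java sketch is one possible reading, not a full HTTP implementation: the class and method names are hypothetical, only the version strings "HTTP/1.1" and "HTTP/1.0" are handled, and treating the "keep-alive" token as the explicit signal for older versions is an assumption that the excerpt itself does not spell out.

```java
import java.util.Locale;

// Sketch of the persistent-connection rule from the excerpt; not a full HTTP parser.
final class PersistentConnectionSketch {

    /** True if the comma-separated Connection header value contains the token (case-insensitive). */
    static boolean hasToken(String connectionHeader, String lowerCaseToken) {
        if (connectionHeader == null) {
            return false; // header absent
        }
        for (String part : connectionHeader.split(",")) {
            if (part.trim().toLowerCase(Locale.ROOT).equals(lowerCaseToken)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides whether the connection may stay open after the current message.
     *
     * @param httpVersion      "HTTP/1.1" or an older version such as "HTTP/1.0"
     * @param connectionHeader value of the Connection header, or null if absent
     */
    static boolean staysOpen(String httpVersion, String connectionHeader) {
        if (hasToken(connectionHeader, "close")) {
            return false; // the close token makes this request the last one
        }
        if ("HTTP/1.1".equals(httpVersion)) {
            return true; // HTTP/1.1 may assume a persistent connection
        }
        // Older versions: persistent only if explicitly signaled.
        // Using "keep-alive" as that signal is an assumption of this sketch.
        return hasToken(connectionHeader, "keep-alive");
    }

    public static void main(String[] args) {
        System.out.println(staysOpen("HTTP/1.1", null));
        System.out.println(staysOpen("HTTP/1.1", "close"));
        System.out.println(staysOpen("HTTP/1.0", null));
    }
}
```

The specification only fixes this observable behavior; how a server or client stores and checks the header is left to the implementer, which is the separation of logic and implementation that open systems aim for.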
39
Requirement 4: Scalability
Scalable DS: operates effectively and efficiently independently of the number of resources and users
Dimensions of scalability (Neumann, 1994)
  Size: w.r.t. number of resources/users
  Geography: w.r.t. location of resources/users
  Administration: w.r.t. involved administrative silos
40
Scalability Problems (Bottlenecks)
Concept: Example
  Centralized services: A single server for all users (legacy systems, security reasons)
  Centralized data: A single database (avoiding synchronization issues)
  Centralized algorithms: Doing routing based on complete information
Decentralized algorithms
  No machine holds the complete system state
  Machines take decisions only on local info
  The algorithm is robust to machine failure
Routing means network routing, like IP
41
Scalability Problems (Geography)
From LANs to WANs…
  LAN: short distances
  WAN: large distances
Synchronous communication (wait for answer): LAN yes, WAN no
Broadcasting (send to all): impossible in a WAN
Centralized solution: maybe
Broadcasting is needed for locating a service; in a WAN it is not possible, so other techniques are needed
42
Scalability Problems (Admin.)
Combination of two administrative domains…
Non-technical issues
  Politics, human relationships…
Conflicting policies
  Payment, management, security
Lack of mutual trust
  Internal code: extensively tested/used
  External code: unknown
Security issues
  Protection from malicious attacks coming from
    The new users that have access to the system
    The foreign code that is now part of the system
43
Scaling Techniques
1. Reduction of the overall communication
   Thin clients vs. fat clients
2. Caching
   Consistency and synchronization issues
3. Increase of computational power
   Works only for the size-related aspects
4. Increase of number of computational resources
   Temporary (cloud solutions)
   Works only if task can be well parallelized
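To make the caching technique and its consistency issue concrete, here is a minimal Java sketch of a client-side cache with a time-to-live. Everything in it is hypothetical: the class name, the `remoteLookup` function that stands in for a network call, and the injected clock that keeps the example deterministic. It is one simple expiry policy among many, not a recommended design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Sketch of a client-side cache with a time-to-live (TTL); all names are hypothetical.
final class TtlCacheSketch {

    private static final class Entry {
        final String value;
        final long fetchedAt;

        Entry(String value, long fetchedAt) {
            this.value = value;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, String> remoteLookup; // stands in for a network call
    private final LongSupplier clockMillis;               // injected so tests control time
    private final long ttlMillis;
    int remoteCalls = 0;                                  // counts simulated network round trips

    TtlCacheSketch(Function<String, String> remoteLookup, LongSupplier clockMillis, long ttlMillis) {
        this.remoteLookup = remoteLookup;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlMillis;
    }

    String get(String key) {
        long now = clockMillis.getAsLong();
        Entry cached = entries.get(key);
        if (cached != null && now - cached.fetchedAt < ttlMillis) {
            return cached.value; // served locally: no communication, but possibly stale
        }
        remoteCalls++;
        String fresh = remoteLookup.apply(key);
        entries.put(key, new Entry(fresh, now));
        return fresh;
    }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        server.put("page", "v1");
        long[] now = {0L};
        TtlCacheSketch cache = new TtlCacheSketch(server::get, () -> now[0], 1000L);

        System.out.println(cache.get("page")); // first access goes to the server
        server.put("page", "v2");              // the server copy changes
        now[0] = 500L;
        System.out.println(cache.get("page")); // still within the TTL: cached copy is returned
        now[0] = 1000L;
        System.out.println(cache.get("page")); // TTL expired: fetched again
    }
}
```

The second lookup saves a round trip but returns the old value even though the server has changed, which is the consistency problem the slide refers to. A shorter TTL reduces staleness at the cost of more communication.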
44
False Myths
The network is reliable
The network is secure
The network is homogeneous
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator
In the lecture we look at what can be done about many of these issues
45
Issues in Distributed Systems
Concurrency
  Multiple autonomous processes running in parallel
  Cooperation (resource sharing)
  Competitiveness (shared access)
No global clock
  Every interacting process has its own local time
  No synchronization
Independent failures
  Machine crash
  Network problems
Security
46
How about our examples?
Make resources/services accessible
Transparency
  Access, Location, Migration, Relocation, Replication, Concurrency, Failure
Openness
  Interoperability, Portability, Extensions
Scalability
  Size, Geography, Administration
47
2. History
48
Computers 1945 – mid 80s
Large-sized, expensive computers
Centralized computing (single-processor systems)
Monolithic programming
OS: monolithic kernel
Imitation Game
49
Computers around 1985
Powerful micro-processors
  Moore’s law: the number of transistors that can be placed on an integrated circuit doubles every 2 years
Computer (local-area) networks
  Connection among several machines in a building
  Small amount of data quickly transferable (μs)
Micro-kernels
  Functionalities moved into user space
LAN functionalities distributed across the network!
50
Centralized vs. Distributed Computing
51
Computers 1985-now
High-speed LANs
  100 Mbps – 10 Gbps
Wide area networks
  Connection at the earth level
  64 Kbps – 1 Gbps
Internet
  Network of networks
  Billions of users
52
Distributed Systems vs Computer Networks
Distributed system: transparency/coherence
  Middleware manages interaction (black box)
  Applications often run on top of a computer network
  E.g. WWW, multiplayer online games, …
Computer network: collection of autonomous computers interconnected by a single technology
  Interconnection = ability of exchanging information
    Usually via message passing
  Users exposed to the actual machines (white box)
  Applications support users in interconnecting machines
  E.g. remote desktop, file transfer, remote printers, …
To realize distributed systems we need to understand/program computer networks!
53
The Internet
A network of networks built upon TCP/IP
A basis for distributed systems (WWW, name services, VoIP, file sharing, …)
54
How Many People? http://www.internetlivestats.com/internet-users/
View them all, then click on one to see the colors
55
How Big is the Internet?
“Eric Schmidt, the CEO of Google, the world’s largest index of the Internet, estimated the size at roughly 5 million terabytes of data. That’s over 5 billion gigabytes of data, or 5 trillion megabytes. Schmidt further noted that in its seven years of operations, Google has indexed roughly 200 terabytes of that, or .004% of the total size.”
They index only the first part of large webpages!
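The figures in the quote can be checked with a few lines of Java. This sketch assumes decimal units (1 TB = 1000 GB = 1,000,000 MB); with binary units the totals would be slightly larger, which would fit the "over 5 billion" wording. The class and constant names are made up for the example.

```java
// Arithmetic check of the quoted figures, assuming decimal units (1 TB = 1000 GB).
final class InternetSizeSketch {
    static final long TOTAL_TB = 5_000_000L; // "roughly 5 million terabytes"
    static final long INDEXED_TB = 200L;     // "roughly 200 terabytes"

    static long totalGigabytes() {
        return TOTAL_TB * 1_000L;
    }

    static long totalMegabytes() {
        return TOTAL_TB * 1_000_000L;
    }

    /** Indexed share of the total, in percent. */
    static double indexedPercent() {
        return 100.0 * INDEXED_TB / TOTAL_TB;
    }

    public static void main(String[] args) {
        System.out.println(totalGigabytes() + " GB");
        System.out.println(totalMegabytes() + " MB");
        System.out.println(indexedPercent() + " %");
    }
}
```

The ratio 200 / 5,000,000 is 0.00004, that is 0.004%, so the percentage in the quote is consistent with its own totals.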
56
Network Applications
Finance and commerce
  eCommerce, e.g. Amazon and eBay, PayPal, online banking and trading, Bitcoin
The information society
  Web information and search engines, ebooks, Wikipedia; social networking: Facebook, Google+
Creative industries and entertainment
  Online gaming, music and film at home, user-generated content, e.g. YouTube, Flickr
Healthcare
  Health informatics, online patient records, monitoring patients
Education
  E-learning, virtual learning environments; distance learning
Transport and logistics
  GPS in route finding systems, map services: Google Maps, Google Earth
Science
  Access to research papers and data sets, collaborative working (SVN, Google Docs, Overleaf), MapReduce
Environmental management
  Sensor technology to monitor earthquakes, floods or tsunamis
57
Learned today
What is a distributed system
  Service provision with distributed black-box execution
Requirements
  Accessibility
  Transparency
  Openness
  Scalability
Examples of DS and level of satisfaction of the requirements
Next: Protocol Layers