1
Lecture 11: Socket Programming / Networking
2
What is communication? Communication = exchanging information.
Network communication = exchanging bits between two processes, residing either on the same computer or on different computers connected by a network. This is the general case of enabling two processes to communicate.
3
Internet Protocol
What is a protocol? A procedure: a list of rules for carrying out a certain activity.
What is the Internet Protocol? A protocol that details how data is sent and received over the Internet.
What is the Internet? It is a global system of interconnected computer networks. The Internet is basically a network of networks.
How is a machine identified? Every machine that wishes to connect to the Internet receives an IP address. The IP address is a unique identifier for the machine.
5
What about Israel?
6
Bezeq International Line
7
Tamares Internet Line
8
IP Address
IPv4: 32 bits in size. Format: XXX.XXX.XXX.XXX, where each XXX is a number from 0 to 255 (each block is 8 bits). Allows 2^32 unique addresses (≈ 4.3 billion addresses). Examples: – IP address of ynet – IP address of google – IP address of
IPv6: 128 bits in size. Format: XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX, where each XXXX is a group of four hexadecimal digits (0…9, A…F). Examples: 2001:4860:0000:1001:0000:0000:0000:0068 – IPv6 of ipv6.google.com; 2620:0000:1CFE:FACE:B00C:0000:0000:0003 – IPv6 of
Allows 2^128 unique addresses.
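The two address sizes can be checked with a short Java sketch. It parses address literals with InetAddress and reports how many bytes each one occupies. The class name IpAddressSizes is made up for this illustration, and 192.0.2.1 is a documentation-only placeholder address, not one taken from the slides.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class IpAddressSizes {
    // Returns the number of bytes in the binary form of an IP literal.
    // Passing a literal (not a host name) means no DNS lookup is performed.
    static int byteLength(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral).getAddress().length;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        // IPv4: 4 bytes = 32 bits (placeholder address).
        System.out.println(byteLength("192.0.2.1"));
        // IPv6: 16 bytes = 128 bits (the ipv6.google.com example from the slide).
        System.out.println(byteLength("2001:4860:0000:1001:0000:0000:0000:0068"));
    }
}
```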
9
IP Addresses Utilization
10
What’s this?
11
Ports
What are ports? They are entry/exit points to/from a service residing on a machine.
Why ports? They allow more than one service to be accessible at the same time on one machine.
Each port has a unique number: HTTP uses port 80 (by default). FTP uses port 21 (by default). A port number is 16 bits wide, so there are 64K (65,536) port numbers, from 0 to 65535. Example: ftp:// :21/
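As a rough check of the valid range, the sketch below asks Java's InetSocketAddress whether it accepts a given port number. A port is a 16-bit value, so 0 to 65535 are accepted and anything else is rejected. The class name PortRange and the host example.com are placeholders chosen for this illustration.

```java
import java.net.InetSocketAddress;

class PortRange {
    // True if InetSocketAddress accepts the value as a port number.
    static boolean isValidPort(int port) {
        try {
            // createUnresolved skips DNS; "example.com" is only a placeholder host.
            InetSocketAddress.createUnresolved("example.com", port);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // outside 0..65535
        }
    }

    public static void main(String[] args) {
        for (int port : new int[] {21, 80, 65535, 65536, -1}) {
            System.out.println(port + " valid? " + isValidPort(port));
        }
    }
}
```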
12
Domain Name System
DNS: Domain Name System. DNS maps domain names to IP addresses. Example: Domain name: IP:
Why DNS? Machines work with numeric addresses only. Humans find it hard to remember numbers. Thus, DNS was invented.
DNS servers are found at: each ISP, which has its own DNS server, normally as two separate servers (primary DNS / secondary DNS). There are 13 root server addresses in the world (named A through M), each served by many physical machines.
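A name lookup can be sketched in Java with InetAddress.getAllByName, which asks the system resolver (and through it, normally, the configured DNS servers) for the addresses of a host name. The class name DnsLookup is an assumption of this example, and "localhost" is used as the default only because it resolves without network access.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class DnsLookup {
    // Returns the textual IP addresses that the resolver reports for a host name.
    static List<String> resolve(String hostName) throws UnknownHostException {
        List<String> result = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(hostName)) {
            result.add(address.getHostAddress());
        }
        return result;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass a domain name on the command line; "localhost" is the fallback.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
    }
}
```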
13
Root DNS Server Map
14
Communication: Sockets
Network communication: exchanging bits between two processes, residing either on the same computer or on different computers connected by a network.
Socket: an interface used by an application to connect to another socket. It allows exchanging information at the application level.
15
Encoding
send(&object, sizeof(object), destination): send sizeof(object) bytes, taken from the memory used to store the state of object, to destination.
Encoding is the process of converting data from one form to another. Examples: ASCII, Unicode.
Both the sending end and the receiving end must agree on an encoding format. Otherwise the data sent cannot be understood by the receiver! If the sender encodes strings as UTF-8 and the receiver decodes them as UTF-16, the result will be garbled!
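The effect of an encoding mismatch can be reproduced in a few lines of Java. In this sketch the "sender" turns a string into bytes with one charset and the "receiver" turns the bytes back into a string with another. The class and method names are invented for the illustration, and UTF_16BE is chosen (rather than plain UTF-16) so the result does not depend on byte-order-mark handling.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatch {
    // Encodes text with the sender's charset, then decodes with the receiver's.
    static String roundTrip(String text, Charset senderCharset, Charset receiverCharset) {
        byte[] wire = text.getBytes(senderCharset);   // what travels over the network
        return new String(wire, receiverCharset);     // what the receiver believes it got
    }

    public static void main(String[] args) {
        String message = "hello";
        // Both ends agree: the message survives.
        System.out.println(roundTrip(message, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // Ends disagree: pairs of UTF-8 bytes are read as single UTF-16 characters.
        System.out.println(roundTrip(message, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE));
    }
}
```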
16
Encoding issues
Byte order: communicating processes may reside on different computers with different architectures. Consider the representation of the unsigned int 1:
big endian: the most significant byte (MSB) is stored at the lowest address.
little endian: the MSB is stored at the highest address (the least significant byte comes first).
Example: the integer 1 on a little-endian machine, copied byte by byte to a big-endian machine, will be read as a different value, because the bytes are interpreted in reverse order.
Convention: all information sent through the network is in network byte order (big endian).
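The byte-order problem can be simulated on a single machine with java.nio.ByteBuffer, which lets the byte order be chosen explicitly. The sketch assumes a 4-byte int, writes the value 1 in little-endian order and reads the same bytes back as big endian. The class name ByteOrderDemo is made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderDemo {
    // Lays out a 4-byte int in the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    // Interprets 4 bytes as an int in the given byte order.
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] littleEndianOne = encode(1, ByteOrder.LITTLE_ENDIAN); // bytes: 01 00 00 00
        // Read on a "big-endian machine": 0x01000000 = 16777216.
        System.out.println(decode(littleEndianOne, ByteOrder.BIG_ENDIAN));
        // Agreeing on network byte order (big endian) on both ends preserves the value.
        System.out.println(decode(encode(1, ByteOrder.BIG_ENDIAN), ByteOrder.BIG_ENDIAN));
    }
}
```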
17
Encoding issues
RTE types: the processes might be running on different RTEs (the JVM for Java vs. the OS for C++), and the memory layout of objects in one RTE cannot be interpreted by the other.
Compilers: the same problem arises, as different compilers may represent the same values differently.
18
Encoding Data to Strings
Encoding data into strings is a good practice:
It is easy to debug: the developer can see the data transmitted and check that the data sent is the same as the data received.
JSON (JavaScript Object Notation) is used to represent complete objects as strings, and is widely used today in web communication.
Are strings represented uniformly? No!
Unicode Transformation Formats (UTF-8, UTF-16, UTF-32): Unicode encoding schemes for the correct exchange of strings across architectures. They are a portable way (independent of OS, RTEs, languages, compilers) of encoding strings, developed so that users have a standardized means of encoding characters using a minimal amount of space.
19
UTF-8, 16, 32: Unicode encoding standards.
The most used encoding schemes are UTF-8, UTF-16 and UTF-32. The number after UTF specifies the size in bits of the basic unit (code unit) of the encoding scheme, not the width of every character.
UTF-32: each character takes exactly 4 bytes (room for 2^32 different values, far more than Unicode needs).
UTF-8: each character takes 1 to 4 bytes. A single byte is enough only for the 2^7 ASCII characters.
20
Encoding Formats: UTF-8, UTF-16, UTF-32
UTF-8 advantages: most efficient size-wise for Western languages. Compatible with the ASCII table, while the rest are not. Best at recovering from errors in badly transmitted data.
UTF-8 disadvantages: UTF-8 is a variable-length encoding (1-4 bytes per character), so it is more complex to process than the fixed-length UTF-32 (4 bytes). UTF-16 is also variable-length (2 bytes for most characters, 4 bytes for the rest).
UTF-16 can be more efficient for representing characters in East Asian languages (Mandarin, Japanese, Korean), where most characters can be represented in one 16-bit word. For such text, UTF-16 can be treated as an almost fixed-length encoding of two bytes per character.
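The size trade-offs can be measured directly. The following sketch counts how many bytes a one-character string occupies in UTF-8 and in UTF-16, for four sample characters picked for this illustration (written as Unicode escapes so the source file encoding does not matter). The class name UtfLengths is hypothetical.

```java
import java.nio.charset.StandardCharsets;

class UtfLengths {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // UTF_16BE is used so that no byte-order mark is added to the count.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // Latin letter, part of ASCII
            "\u05D0",       // Hebrew letter alef
            "\u4E2D",       // a CJK ideograph
            "\uD83D\uDE00"  // U+1F600, outside the 16-bit range (surrogate pair in Java)
        };
        for (String s : samples) {
            System.out.println("UTF-8: " + utf8Length(s) + " bytes, UTF-16: " + utf16Length(s) + " bytes");
        }
    }
}
```

The ASCII letter needs 1 byte in UTF-8 but 2 in UTF-16, the CJK ideograph needs 3 in UTF-8 but 2 in UTF-16, and the last sample needs 4 bytes in both.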
21
Encoding Strings: Examples
22
Client-Server Architecture
Server: a process that is accessible over the network. A computer that runs one or more servers is also called a server. It does not initiate: it stays passive, listening until a request is received. Always “on”. The server prepares a response and returns it to the request sender.
Client: a process that initiates a connection to a server. This process sends requests with certain commands to servers. It waits for a response from the server and, once received, uses its contents.
Host: any computer with a network presence, i.e. connected to some network.
23
Client Server Architecture
IP address: a host is uniquely identified by its IP address, a 32-bit number (in IPv4) written in dot notation. Example:
Port: a host may contain several servers running side by side, so we need some way to distinguish between these servers: a number.
Socket: the interface a process uses with the RTE for communication. It is a connection's endpoint, which a process can use for sending or receiving information.
24
Network Communication Models
Bi-directional? (phone call vs. a T.V. broadcast) Point-to-point? (two parties talking or a radio broadcast) Reliable? (everything sent reaches its destination or not?) Session-oriented? (phone call vs. a snail mail)
25
Transmission Protocols
Transmitting data over networks requires protocols. Protocols handle these communication tasks:
Connecting and disconnecting.
Sending and receiving packets, the data segments sent over a network.
Packet structure and format: their header and data structure.
Reliability: error checking of received packets, and keeping packet order.
Data flow control: handling congestion issues (slow down, speed up).
Protocol API: defining the methods and objects used for transmission.
Two widely used protocols over IP: UDP (User Datagram Protocol) and TCP (Transmission Control Protocol).
26
UDP: User Datagram Protocol
Connectionless: there is no connection between the client and server sockets.
Disadvantages (unreliable!): no guarantee on packet order. No guarantee that a packet is received. If a received packet is corrupted, it is discarded.
Advantages: faster transmission due to smaller overhead. Great for streaming live media (video, music, VoIP): not YouTube, but Skype and TwitchTV. Suits cases where lost data is not relevant.
Overhead: all the extra data sent by the communication protocol, excluding the real data we wish to send.
28
UDP Client-Server: Communication Flow
29
UDP Line Printer Server
31
Some facts
buf is just a byte buffer of size 1<<16 (2^16 bytes, which equals 64KB).
A packet is a container for a collection of bytes, their size, and an address.
InetAddress is a container for an IP address.
UDP sockets are named DatagramSocket. UDP packets are named DatagramPacket.
Binding: the server creates a DatagramSocket and binds this socket to the requested port. A process is a server if it binds a socket.
32
Observations
Do-forever semantics: a server follows this loop:
Receive a message: sock.receive(packet) saves the message into a packet. The call to sock.receive(packet) blocks until an entire packet is received!
Decode the message: convert it to a string (using packet.getData() and packet.getLength()), assuming the bytes received in the message were a UTF-8 encoded string.
Service: here, printing the string.
Send reply: build a new packet holding the encoding of the string "done", and send it to the client using sock. sock.send(packet) blocks until the packet is sent.
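The loop described above can be sketched as follows. This is not the code from the lecture slide (which is not part of this transcript) but a minimal reconstruction from the description: receive, decode as UTF-8, print, reply "done". The class name, the handleOne helper and the command-line handling are assumptions of this sketch.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.charset.StandardCharsets;

class UdpLinePrinterServer {
    // Handles one request and returns the decoded line.
    static String handleOne(DatagramSocket sock) throws IOException {
        byte[] buf = new byte[1 << 16];                       // 64KB receive buffer
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        sock.receive(packet);                                 // blocks until a datagram arrives

        // Decode: only the first getLength() bytes of the buffer belong to the message.
        String line = new String(packet.getData(), packet.getOffset(),
                                 packet.getLength(), StandardCharsets.UTF_8);
        System.out.println(line);                             // the "service"

        // Reply to whoever sent the packet.
        byte[] reply = "done".getBytes(StandardCharsets.UTF_8);
        sock.send(new DatagramPacket(reply, reply.length,
                                     packet.getAddress(), packet.getPort()));
        return line;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: UdpLinePrinterServer <port>");
            return;
        }
        // Creating the socket with a port number binds it to that port.
        try (DatagramSocket sock = new DatagramSocket(Integer.parseInt(args[0]))) {
            while (true) {                                    // do forever
                handleOne(sock);
            }
        }
    }
}
```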
33
A UDP Line Printer Client - in Java
We use the same classes, DatagramSocket and DatagramPacket. The order of operations is reversed in the client: first call send (the client takes the initiative), then call receive (to wait for an answer).
35
Observations
A client must know the address and port of the server (supplied here via the command-line arguments).
The client creates a DatagramSocket but does not bind it to a chosen port! The operating system assigns an arbitrary port number to this socket.
The client builds a new packet and fills it with the UTF-8 encoding of the line to send.
After sending the datagram, the client waits for an answer (socket.receive()). receive will return any incoming packet. If you wish to listen for packets from a specific host only, call connect() beforehand.
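A client matching these observations might look like the sketch below. It is a reconstruction, not the slide's code: the class name, the sendLine helper and the 5-second receive timeout are assumptions added here (the timeout exists because UDP gives no guarantee that a reply ever arrives).

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class UdpLinePrinterClient {
    // Sends one line to the server and returns the server's reply.
    static String sendLine(String host, int port, String line) throws IOException {
        // No port is requested: the operating system picks one for the client.
        try (DatagramSocket sock = new DatagramSocket()) {
            sock.setSoTimeout(5000);                          // assumption: give up after 5 s

            byte[] data = line.getBytes(StandardCharsets.UTF_8);
            InetAddress address = InetAddress.getByName(host);
            sock.send(new DatagramPacket(data, data.length, address, port)); // send first

            byte[] buf = new byte[1 << 16];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            sock.receive(reply);                              // then wait for the answer
            return new String(reply.getData(), reply.getOffset(),
                              reply.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: UdpLinePrinterClient <host> <port> <line>");
            return;
        }
        System.out.println(sendLine(args[0], Integer.parseInt(args[1]), args[2]));
    }
}
```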
36
Line Printer Client in C++
Using communication means asking for services from the RTE. The OS API is not object oriented but function-based, and low level: it requires a lot of technical code. Use Poco and Boost.
38
TCP: Transmission Control Protocol
Characteristics: a connection-oriented protocol. A connection must be established between two sockets before transferring data. Bidirectional: data can be sent in either direction. The connection must be closed once done.
Advantages: data sent is guaranteed to be correct. Data sent is guaranteed to be sent in full. Data sent is guaranteed to be received in the same order it was sent.
Disadvantages: latency introduced by waiting for acknowledgements. Large overhead makes the transmission slower.
Overhead: all the extra data sent by the communication protocol, excluding the real data we wish to send.
39
Connecting to Server: TCP Handshake
Called the TCP 3-way handshake. Requires three operations: SYN (synchronize), SYN-ACK, ACK (acknowledgement).
Flow: A sends a SYN packet to B. B receives A's SYN. B sends a SYN-ACK. A receives B's SYN-ACK. A sends an ACK. B receives the ACK. Only then is the TCP socket connection established!
40
TCP - Transmission Control Protocol
Initiating a TCP connection requires the following:
The server opens a new server socket, binds the server socket to a port, and waits for incoming connections.
The client opens a new socket and connects this new socket to the server (as with UDP).
The client's OS sends a request to the server's operating system to initiate a new TCP connection.
The new TCP connection is uniquely identified by the 4-tuple <server address, server port, client address, client port>.
The server gets a new, regular socket (the client has one too). Each socket contains an input stream and an output stream.
41
TCP - Transmission Control Protocol
When a TCP server calls accept(), a new, regular TCP socket is created. A server socket is basically a socket factory, producing regular TCP sockets. (The new socket generated by the server socket does not use a new port number: the OS can distinguish between different TCP streams by checking the 4-tuple discussed above.)
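The claim that the accepted socket reuses the server's port can be observed on the loopback interface. The sketch below opens a server socket on an OS-chosen port, connects a client to it, accepts the connection, and collects the port numbers on both ends. The class name AcceptDemo and the backlog value 50 are choices made for this illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class AcceptDemo {
    // Returns { server socket port, accepted socket local port,
    //           accepted socket remote port, client socket local port }.
    static int[] ports() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket serverSocket = new ServerSocket(0, 50, loopback); // port 0: OS picks
             Socket client = new Socket(loopback, serverSocket.getLocalPort());
             Socket accepted = serverSocket.accept()) {       // the "factory" produces a socket
            return new int[] {
                serverSocket.getLocalPort(),
                accepted.getLocalPort(),
                accepted.getPort(),
                client.getLocalPort()
            };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] p = ports();
        System.out.println("server socket port:     " + p[0]);
        System.out.println("accepted socket port:   " + p[1] + " (same as the server socket)");
        System.out.println("accepted socket's peer: " + p[2]);
        System.out.println("client socket port:     " + p[3] + " (same as the peer above)");
    }
}
```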
43
TCP Client-Server: Communication Flow
44
A TCP Line Printer client in Java
A client repeatedly sends lines to the server, which prints these lines. The client does not try to listen for replies from the server.
46
Observations
In the UDP client we sent a single message at a time (datagrams, packets); TCP uses streams. Each Socket has an input stream and an output stream. A call that sends to or receives from a stream blocks.
We wrap the socket's output stream in an OutputStreamWriter set to use UTF-8 encoding. It encodes every string written to it using UTF-8, and sends the resulting byte array to the OutputStream.
As TCP is connection oriented, everything sent through the stream will arrive in the correct order.
We use out.flush() to force sending the string to the server. Otherwise, the OutputStreamWriter may buffer data and send it in bigger chunks.
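A client along these lines could be sketched as follows. It is a reconstruction from the observations, not the slide's code: the class name and the sendLines helper (which takes the line source as a parameter so it can be fed from the keyboard or from a string) are assumptions of this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class TcpLinePrinterClient {
    // Connects to the server and sends every line read from input, one at a time.
    static void sendLines(String host, int port, BufferedReader input) throws IOException {
        try (Socket socket = new Socket(host, port);          // connects (TCP handshake)
             BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = input.readLine()) != null) {
                out.write(line);
                out.write('\n');   // line terminator, so the server can split lines
                out.flush();       // push the buffered characters to the socket now
            }
        }                          // closing the writer also closes the connection
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TcpLinePrinterClient <host> <port>");
            return;
        }
        BufferedReader keyboard = new BufferedReader(new InputStreamReader(System.in));
        sendLines(args[0], Integer.parseInt(args[1]), keyboard);
    }
}
```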
47
Try with resources
Every AutoCloseable object can be allocated in the try-with-resources section:
try (Socket socket = new Socket(serverName, port);
     BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
     BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
The meaning:
The objects will be closed automatically at the end of the try section. No need to close them explicitly.
If there is an exception during close(), the exception is handled by the catch clause of the try statement.
If there is an exception in the try code, the objects are still closed, before any catch clause runs.
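The closing behaviour can be made visible with two toy resources that record when they are closed. In this sketch (class and variable names invented for the illustration) the try body throws, and the log shows that both resources are closed, in reverse order of declaration, before the catch clause runs.

```java
import java.util.ArrayList;
import java.util.List;

class TryWithResourcesDemo {
    // Returns the order in which the body, the close() calls and the catch clause ran.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        // AutoCloseable has a single method, close(), so a lambda can stand in for a resource.
        try (AutoCloseable first = () -> log.add("close first");
             AutoCloseable second = () -> log.add("close second")) {
            log.add("body");
            throw new IllegalStateException("failure inside try");
        } catch (Exception e) {
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [body, close second, close first, caught]
    }
}
```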
48
A TCP Line Printer Server in Java
50
Observations
There is a new type of socket called ServerSocket. It is used only for TCP servers.
The server binds the socket to a port, which makes the server process visible to the outside world.
The accept() method of the ServerSocket is a blocking call. It returns each time a new connection is established.
51
Observations
The accept method returns a new, regular TCP socket.
The server then starts a new Thread for each new connection, which takes care of that connection. The server then continues to wait for new connections.
Each thread handling a connection simply reads from the input stream until the connection is closed.
The thread first decodes the arriving bytes into Java characters, assuming UTF-8. It then wraps the character stream in a BufferedReader, which yields a string every time a line ends.
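The thread-per-connection server described above might be sketched like this. It is a reconstruction, not the slide's code: the class name, the handle and serve helpers, and the Consumer parameter (used so the handler can print lines or hand them to something else) are assumptions of this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class TcpLinePrinterServer {
    // Reads UTF-8 lines from one connection until the peer closes it.
    static void handle(Socket socket, Consumer<String> sink) throws IOException {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {   // null means the connection was closed
                sink.accept(line);
            }
        }
    }

    // Accept loop: one new thread per connection, then back to waiting.
    static void serve(int port) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(port)) {  // binds to the port
            while (true) {
                Socket socket = serverSocket.accept();              // blocks until a client connects
                new Thread(() -> {
                    try {
                        handle(socket, System.out::println);
                    } catch (IOException e) {
                        System.err.println("connection failed: " + e.getMessage());
                    }
                }).start();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TcpLinePrinterServer <port>");
            return;
        }
        serve(Integer.parseInt(args[0]));
    }
}
```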
52
Blocking Sockets
Socket operations such as read and write are blocking by default:
read on the input stream: the client/server halts until the data is read.
write on the output stream: the client/server halts until the data is sent.
Drawback: each client requires its own dedicated thread to handle communication with it, so that multiple connections can be handled concurrently.
Threads reduce the scalability of the server: threads are expensive to create and maintain, and increasing the number of threads increases the amount of context switching, so performance deteriorates the more threads are executing.
Solution: non-blocking sockets (called Channels) and the Reactor pattern.
53
A TCP Line Printer client in C++: POCO library abstraction over the OS socket