The Internet: A Distributed System

Copyright © 2002 Nik Clayton
All rights reserved. Redistribution and use, with or without modification, are permitted provided that the following condition is met: • Redistributions of this presentation must retain the above copyright notice, this list of conditions and the following disclaimer. THIS PRESENTATION IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Obligatory biographical bit
Used to be Now One of five running mail for Citigroup 11m msgs/week, 850MB/day Editor, “FreeBSD Handbook”

Looking at… How the Internet works
How the Domain Name System (DNS) works on top of this
How the Simple Mail Transfer Protocol (SMTP) uses both of these to shuffle e-mail around the place
I expect all of you to already know some of this -- probably some different parts though, and at varying levels of detail. Hopefully some of the examples and analogies that are used will make the bits you’re not too sure about clearer. My apologies if some of you know all of this. But if it’s any consolation, it makes you much more employable when you graduate. Try not to snore too loudly. This is, by necessity, condensed.

So, how does the Internet work?
Three key protocols involved: IP: Internet Protocol UDP: User Datagram Protocol TCP: Transmission Control Protocol, often written TCP/IP IP is lowest layer, UDP and TCP sit on top of it. Not going to look at the physical layer (ethernet, etc) Not going to look at IPv6

Internet and the OSI 7 layer model
7 Application: TELNET (RFC 854), FTP (RFC 959), SMTP (RFC 821), SNMP (RFC 1098), DNS (RFC 1034)
6 Presentation
5 Session
4 Transport: TCP (RFC 793), UDP (RFC 768)
3 Network: ARP (RFC 826), RARP (RFC 903), ICMP (RFC 792), BOOTP (RFC 951), IP (RFC 791)
2 Link: 802.2, 802.3, 802.5, other medium access protocols
1 Physical
I think it’s required that every presentation about networking includes a copy of the 7 layer model somewhere. What are the RFC numbers? The Internet’s standards are described in RFC documents. Each document describes a protocol or some other piece of technology that’s required for interoperation between Internet connected hosts, and each document is assigned a number. RFC means “Request for Comments”, and harkens back to the days when all this stuff was still being thrashed out, and instead of creating ‘standards’ people proposed new technologies in these documents and waited for suggestions and improvements to come back.

The 7 Layer Burrito 7. Sour cream 6. Cheese 5. Guacamole 4. Tomato
3. Lettuce 2. Seasoned rice 1. Refried beans Likely to be much more useful in the long run.

A Networking Analogy Two office blocks, each contains a number of different companies Each company has one or more phone numbers (so there are several phone numbers for the office block) Each phone number has a few hundred extensions To call anyone, you need their company phone number, and their extension 4 numbers identify any call -- source phone number, source extension, destination phone number, destination extension

A Networking Analogy (cont.)
Imagine if everybody agreed on certain standard phone extensions. #25 gets you to the mail room #80 is the marketing department #123 calls the speaking clock That’s almost how the Internet works Not everyone will get the ‘marketing department’ joke

In an IP network… You have a host (an office building)
Each host has one or more network interfaces (companies within the building) Each interface has one or more IP addresses attached to it (phone numbers) Each interface has ports (extensions) Connections are made from a port on an IP address to another port on an IP address 4 numbers identify a connection on the Internet -- source IP address, source port, destination IP address, destination port

Packet switching Internet is a packet switched network
Data is split into packets Each packet has a source IP/port, and a destination IP/port, as well as other meta-information Packets may not arrive in the same order as sent Packets may not even arrive at all Suppose you need to send a five page letter home, but all you’ve got is a stack of postcards. So you put a bit of the letter on each postcard, number them (so they can be put back in the right order when they’re received), address them, and put your name on them. Then you put them in the post, and hope that they all arrive. That’s, broadly, how a packet switched network works. Without all the tedious business of having to think of something interesting to say about your holiday.
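The postcard analogy can be sketched in a few lines of Python. This is only an illustration: the function names and the message are made up, and real IP packets carry far more meta-information than a sequence number.

```python
import random

def to_packets(message: str, size: int) -> list[tuple[int, str]]:
    """Split a message into numbered (sequence, chunk) postcards."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets: list[tuple[int, str]]) -> str:
    """Sort the postcards by sequence number and join the chunks."""
    return "".join(chunk for _, chunk in sorted(packets))

packets = to_packets("Wish you were here, the weather is lovely.", 8)
random.shuffle(packets)        # the network may deliver packets out of order
print(reassemble(packets))
```

The receiver can only rebuild the letter because every postcard is numbered. A lost postcard would leave a hole, which is the problem TCP deals with later on.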

IP Address: A definition
32 bit number So there are 2^32 = 4,294,967,296 of them Normally written as four one-octet values separated by dots (dotted quad notation) Are assigned by the network people, who arranged a block of addresses for the company, who were given them by your ISP, who was allocated them by their regional IP authority, who were assigned a regional block by the Internic. Why octets, and not bytes? A byte’s 8 bits, right? Well, not always. At least, not when the documents that specify how all this stuff works were written. At the time there were machine architectures around where a byte was more or less than 8 bits. So to be absolutely sure there was no potential for confusion, the documents use the word “octet” to mean 8 bits, rather than “byte”.

So, tell me what ports are
Like a telephone extension Each IP address has 2^16 = 65,536 ports A server listens on an IP address:port pair for incoming connections A client is typically allocated a port at random for outgoing connections, and specifies the destination port it wants to connect to Some services (mail, web, etc) have “well known ports” assigned that servers are expected to listen on (25, 80, etc)
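Here is a small sketch of a server and a client on the same machine, using Python’s standard socket module. It assumes the loopback address 127.0.0.1 is usable, and it lets the operating system choose both ports, so the numbers printed will differ from run to run.

```python
import socket

# Server: listen on an IP address:port pair. Port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

# Client: we only name the destination; the OS allocates the source port.
client = socket.create_connection((server_ip, server_port))
conn, (peer_ip, peer_port) = server.accept()

# The four numbers that identify this connection:
# source IP, source port, destination IP, destination port.
four_tuple = (*client.getsockname(), *client.getpeername())
print(four_tuple)

conn.close()
client.close()
server.close()
```

The source port the client reports is the same one the server sees as the peer port, which is how replies find their way back to the right “extension”.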

Networks are groups of IP addresses
IP addresses are grouped into collections, called networks Network membership is determined by the netmask Netmask splits the IP address in to two portions; the host portion, and the network portion Two hosts are in the same network if the network portions of their IP addresses are identical

How netmasks work is really and is really

How netmasks work (cont.)
Netmask is another 32 bit binary number It is binary-ANDed with the IP address All bits still on after this form the network portion of the IP address All bits left off are the host portion

IP: Netmask: AND = So this is the .1 host in the network
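The binary AND is easy to try out. The sketch below uses a made-up address (192.168.1.1) and netmask (255.255.255.0), not the ones from the slide, and the helper names are my own.

```python
def to_int(dotted: str) -> int:
    """Convert dotted quad notation to a 32 bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def to_dotted(value: int) -> str:
    """Convert a 32 bit integer back to dotted quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def split_address(ip: str, netmask: str) -> tuple[str, str]:
    """Return (network portion, host portion) of an address."""
    ip_n, mask_n = to_int(ip), to_int(netmask)
    network = ip_n & mask_n                 # bits still on after the AND
    host = ip_n & ~mask_n & 0xFFFFFFFF      # bits the netmask switched off
    return to_dotted(network), to_dotted(host)

print(split_address("192.168.1.1", "255.255.255.0"))
```

With these values the network portion comes out as 192.168.1.0 and the host portion as 0.0.0.1, i.e. the .1 host in that network.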

Netmask doesn’t have to be a continuous string of 1s, then continuous 0s That would be bloody stupid though In practice, netmasks are all 1s, then all 0s

Leads to another common notation for netmasks, /n /24 means 24 ones, then all zeros Same as 255.255.255.0 /16 would be 16 ones, then all zeros Same as 255.255.0.0

Are these two hosts on the same network? /24 /24 No. The first is on the net, the second is on the net What about these? /16 /16 Yes, they’re both on the net
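The same question can be put to Python’s standard ipaddress module. The addresses here are made up for the illustration; only the /24 and /16 lengths come from the slide.

```python
import ipaddress

def same_network(a: str, b: str) -> bool:
    """True if two addresses written as 'ip/n' share a network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network

print(same_network("192.168.1.10/24", "192.168.2.10/24"))   # different /24 networks
print(same_network("192.168.1.10/16", "192.168.2.10/16"))   # same /16 network
```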

Netmasks do not need to be on an octet boundary /26 = 255.255.255.192 = 11111111.11111111.11111111.11000000

The Network Addresses Network address is used to indicate the whole network No host can be given the network address Consists of the network portion as normal, with the host portion set to all zero /24, the network address is /26 defines four networks = = = =

The Broadcast Addresses
Broadcast address is used to send to all hosts on the network No host can be given the broadcast address Consists of the network portion as normal, with the host portion set to all ones /24, the broadcast address is /26 defines four networks and broadcast addresses = = = =
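To see the network and broadcast addresses fall out of the rules, here is a sketch using the standard ipaddress module. The 192.168.1.0/24 block is an assumption picked for the example.

```python
import ipaddress

def network_and_broadcast(block: str, prefix: int) -> list[tuple[str, str]]:
    """Split a block into /prefix networks; return (network, broadcast) pairs."""
    subnets = ipaddress.ip_network(block).subnets(new_prefix=prefix)
    return [(str(s.network_address), str(s.broadcast_address)) for s in subnets]

for network, broadcast in network_and_broadcast("192.168.1.0/24", 26):
    print(network, broadcast)
```

Each pair is the host portion set to all zeros and to all ones respectively, for each of the four /26 networks.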

Shrinking address space
/24 has 256 host addresses available .0 through to .255 Lose .0, reserved for network Lose .255, reserved for broadcast Leaves you with (256 - 2) = 254 available addresses for hosts

Shrinking address space (cont.)
/25 creates two networks .0 network Network address is .0 Broadcast address is .127 Host addresses are .1 through to .126 (126 addresses) .128 network Network address is .128 Broadcast address is .255 Host addresses are .129 through to .254 (126 addresses) Only (126 * 2) = 252 available host addresses now

Smaller subnets, fewer hosts
/26 creates four networks Each network reserves 2 addresses So there are 4 * 2 = 8 addresses reserved 256 - 8 = 248 host addresses available And so on
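The “and so on” is just arithmetic, sketched below. The function name is made up, and it assumes you start from a single /24 block as the slides do.

```python
def usable_hosts(prefix: int, block_prefix: int = 24) -> int:
    """Host addresses left when a /block_prefix block is split into /prefix networks."""
    if not block_prefix <= prefix <= 30:
        raise ValueError("prefix must be between the block prefix and /30")
    subnets = 2 ** (prefix - block_prefix)
    addresses_per_subnet = 2 ** (32 - prefix)
    # Every subnet loses one address for the network and one for broadcast.
    return subnets * (addresses_per_subnet - 2)

for prefix in range(24, 31):
    print(f"/{prefix}: {usable_hosts(prefix)} host addresses")
```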

Routing Hosts on the same network can contact each other directly
E.g., /24 wants to talk to /24. It puts a packet on the wire with a destination address of , and receives it It’s like magic, you don’t need to know how this bit works, it just does If you become a network administrator, you will learn, in long, tedious detail, how this magic works

Routing (cont.) Hosts on two different networks can’t talk directly, they need a router to route the packets between them A router is a device with at least 2 network interfaces present on 2 or more different networks Hosts send packets for other networks to the router Router looks at the destination address information in the packet, and works out where to send it

Routing (cont.) Each Internet host has to maintain a routing table
The routing table details how packets get from a to b The routing table only contains information about the networks the host is directly connected to

Routing (cont.) /24 /24 /24 Here’s a router connected to three networks. Network 1 is /24. The router has the .1 address on this network Network 2 is /24. The router has the .1 address on this network too Network 3 is the Internet, and the router’s got the address A.B.C.D on that network The router is configured to route packets between all three networks /24 Internet /24

Routing (cont.) Here’s the routing table for the workstations on the /24 network If it’s on the local network then we know we can reach it directly Otherwise send it on to the router, and hope that it knows how to deal with it Destination Gateway /24 Local interface Default

Routing (cont.) Here’s the routing table for the router Destination
Gateway /24 Interface 1 /24 Interface 2 Default Interface 3
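A routing table lookup can be sketched like this. The two /24 networks are made-up stand-ins for the ones on the slide, and picking the most specific matching network is the usual rule rather than anything the slides spell out.

```python
import ipaddress

# Hypothetical router table: (destination network, where to send it).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "Interface 1"),
    (ipaddress.ip_network("192.168.2.0/24"), "Interface 2"),
    (ipaddress.ip_network("0.0.0.0/0"), "Interface 3"),    # default route
]

def next_hop(destination: str) -> str:
    """Pick the most specific route whose network contains the destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, via) for net, via in ROUTES if address in net]
    return max(matches, key=lambda route: route[0].prefixlen)[1]

print(next_hop("192.168.2.7"))
print(next_hop("8.8.8.8"))
```

The default route matches everything, so it only wins when nothing more specific does.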

Routing (cont.) This is very scalable
No host needs to know the complete route to the destination, or the Internet’s topology They just need to know the IP address of the nearest router The nearest router hands it off to the next nearest router, and so on

User Datagram Protocol (UDP)
Runs on top of IP Connectionless, just send data No guarantee packets will be delivered in order, the applications must deal with this No guarantee packets will even arrive, applications must resend data as necessary A bit like the Post Office But very low overhead

Transmission Control Protocol (TCP)
Runs on top of IP Connection oriented (open/send/close) Network stack ensures Packets are delivered to the application in the correct order Missing packets are automatically resent Has more overhead than UDP, particularly on the initial connection (three way handshake) Handles network congestion well Need to demo the three way handshake with Simon How does it handle network congestion? Starts by requiring that each packet must be acknowledged by the remote end. As more packets get through successfully on the first attempt it stops requiring that each packet be acknowledged -- instead, a single acknowledgment can be sent for multiple packets sent. If lots of packets need to be retransmitted TCP assumes the network is congested, and starts leaving more of a time gap between packets. If everyone else on the network is also using TCP then the effect is that everyone backs down a bit, allowing more data to get through. Roughly. If you’re interested in this, check the literature for references to things like TCP slow start, TCP exponential backoff, and the Nagle algorithm.

Internet summary Hosts have interfaces Interfaces have IP addresses
IP addresses subdivided in to the network portion and the host portion by the netmask Subdividing networks consumes available IP addresses (for network and broadcast address) Hosts on the same network can talk to one another directly Hosts on different networks need to know the address of the correct router to use

Internet Summary (cont.)
Data sent using either UDP or TCP UDP is faster, but the application has to do more book keeping TCP starts slower, but the application has to do less work

IP Design Good Points Very scalable Easy to understand, simple rules
Does not enforce specific policy Networks can be any size Does not require particular cabling standard Hardware and OS agnostic Open

IP Design Bad Points Large networks send a lot of meta data around
Hosts announcing themselves Basic IP design is not secure Easy to spoof the source address on a packet Leads to denial of service attacks Malicious router can sniff traffic, or replace data Security in layers 5, 6, and 7 (SSL, SSH, etc)

Domain Name System (DNS)

The Definitive Reference
DNS and BIND, Paul Albitz & Cricket Liu Everything you ever wanted to know about the DNS Can’t recommend this book highly enough

IP Addresses are a pain Working with IP addresses is
Cumbersome Error prone Hard to remember We prefer to name things where possible Which is why we have domain names

Fully Qualified Domain Names
FQDN is two or more names, separated by dots Reading left to right, the first part is the host name The rest is the domain name IP addresses are mapped to FQDNs FQDNs are mapped back to IP addresses How?

One way: The hosts file 10.10.1.1 gateway.example.com
me.example.com another.example.com . . .

This does not scale (!)

So the DNS was invented A hierarchical name space, read from right to left me.example.com (FQDN) is
. <- The root
.com <- Top level domain
.com.example <- Sub-domain
.com.example.me <- FQDN
Converting a hostname to an IP address is called “resolving” the address “zone” and “domain” are almost interchangeable terms
A zone contains all the information for a domain that hasn’t been delegated to another zone. For example, the .uk zone doesn’t contain all the information in the .uk domain (which would include all the .co.uk, .ac.uk domains, and so forth). Instead, the zone contains pointers to the nameservers that provide information for the .co.uk and .ac.uk zones. In this way, the information about those zones is delegated from the .uk zone to the .co.uk and .ac.uk zones. However, if the subdomains are not delegated to other nameservers, then a nameserver for the .uk zone would contain all the information for the subdomains as well. In practice, the distinction is so fine as to be immaterial.
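Reading a name from right to left is simple to sketch. The function name is made up; the output lists each level from the root down to the full name.

```python
def hierarchy(fqdn: str) -> list[str]:
    """List the name space levels from the root down to the FQDN."""
    labels = fqdn.rstrip(".").split(".")
    levels = ["."]                                  # the root
    for i in range(len(labels) - 1, -1, -1):        # walk the labels right to left
        levels.append(".".join(labels[i:]))
    return levels

print(hierarchy("me.example.com"))
```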

How the DNS is used 3 types of host
DNS servers know how addresses and names map to one another for one or more domains DNS caches, given a domain, know how to find out which DNS server knows about that domain, and query it for info DNS clients (resolvers) know how to talk to caches DNS clients contact their nearest cache when they need to resolve an address. The cache works out which DNS server will have this information, and makes the queries Confusingly, a machine can be a DNS server, a cache, and a client all at the same time. Also, there are two types of DNS server for each zone, 1 primary, and 0-n secondaries. But more on them later.

The root nameservers 12 (or so) machines, scattered around the world, that know the nameservers immediately below them Every DNS server in the world needs to know the IP addresses of the root nameservers That’s the only bit of static configuration required Everything else is looked up as necessary Which is pretty cool

DNS Hierarchy

Primary and Secondary DNS
Each domain has exactly one primary (master) DNS server, and 0 to ‘n’ secondary (slave) servers To a client, there is no distinction between the two DNS information is updated on the primary DNS server Secondary servers periodically check for updates, and copy changes over as necessary When information is updated on the primary server the zone’s “serial number” is also updated. This is just a number that must increase every time a change is made. A common tactic is to generate this number by taking the year, month, and day, and then appending the generation number. So is the serial number for a zone that has been edited exactly once on 11/10/ is the serial number for a zone that has been edited 14 times on 2/11/2002. Secondary servers know if the data has been updated because the serial number for their copy of the data will not match the serial number on the primary server’s copy of the data. They only do a zone transfer if the serial numbers do not match.
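The year, month, day, generation tactic is easy to sketch. The date below is made up for the example, and the comparison follows the simple “serials do not match” rule described in the notes above.

```python
from datetime import date

def make_serial(day: date, generation: int) -> int:
    """Build a YYYYMMDDnn serial; generation is the nth edit that day (1-99)."""
    if not 1 <= generation <= 99:
        raise ValueError("generation must fit in two digits")
    return int(f"{day:%Y%m%d}{generation:02d}")

def needs_transfer(primary_serial: int, secondary_serial: int) -> bool:
    """A secondary only does a zone transfer when the serials differ."""
    return primary_serial != secondary_serial

old = make_serial(date(2024, 3, 5), 1)
new = make_serial(date(2024, 3, 5), 2)
print(old, new, needs_transfer(new, old))
```

Because the date comes first, a later edit always produces a larger number, which is what the “must increase” rule needs.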

DNS in action dns.example.com is the local DNS cache
me.example.com is a host that uses the DNS server You are a user running applications on me.example.com You type ‘ in your web browser What happens? Need 6 volunteers 1 x user 1 x me.example.com 1 x dns.example.com 1 x root nameserver 1 x .org servers 1 x freebsd.org server

DNS in action (cont.) First, me.example.com checks to see if it knows the IP address of It doesn’t So it sends a DNS query to dns.example.com This query says “Please give me the A record for the FQDN

DNS in action (cont.) dns.example.com knows nothing about So it asks one of the root name servers They don’t know either, but they say “Go talk to the .org name servers, here’s their IP addresses” So dns.example.com goes and asks the .org name servers

DNS in action (cont.) They say “We don’t know, but we do know that ns.freebsd.org is the nameserver that’s authoritative for *.freebsd.org, here’s its address, go ask it” So dns.example.com says to ns.freebsd.org “Please give me the A record for ns.freebsd.org says “Sure, it’s ”

DNS in action (cont.) dns.example.com caches this information (so if it’s asked again it doesn’t need to redo all the above), and sends the info back to me.example.com All this happens in a few seconds This is what your browser is doing when it says something like “Resolving hostname”
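The walk from the root down can be simulated without touching the network. Everything in this sketch is made up for illustration: the server names, the www.freebsd.org query, and the 203.0.113.10 answer (taken from a range reserved for documentation).

```python
# Hypothetical data: each server maps a name either to an answer
# or to a referral naming the next server to ask.
SERVERS = {
    "root": {"org": ("referral", "org-servers")},
    "org-servers": {"freebsd.org": ("referral", "ns.freebsd.org")},
    "ns.freebsd.org": {"www.freebsd.org": ("answer", "203.0.113.10")},
}

cache: dict[str, str] = {}

def lookup(server: str, name: str) -> tuple[str, str]:
    """Return the record for the longest suffix of name the server knows."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    raise LookupError(f"{server} knows nothing about {name}")

def resolve(name: str) -> tuple[str, list[str]]:
    """Resolve name as the cache would; return (address, servers asked)."""
    if name in cache:
        return cache[name], []
    server, asked = "root", []
    while True:
        asked.append(server)
        kind, value = lookup(server, name)
        if kind == "answer":
            cache[name] = value
            return value, asked
        server = value              # follow the referral

first = resolve("www.freebsd.org")
second = resolve("www.freebsd.org")
print(first)
print(second)
```

The second call asks nobody, because the answer is already cached.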

Other types of DNS record
That example used “A” records They map FQDNs back to IP addresses Called a “Forward” lookup Not the only type of records in the DNS PTR records map IP addresses to FQDNs Called a “Reverse” lookup NS records list the domain’s name servers MX records are used for mail routing SOA record is the ‘Start of Authority’

SOA Record Every zone has one SOA record
Describes characteristics for the zone Serial number, which is incremented every time the data changes Time-to-live, which says how long data should be cached for E-mail address of DNS info maintainer Note that we use ‘zone’ here, not domain. This is because a zone may contain more than one domain (e.g., the brunel.ac.uk zone may include all the possible sub-domains of brunel.ac.uk, or there may be additional zones for the subdomains -- each zone will have its own SOA record)

Example of a DNS Zone File
$ORIGIN brunel.ac.uk. brunel.ac.uk. IN SOA sirius.brunel.ac.uk. hostmaster.brunel.ac.uk. ( ; Serial number ; Refresh after 2hrs 13min ; Retry after 2hrs ; Expire after 1wk ; Minimum TTL of 6hrs ) IN NS sirius.brunel.ac.uk. IN NS ns3.ja.net. IN MX 5 nemesis.brunel.ac.uk. IN MX 4 eros.brunel.ac.uk. s70n IN A s249n IN A s249n IN A … … … Serial number we’ve already talked about Refresh interval tells slaves how frequently to check back with the primary Retry tells the slave how frequently to try if the initial connection fails Expire tells the slave “If you haven’t talked to the master in this interval then your data is old, and stop answering for this domain” TTL (“Time to live”) is how long this data can be cached for

IP Characteristics of DNS
DNS servers listen on port 53 Generally uses UDP Very short communication lifespan TCP overhead is too high Protocol is simple and robust Didn’t get an answer? Just send the query again May use TCP where appropriate Zone transfers between primary and secondary servers

Smart things about DNS Simple mechanism for synchronising primary and secondary servers Distributes data throughout the network, no real single point of failure for the Internet With the exception of the root nameservers DDoS Attacks DDoS attacks? Distributed Denial of Service. It’s what happens when you get a few million hosts all flooding a root nameserver with queries at the same time. How do you get a few million hosts all doing this? First, pick an OS that’s relatively insecure, but often connected directly to the Internet. Such as all those Windows boxes that are now connected directly to the ‘net using ADSL or cable broadband connections. Now, write an exploit for these hosts that installs itself and lies dormant, until they all wake up at the same time and simultaneously flood the root nameservers with requests. The most recent one of these was a few weeks ago. Took 4 of the roots out of commission for a brief period of time. Most people never noticed, since you need to lose about 8 roots before things start going to hell in a handbasket.

Bad things about DNS Not secure, you have to trust your DNS server
Always do a forward lookup after a reverse lookup DNS server is a single point of failure for a network’s presence on the Internet So make sure that multiple secondary servers exist On different, geographically disparate networks Forward lookup after a reverse lookup? Suppose that you receive a connection from a host with the IP address A.B.C.D., and you look that up in the DNS and get told that A.B.C.D corresponds to However, you can’t trust this, because the DNS server may have been told to lie. So then you have to do a DNS lookup for and confirm that its IP address matches A.B.C.D. If it doesn’t, you know that someone is trying to fool you.
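The forward-after-reverse check can be sketched with the two lookups passed in as functions, so it can be tried against made-up data. The PTR and A tables below are hypothetical; in real use you might plug in socket.gethostbyaddr and socket.gethostbyname_ex from the standard library instead.

```python
from typing import Callable

def forward_confirmed(
    ip: str,
    reverse: Callable[[str], str],
    forward: Callable[[str], list[str]],
) -> bool:
    """True only if the name from the reverse lookup maps back to the same IP."""
    name = reverse(ip)
    return ip in forward(name)

# Hypothetical DNS data standing in for real lookups.
PTR = {"203.0.113.5": "host.example.org", "203.0.113.9": "www.example.com"}
A = {"host.example.org": ["203.0.113.5"], "www.example.com": ["198.51.100.1"]}

def reverse_lookup(ip: str) -> str:
    return PTR[ip]

def forward_lookup(name: str) -> list[str]:
    return A.get(name, [])

print(forward_confirmed("203.0.113.5", reverse_lookup, forward_lookup))
print(forward_confirmed("203.0.113.9", reverse_lookup, forward_lookup))
```

The second address claims to be www.example.com, but the forward lookup disagrees, so someone is trying to fool you.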

Bad things about DNS (cont.)
Difficult to do updates ‘on demand’ There are enhancements that try to address this But they’re not widely deployed Commercial interests

Simple Mail Transfer Protocol
(SMTP)

SpaM Transport Protocol
What it sometimes feels like

A word from our sponsor…
Wed 13th to 16th November 2003 Compass Theatre, Ickenham £5.00, £6.50 or £7.50 I’m in it as myself. “Nail it to the counter Lord Fergason and damn the cheesmongers!”

An e-mail message consists of…
Envelope Contains addressing information Discarded once the message is successfully delivered Header Contains 1-n “name: value” fields From:, To:, CC:, BCC:, Subject:, Date:, Received:, X-Foo:, X-Bar:, etc… Body Unstructured text of the actual message

Sample SMTP conversation
# telnet eros.brunel.ac.uk ************
HELO ngo.dnsalias.org
250 eros.brunel.ac.uk OK
MAIL FROM:
OK
RCPT TO:
Recipient OK
DATA
354 Enter Mail, end by a line with only ‘.’
From: (Nik Clayton)
To: (Simon Taylor)
Subject: Slides for lecture
Sorry mate, no chance I’ll have the slides ready in time, we’ll need to fake something. But keep it to yourself, I don’t think they’ll notice.
Nik
Submitted & queued (msg )
QUIT
eros.brunel.ac.uk says goodbye to ngo.dnsalias.org
“MAIL FROM:” and “RCPT TO:” lines are the envelope
First lines after “DATA” are the message headers
Then a blank line
Then the body of the message, terminated by a line with a sole dot.

SMTP Highlights Protocol is entirely plain text
Easy to debug Easy to test by hand Easy to script Protocol is relatively simple Easy to write code for (Microsoft excepted) Protocol is unambiguous All information is contained in the status codes. The explanatory text is useful but ignored by implementations

SMTP Highlights (cont.)
Protocol is consistent 2xx codes indicate success 3xx codes indicate ‘send more data’ 4xx codes indicate temporary failures 5xx codes indicate permanent failures The ‘xx’s provide further delineation SMTP implementations are supposed to be paranoid
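Since everything an implementation needs is in the first digit of the status code, a classifier is tiny. This is a sketch with a made-up function name; it deliberately ignores the explanatory text.

```python
def classify(reply: str) -> str:
    """Classify an SMTP reply line by the first digit of its status code."""
    meanings = {
        "2": "success",
        "3": "send more data",
        "4": "temporary failure",
        "5": "permanent failure",
    }
    code = reply[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply: {reply!r}")
    return meanings.get(code[0], "unknown")

print(classify("250 eros.brunel.ac.uk OK"))
print(classify("354 Enter Mail, end by a line with only '.'"))
print(classify("502 Error: command not implemented"))
```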

A real SMTP failure We had an application that was a buggy SMTP server
Sometimes it failed to send back a valid SMTP response after generating a bounce message The client didn’t know whether or not the message was delivered, temp. failed, or perm. failed So it tried, tens of times a second, to resend the message This generated thousands of bounce messages very quickly

The Envelope and Bcc: From: To: Bcc: . . . MAIL FROM: RCPT TO: Recipient OK RCPT TO: Recipient OK DATA From: (Nik Clayton) To: (Simon Taylor) Here’s a sample message sent by me, to Simon. I’ve also BCC’d it to someone else. It’s the nature of the BCC that Simon is not supposed to know that I’ve done this, so there’s no ‘BCC’ header to show who the message was BCC’d to. So there are two recipients for the message, shown by the two RCPT lines in the envelope portion of the SMTP transaction.
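Here is a sketch of how a mail client might build the envelope recipient list and then drop the Bcc: header, using Python’s standard email package. The addresses are made up, and real MUAs differ in exactly when they strip the header.

```python
from email.message import EmailMessage
from email.utils import getaddresses

def envelope_recipients(msg: EmailMessage) -> list[str]:
    """Collect every address for RCPT TO, then remove the Bcc: header."""
    fields = (msg.get_all("To", []) + msg.get_all("Cc", [])
              + msg.get_all("Bcc", []))
    recipients = [addr for _, addr in getaddresses([str(f) for f in fields])]
    del msg["Bcc"]          # the other recipients must never see this header
    return recipients

msg = EmailMessage()
msg["From"] = "nik@example.com"
msg["To"] = "simon@example.com"
msg["Bcc"] = "someone@example.org"
msg.set_content("Slides attached.")

recipients = envelope_recipients(msg)
print(recipients)           # one RCPT TO per address
print(msg["Bcc"])           # the header is gone from the message itself
```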

Sample Received: Lines
Received: from localhost [ ]) by crf-consulting.co.uk (8.12.3/8.12.3) with ESMTP id g9GFo4Tk093919 for Wed, 16 Oct :50: (BST) (envelope-from
Received: from ngo.org.uk [ ] by localhost with POP3 (fetchmail ) for (single-drop); Wed, 16 Oct :50: (BST)
Received: from nemesis.brunel.ac.uk (nemesis.brunel.ac.uk [ ]) by ngo.org.uk (8.9.3/8.9.3) with ESMTP id RAA07600 for Wed, 16 Oct :01: (BST)
Received: from csstsjt (actually s76n96.brunel.ac.uk) by nemesis.brunel.ac.uk with SMTP-BRUNEL (PP) with ESMTP; Wed, 16 Oct :47:
One Received: line is added to the top of the message by each MTA the message goes through. They’re very useful in diagnosing problems, or tracking down delays, since they include information about which system the message went through, and when it went through it. Here are some Received: lines from a message I grabbed from my archives.

Re-ordered Received: lines
Received: from csstsjt (actually s76n96.brunel.ac.uk) by nemesis.brunel.ac.uk with SMTP-BRUNEL (PP) with ESMTP; Wed, 16 Oct :47: Received: from nemesis.brunel.ac.uk (nemesis.brunel.ac.uk [ ]) by ngo.org.uk (8.9.3/8.9.3) with ESMTP id RAA07600 for Wed, 16 Oct :01: (BST) Received: from ngo.org.uk [ ] by localhost with POP3 (fetchmail ) for (single-drop); Wed, 16 Oct :50: (BST) Received: from localhost [ ]) by crf-consulting.co.uk (8.12.3/8.12.3) with ESMTP id g9GFo4Tk093919 for Wed, 16 Oct :50: (BST) (envelope-from Because the lines are added to the start of the headers by each MTA, you need to read them in reverse order. This is the same set of received: lines, but sorted in order.
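Putting the lines into travel order is a one-liner once the headers are parsed. The message below is made up (hosts, dates and all) and only stands in for the real one on the slide.

```python
from email import message_from_string

RAW = (
    "Received: from relay.example.net by mx.example.org;"
    " Wed, 16 Oct 2002 16:50:00 +0100\n"
    "Received: from sender.example.com by relay.example.net;"
    " Wed, 16 Oct 2002 16:47:00 +0100\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

def hops_in_travel_order(raw_message: str) -> list[str]:
    """Return the Received: values oldest first (the reverse of header order)."""
    msg = message_from_string(raw_message)
    return list(reversed(msg.get_all("Received", [])))

for hop in hops_in_travel_order(RAW):
    print(hop)
```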

Re-ordered Received: lines
Received: from csstsjt (actually s76n96.brunel.ac.uk) by nemesis.brunel.ac.uk with SMTP-BRUNEL (PP) with ESMTP; Wed, 16 Oct :47: Received: from nemesis.brunel.ac.uk (nemesis.brunel.ac.uk [ ]) by ngo.org.uk (8.9.3/8.9.3) with ESMTP id RAA07600 for Wed, 16 Oct :01: (BST) Received: from ngo.org.uk [ ] by localhost with POP3 (fetchmail ) for (single-drop); Wed, 16 Oct :50: (BST) Received: from localhost [ ]) by crf-consulting.co.uk (8.12.3/8.12.3) with ESMTP id g9GFo4Tk093919 for Wed, 16 Oct :50: (BST) (envelope-from And here’s some of the useful information they provide highlighted. Notice the time discrepancy between lines 1, 2, 3. Looks like the host in the second line (ngo.org.uk) has the time set incorrectly. This is a real problem that I noticed while preparing these slides.

Acronyms MTA = Mail Transfer Agent
The software that routes messages from host to host (Sendmail, Postfix, Qmail, Exchange (cough)) MUA = Mail User Agent The software that lets users send and receive mail (Outlook, Eudora, etc) PBCK = Problem Between Chair and Keyboard A user. See also “DFU”

Mail Routing I tap in into my MUA. What happens? MUA hands message off to local MTA Local MTA uses the DNS to look up MX records for brunel.ac.uk MX record?

MX Records Are entries in the DNS
Unlike most other DNS entries (A records, etc), they contain two pieces of information A FQDN A weight / preference A domain (brunel.ac.uk) may have multiple MX records, listing different FQDNs and weights, providing redundancy Hosts acting as MXs for a domain do not need to be in the same domain as the domain they are acting as MXs for (!)

Brunel and Citigroup MX records
Brunel
Weight Host
4 eros.brunel.ac.uk
5 nemesis.brunel.ac.uk
Citigroup
Weight Host
50 mail1.citigroup.com
50 mail2.citigroup.com
50 mail3.citigroup.com
50 mail4.citigroup.com
50 mail5.ssmb.com
Anything trying to send mail to a brunel.ac.uk address will try to send it via eros.brunel.ac.uk first. If that doesn’t work (either eros is not available, or it returns a 4xx temporary failure error) then it will try talking to nemesis.brunel.ac.uk. If that also fails temporarily then the message will be retried later. If it fails permanently at any point then the message is bounced. Anything trying to send mail to a citigroup.com address will randomise the list of hosts (mail1 thru mail5) because the MX weights are identical, and then contact them in turn.

Mail Routing (cont.) The local MTA sorts the MX results in order of their weight (lowest first) It does a DNS lookup for the IP address(es) of the first FQDN in the list It tries to connect to that IP address on port 25 If the connection succeeds it tries to deliver the message If the connection fails, or the delivery attempt failed with a temporary error, it tries again, with the next MX record in the list
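The sort-by-weight step, including the shuffle for equal weights, can be sketched as follows. The function name is made up; the records are the Brunel and Citigroup ones from the earlier slide, with all five Citigroup weights assumed to be 50.

```python
import random

def delivery_order(mx_records: list[tuple[int, str]]) -> list[str]:
    """Order MX hosts lowest weight first, shuffling hosts of equal weight."""
    shuffled = random.sample(mx_records, k=len(mx_records))
    # sorted() is stable, so hosts with the same weight keep their shuffled order.
    return [host for _, host in sorted(shuffled, key=lambda record: record[0])]

brunel = [(5, "nemesis.brunel.ac.uk"), (4, "eros.brunel.ac.uk")]
citigroup = [(50, "mail1.citigroup.com"), (50, "mail2.citigroup.com"),
             (50, "mail3.citigroup.com"), (50, "mail4.citigroup.com"),
             (50, "mail5.ssmb.com")]

print(delivery_order(brunel))       # eros always comes first
print(delivery_order(citigroup))    # order varies from run to run
```

The MTA would then work down the list, moving on to the next host whenever a connection fails or returns a temporary error.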

Mail Routing (cont.) The MTA will queue messages for a period of time (5 days is typical) It will make regular attempts to re-deliver messages that generated temporary failures Failure after a certain period (normally 4 hours) may generate a “We are still trying to deliver your message” note to the envelope sender address Messages that generate a permanent failure from any of the MX hosts are not retried, and are bounced Bounces go to the envelope sender address, not the From: address

Citigroup Mail Backbone Structure
Internet, Anti-spam, Anti-virus, Archiving, Address re-writing, Exchange Servers
It’s very common, especially in large environments, to have layers of mail servers. The anti-spam boxes are the mail1 thru mail5 hosts in the citigroup.com MX records. The other layers are not visible to the Internet. Each layer has multiple hosts in it, providing redundancy and additional performance, and each layer needs to know how to reach the next layer in the chain, without needing to know how the whole structure is organised. There are some problems with this approach. For example, it’s only at the address re-writing stage that we know whether or not the recipient address is valid. So if it’s invalid we need to generate a bounce message inside, and propagate it back out. This could be fixed by making sure that the anti-spam servers know which addresses are valid, and which ones aren’t, and that’s something that we’re working on.

IP Characteristics of SMTP
SMTP servers listen on port 25 Always uses TCP Relatively long communication lifespan TCP overhead is acceptable TCP ensures packets are resent as necessary

Extending SMTP Turns out that, as originally specified, SMTP doesn’t do some useful things So ESMTP was invented But how do you do this without breaking all the existing implementations? Hmm…

Extending SMTP (cont.) Get out clause in the original SMTP spec
If an SMTP server receives a command it doesn’t understand, it: Does not drop the connection Returns an error code (5xx) Pretends it never received the command Robustness in action, and a stroke of genius

Extending SMTP (cont.) EHLO - Extended HELO
Replaces ‘HELO’ in the beginning of the SMTP spec If a server responds to EHLO with a 2xx code you know it speaks ESMTP If it responds with a 5xx code then you fall back to regular SMTP, and immediately send a HELO.

EHLO in action 220 issaspam-ny01.ssmb.com ESMTP Go ahead EHLO ngo.dnsalias.org 250-issaspam-ny01.ssmb.com Hello 250-ENHANCEDSTATUSCODES 250-PIPELINING 250-8BITMIME 250-SIZE DSN 250-DELIVERBY 250 HELP MAIL FROM: Here’s EHLO in action. Issaspam-ny01 understands ESMTP, so it responds to the EHLO with a 250 status code (indicating success). It then sends back a series of lines advertising which additional ESMTP functionality it supports. Notice that all but the last line is digit-digit-digit-dash, not digit-digit-digit-space. The dash indicates that there’s more data to come. Amongst other things that this server says it supports are 8 bit messages (encoded using MIME), Delivery Status Notifications (DSNs), which allow for things like ‘has the recipient read the message’ to be returned back to the sender. The system also advertises the maximum message size it’s prepared to accept, in this case, 25MB.
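The dash-versus-space rule is simple to parse. This sketch uses a made-up server name and a made-up SIZE value (26214400 bytes, i.e. 25MB); the function name is my own.

```python
def parse_ehlo(lines: list[str]) -> list[str]:
    """Return the extension keywords advertised in a multi-line 250 reply."""
    extensions = []
    for number, line in enumerate(lines):
        code, separator, text = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"EHLO not accepted: {line!r}")
        if number > 0:               # the first line is just the greeting
            extensions.append(text.split()[0])
        if separator == " ":         # a space, not a dash, marks the last line
            break
    return extensions

reply = [
    "250-mail.example.com Hello",
    "250-PIPELINING",
    "250-8BITMIME",
    "250-SIZE 26214400",
    "250 HELP",
]
print(parse_ehlo(reply))
```

A 5xx reply raises an error here; a real client would fall back to HELO at that point instead.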

EHLO failing
    220 smtp.example.com
    EHLO ngo.dnsalias.org
    502 Error: command not implemented
    HELO ngo.dnsalias.org
    250 OK
    MAIL FROM:
This is what happens if the remote host doesn’t support ESMTP. It fails to understand the EHLO, returning a 502 error. But it doesn’t close the connection, so the sending server can then fall back to regular SMTP, and send the message that way.
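The sender's side of that fallback comes down to one decision on the reply code. A small sketch, with made-up names (after_ehlo and the enum values); treating anything that is neither 2xx nor 5xx as "come back later" is an assumption of this example, not something the slides specify.

```c
/* What the sending side does after seeing the reply to its EHLO. */
enum next_step {
    USE_ESMTP,       /* 2xx: the server speaks ESMTP */
    SEND_HELO,       /* 5xx: fall back to regular SMTP, send HELO */
    TRY_AGAIN_LATER  /* anything else (assumption of this sketch) */
};

enum next_step after_ehlo(int reply_code)
{
    if (reply_code >= 200 && reply_code < 300)
        return USE_ESMTP;
    if (reply_code >= 500 && reply_code < 600)
        return SEND_HELO;
    return TRY_AGAIN_LATER;
}
```

In the two transcripts above, the 250 reply leads to USE_ESMTP and the 502 reply leads to SEND_HELO on the same connection.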

A better way of solving the problem
Always embed version information into your protocols
  The version should be the first piece of information in any transaction
  Defines the format of the rest of the transaction
But, still allow unimplemented commands to fail gracefully
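As a rough illustration of version-first design, here is a sketch for an entirely made-up protocol whose first line is "VERSION n". The line format, the HIGHEST_SUPPORTED_VERSION constant, and the function name are all hypothetical; nothing here is part of SMTP.

```c
#include <stdio.h>

/* Hypothetical: the newest protocol version this implementation speaks. */
#define HIGHEST_SUPPORTED_VERSION 2

/* Read the version from the first line of a transaction.
 * Returns the version number if we can speak it, or -1 if the line is
 * malformed or names a version we do not support. The caller uses the
 * result to choose how to parse everything that follows. */
int parse_version_line(const char *first_line)
{
    int version;

    if (sscanf(first_line, "VERSION %d", &version) != 1)
        return -1;
    if (version < 1 || version > HIGHEST_SUPPORTED_VERSION)
        return -1;
    return version;
}
```

With the version known up front, neither side has to probe with a command that might fail, which is the trick EHLO is forced to play.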

Nice things about SMTP
It’s distributed from the get-go, and it scales
  Need more servers? Add them, and update your MX records
It’s open and royalty free
  SMTP is fully documented in RFC 2821
  Message format is in RFC 2822
Heterogeneous
  Nothing in SMTP ties it to a particular platform

More nice things about SMTP
It’s resilient, and failures are handled
  MX server not responding? Go try another one
  Are they all down? Wait a bit, and try again
It distinguishes between temporary errors
  Disk’s full, I can’t accept any mail at the moment, so try again later
And permanent errors
  The address you’ve provided is invalid, I’m never going to be able to deliver it.
Hides implementation details from the user
  User doesn’t need to know the route the message takes
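The failure handling above can be sketched as a single decision function. The names (next_action, the enum values) and the convention that a reply code of 0 means "this MX did not respond" are assumptions for the example. In SMTP, 4xx replies are the temporary errors and 5xx replies are the permanent ones.

```c
enum delivery_action {
    DELIVERED,    /* 2xx: the server accepted the message */
    TRY_NEXT_MX,  /* no response, and there are other MX hosts to try */
    RETRY_LATER,  /* temporary problem: queue the message and wait a bit */
    BOUNCE        /* 5xx: permanent error, tell the sender */
};

/* reply_code:   the SMTP reply from the current MX host, or 0 if the
 *               host did not respond (convention of this sketch).
 * more_mx_left: nonzero if there are MX hosts we have not tried yet. */
enum delivery_action next_action(int reply_code, int more_mx_left)
{
    if (reply_code >= 200 && reply_code < 300)
        return DELIVERED;
    if (reply_code >= 500 && reply_code < 600)
        return BOUNCE;
    if (reply_code == 0 && more_mx_left)
        return TRY_NEXT_MX;
    /* 4xx replies, or every MX host is down. */
    return RETRY_LATER;
}
```

A "disk full" reply such as 452 ends up in RETRY_LATER, while an invalid address (550) goes straight to BOUNCE.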

Nice things about SMTP..?
Secure? Not really
  Relatively simple to forge mail
  Harder to forge it perfectly
  Does not address encryption or authentication of message contents
Nobody’s perfect

Thanks Questions?

Bonus Slides

Things I wish I knew 10 years ago
Work for a small company
  You learn a lot very quickly
  The hours can be insane
  You can accomplish a lot very fast
Work for a large company
  You tend to specialise
  Regular hours
  Bureaucracy is ever-present

More things to know
Attend conferences
  You learn a lot
  The networking (people kind) is invaluable
  Speaking at them is great for the CV
  It also forces you to think clearly about a subject
  Never neglect the social side
Travel whenever possible
  San Francisco is great in the summer

Still more things to know
Always be aware of the Peter Principle
Read “The Mythical Man-Month”, Brooks
Learn the Perl programming language
Stay up to date with the technical journals
Find time to have a life

Pseudo-code for a server
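    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int s;                    // The listening socket handle
        int client;               // Socket handle for a connected client
        struct sockaddr_in addr;  // The socket address

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);  // Any local interface
        addr.sin_port = htons(80);                 // We'll listen on port 80
        s = socket(AF_INET, SOCK_STREAM, 0);       // Create socket
        // Assign the address info we specify to the socket
        if (s < 0 || bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return 1;
        listen(s, 5);             // Queue up to 5 pending connections
        while ((client = accept(s, NULL, NULL)) >= 0) {
            // If we're here then something's connected to us.
            // Do whatever we're supposed to do when this happens
            close(client);
        }
        return 0;
    }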

Pseudo-code for a client
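    #include <arpa/inet.h>
    #include <netdb.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    // host: name of the remote host we want to connect to
    int connect_to(const char *host)
    {
        int s;                    // The socket handle
        struct sockaddr_in addr;  // The socket address
        struct hostent *he;       // Info about the remote host

        // Get the IP address of the host we want to connect to
        he = gethostbyname(host);
        if (he == NULL)
            return -1;
        s = socket(AF_INET, SOCK_STREAM, 0);  // Create socket
        // Store the IP address, and the port we connect to
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        memcpy(&addr.sin_addr, he->h_addr_list[0], he->h_length);
        addr.sin_port = htons(80);
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            // Connected to the remote host.
            // …
        }
        close(s);                 // All done
        return 0;
    }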

DNS diagram labels: me.example.com, dns.example.com, Root Nameserver, .org Nameserver, ns.freebsd.org Nameserver
