Dovecot IMAP Server http://www.dovecot.org/ Timo Sirainen August 2008.

Presentation transcript:

Dovecot IMAP Server http://www.dovecot.org/ Timo Sirainen August 2008

Dovecot Pictures from Wikipedia, by Cyril Thomas and Carcharoth

Features
- Often has better performance than the competition
  - Optimized for minimizing disk I/O (index/cache files)
- Highly configurable for different environments
  - Support for standard mbox and maildir formats, as well as a new Dovecot-specific high-performance dbox format
  - Supports NFS and clustered filesystems; support for internal multi-master replication is coming soon
- Extremely flexible authentication
  - Postfix and Exim support Dovecot for SMTP AUTH
- Admin-friendly / self-healing
  - All errors are logged
  - Understandable error messages
  - Detected (index) corruption gets fixed automatically

History
- Dovecot design was started around June 2002
  - Why?
- First release was July 2002
- Late 2003 a redesign started
- v1.0.0 released April 13th 2007
- v1.1.0 released June 21st 2008
- v1.2 dev tree already has a lot of new features
- 95% of code is written by me; others have mostly written authentication-related code

Development
- All discussions take place on the mailing list
  - I try to answer all questions others don't answer
- Mercurial for version control
  - Distributed VCS should make it easier for others to contribute code
- Currently no bug tracking system
  - I fear it would make my life more difficult
  - A BTS that is fully integrated with the mailing list would be nice

Code Design
- Written in C
- Uses several Dovecot-specific APIs to make coding easier and security holes harder to introduce
  - Memory pools, data stack (avoid free())
  - Buffers, strings and type-safe arrays
  - Stackable input/output streams
  - Some say it's very unlike any other C code
- Prefers to (assert-)crash rather than continue with possibly bad state
- Unit tests are slowly being added. Help would be appreciated.

Dovecot Processes
1. IMAP command: LOGIN username password
2. Forward username and password to auth process
3. Success/Failure reply (reason isn't returned; see log for that)
4. "Log me in" request: TCP socket fd sent via UNIX socket
5. Auth verification (to make sure pre-login didn't fake it)
6. Returns userdb fields (home, UNIX UID & GID, etc.) or "Internal failure" (practically never)
7. a) Returns success / failure; pre-login stops IMAP processing
   b) IMAP process forked & fd transferred
8. IMAP reply: OK Logged in.
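The "TCP socket fd sent via UNIX socket" step can be illustrated with a minimal Python sketch. It is not Dovecot's C code: the function names `hand_over_connection` and `accept_connection` are hypothetical, and it assumes a Unix system with Python 3.9+ (for `socket.send_fds` / `socket.recv_fds`).

```python
import socket
from typing import Tuple


def hand_over_connection(channel: socket.socket, client_fd: int, username: str) -> None:
    """Pre-login side: send the username plus the client's fd over a UNIX socket."""
    socket.send_fds(channel, [username.encode()], [client_fd])


def accept_connection(channel: socket.socket) -> Tuple[str, int]:
    """Receiving side: get the username and a duplicate of the client's fd."""
    data, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return data.decode(), fds[0]


if __name__ == "__main__":
    # One socket pair stands in for the UNIX socket between the two processes,
    # another for the client's TCP connection (both are assumptions for the demo).
    login_side, imap_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    client, server_end = socket.socketpair()

    hand_over_connection(login_side, server_end.fileno(), "alice")
    user, fd = accept_connection(imap_side)

    # The receiver can now talk to the client directly through the passed fd.
    with socket.socket(fileno=fd) as conn:
        conn.sendall(b"a1 OK Logged in.\r\n")
    print(user, client.recv(1024))
```

The receiving process ends up with its own descriptor for the same connection, so the pre-login process can stop handling the client once the transfer succeeds.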

Authentication
- Authentication mechanisms: PLAIN, CRAM-MD5, DIGEST-MD5, Kerberos, etc.
- Password schemes: Plaintext, CRYPT, MD5, SHA1, SHA256, SSHA, etc.
- Password databases: mostly a user <-> password mapping (PAM, SQL, LDAP, etc.)
- User databases: user's home dir, UNIX UID & GID, other settings like quota (passwd, SQL, LDAP, etc.)
- Passdb/userdb separation allows e.g. passdb PAM + userdb LDAP, or passdb SQL + userdb static
- Support for multiple dbs: both system (passwd) and virtual (e.g. SQL) users (or for any other reason)
- SQL/LDAP lookups are fully configurable
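To make the passdb/userdb split concrete, here is a small Python sketch of a password lookup followed by a separate, static user lookup. It only illustrates the idea: `PASSDB`, `userdb_static`, `authenticate`, the home path and the UID/GID values are all hypothetical. The password scheme shown is salted SHA1 (SSHA) in its usual layout, base64(SHA1(password + salt) + salt).

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def ssha_hash(password: str, salt: Optional[bytes] = None) -> str:
    """Return '{SSHA}' + base64(sha1(password + salt) + salt)."""
    salt = os.urandom(4) if salt is None else salt
    digest = hashlib.sha1(password.encode() + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode()


def ssha_verify(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    raw = base64.b64decode(stored[len("{SSHA}"):])
    digest, salt = raw[:20], raw[20:]  # a SHA1 digest is 20 bytes long
    candidate = hashlib.sha1(password.encode() + salt).digest()
    return hmac.compare_digest(candidate, digest)


# Hypothetical passdb: username -> password hash (could be SQL, LDAP, PAM, ...).
PASSDB = {"alice": ssha_hash("secret")}


def userdb_static(username: str) -> dict:
    """Hypothetical static userdb: same UID/GID for everyone, home from a template."""
    return {"home": f"/var/vmail/{username}", "uid": 5000, "gid": 5000}


def authenticate(username: str, password: str) -> Optional[dict]:
    """Passdb decides success or failure; userdb supplies the user's settings."""
    stored = PASSDB.get(username)
    if stored is None or not ssha_verify(password, stored):
        return None
    return userdb_static(username)
```

Because the two lookups are independent functions, either one could be swapped for a different backend without touching the other, which is the point of the separation.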

IMAP Protocol
- Base protocol is complex: difficult to implement correctly (both client & server)
- Flexible: many different ways to implement a client (online & offline, defined later)
- Extensible: there are a lot of extensions
- IETF groups:
  - imapext created many extensions over many years (ACL, SORT, THREAD, etc.). Shut down in June 2008.
  - Lemonade contains many extensions mainly intended for mobile clients (forward-without-download, etc.)
  - Message Organizing (morg) group is starting up (e.g. multi-mailbox search, mailbox metadata, new comparator, etc.)
- Talks about a simplified IMAP5 protocol have started

Dovecot’s IMAP Extensions
- v1.0: SASL-IR SORT THREAD=REFERENCES MULTIAPPEND UNSELECT LITERAL+ IDLE CHILDREN NAMESPACE
- v1.1: UIDPLUS LIST-EXTENDED I18NLEVEL=1 STATUS-IN-LIST (draft)
- v1.2: CONDSTORE QRESYNC WITHIN ID SEARCHRES SEARCH=INTHREAD ESEARCH
- Future: Lemonade extensions (CATENATE, URLAUTH, NOTIFY, ...)

ImapTest
- IMAP server tester
- Written originally for Dovecot stress testing
  - Found a lot of crashes, hangs and mailbox corruption on other IMAP servers as well
- Tests IMAP server compliance with static tests and dynamic random stress testing
  - Dovecot is currently the only IMAP server that fully passes all of ImapTest's tests
  - Most other servers fail in many different ways
  - "Professional" IMAP servers from large companies are among the worst
- http://imapwiki.org/ImapTest

IMAP Server Performance
- Difficult to benchmark
- Depends a lot on clients (online vs. offline, more on next slides)
- What data to index/cache?

Offline clients
- Typically download the newly seen messages' bodies once and cache them locally
  - Often can be configured to download immediately vs. download when reading
  - Some use server-side searches (Thunderbird) and some don't (Outlook: if some messages haven't been downloaded, those aren't searched)
- Usually also fetch messages' metadata once (headers, received date)
- Caching may help, but not that much

Online clients
- Webmails often keep asking for the same information over and over and over again
- Pine and some webmails cache what they've already seen, but not permanently
- Mutt (without local cache) and some others fetch all messages' metadata every time a mailbox is opened
- Caching is very useful, but different clients want different metadata

Dovecot Cache File
- Dynamic: caches only what clients want
  - Specific message headers (From:, Subject:, etc.)
  - Message MIME structure information
  - Message sent / received date, etc.
- Caching decisions for each field: "no", "temporary", "permanent"
  - Unused fields dropped after a month
- Cached data never changes (IMAP guarantees)
- Cache file gets "compressed" once in a while
- Often about 10-20% of mailbox size
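The per-field caching decisions can be pictured with a toy Python model. The slide only names the three states and the one-month drop, so the transition rules here are assumptions made for illustration (first request makes a field "temporary", a repeated request promotes it to "permanent"); the class and method names are hypothetical too.

```python
from dataclasses import dataclass
from typing import Dict

DAY = 86400
DROP_AFTER = 30 * DAY  # "unused fields dropped after a month"


@dataclass
class FieldState:
    decision: str = "no"      # one of "no", "temporary", "permanent"
    last_used: float = 0.0    # unix timestamp of the last client request


class CacheDecisions:
    def __init__(self) -> None:
        self.fields: Dict[str, FieldState] = {}

    def on_request(self, name: str, now: float) -> str:
        """A client asked for this field; start or keep caching it."""
        state = self.fields.setdefault(name, FieldState())
        if state.decision == "no":
            state.decision = "temporary"
        elif state.decision == "temporary":
            # Assumed promotion rule, not taken from the slides.
            state.decision = "permanent"
        state.last_used = now
        return state.decision

    def expire(self, now: float) -> None:
        """Stop caching fields nobody has asked for within a month."""
        for state in self.fields.values():
            if now - state.last_used > DROP_AFTER:
                state.decision = "no"
```

A webmail that fetches Subject: on every page load would keep that field cached, while a header requested once by a one-off client would fall out again after a month.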

Dovecot Index Files
- dovecot.index contains current metadata
  - Fixed size records only, one per message
  - IMAP Unique ID number (UID) identifies messages
  - Flags (\Seen, \Answered, etc.)
  - Keywords (aka tags, labels, custom flags) as a bitmask (optimized for few keywords)
  - Extension data: mbox file offsets, cache file offsets, modseq number (v1.2 CONDSTORE), etc.
- Lazily created/updated since v1.1
  - dovecot.index.log has all the latest changes
  - dovecot.index is updated after 1 kB of new data has been written to the .log
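Fixed-size records make lookups cheap: record i sits at byte offset i × record size, and since UIDs only grow, a UID can be found by binary search. The Python sketch below shows this with a made-up 6-byte layout (UID, flag bits, keyword bitmask); the layout, the bit values and the function names are assumptions, not Dovecot's actual on-disk format.

```python
import struct
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: uint32 UID, uint8 flag bits, uint8 keyword bitmask.
RECORD = struct.Struct("<IBB")

# Hypothetical bit assignments for IMAP system flags.
SEEN, ANSWERED, FLAGGED, DELETED = 0x01, 0x02, 0x04, 0x08


def pack_records(records: Iterable[Tuple[int, int, int]]) -> bytes:
    """Serialize (uid, flags, keywords) tuples, which must be sorted by UID."""
    return b"".join(RECORD.pack(*rec) for rec in records)


def read_record(buf: bytes, index: int) -> Tuple[int, int, int]:
    """Fetch the record at a given position without scanning earlier ones."""
    return RECORD.unpack_from(buf, index * RECORD.size)


def find_uid(buf: bytes, uid: int) -> Optional[int]:
    """Binary-search the record position of a UID; None if it is not present."""
    count = len(buf) // RECORD.size
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if read_record(buf, mid)[0] < uid:
            lo = mid + 1
        else:
            hi = mid
    if lo < count and read_record(buf, lo)[0] == uid:
        return lo
    return None
```

Storing keywords as a bitmask keeps the record size fixed, at the cost of a limit on how many distinct keywords fit, which matches the slide's note that the format is optimized for few keywords.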

Dovecot Index Files
- dovecot.index.log contains the transaction log
  - Somewhat similar to databases' transaction logs or filesystem journals
  - Contains all changes to be done to dovecot.index
- After dovecot.index is read once, Dovecot usually never reads it again but only updates the in-memory copy from dovecot.index.log
  - Very efficient with NFS / clustered filesystems!
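The "read once, then follow the log" idea can be sketched in Python as a view that remembers how far into the log it has read and applies only the new records. The text record format (`append`, `flag`, `expunge`) and all names are hypothetical; the real log is a binary format.

```python
import os
import tempfile
from typing import Dict, Set


def append_record(log_path: str, record: str) -> None:
    """Writer side: append one change to the end of the log."""
    with open(log_path, "ab") as log:
        log.write(record.encode() + b"\n")


class MailboxView:
    """In-memory mailbox state that is refreshed from the log tail only."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.offset = 0                      # how much of the log was applied
        self.flags: Dict[int, Set[str]] = {}

    def sync(self) -> int:
        """Apply records written since the last sync; return bytes consumed."""
        with open(self.log_path, "rb") as log:
            log.seek(self.offset)
            data = log.read()
        end = data.rfind(b"\n") + 1          # ignore a partially written record
        for line in data[:end].decode().splitlines():
            op, uid, *args = line.split()
            if op == "append":
                self.flags[int(uid)] = set()
            elif op == "flag":
                self.flags.setdefault(int(uid), set()).add(args[0])
            elif op == "expunge":
                self.flags.pop(int(uid), None)
        self.offset += end
        return end


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "dovecot.index.log")
    append_record(path, "append 1")
    append_record(path, "flag 1 \\Seen")
    view = MailboxView(path)
    view.sync()
    print(view.flags)
```

Each sync reads only the bytes appended since the previous one, which is why this pattern suits NFS and clustered filesystems where rereading a whole index would be expensive.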

Locking
- Dovecot uses several techniques to avoid traditional read/write locking (no waiting!)
- dovecot.index.log is currently write-locked when writing; reads are lockless
  - O_APPEND could be used to make writes lockless
- dovecot.index is read-locked. If write-locking fails, the file is recreated instead of waiting.
- dovecot.index.cache does short write locks to reserve space. Reads are lockless.
- Maildir syncing requires locking (or inotify)
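The "recreate instead of waiting" rule for dovecot.index can be sketched in Python with a non-blocking lock attempt. This is an assumption-laden illustration: it uses `flock()` on a Unix system, whereas the slides do not say which locking primitive Dovecot uses, and `write_index` is a hypothetical helper.

```python
import fcntl
import os
import tempfile


def write_index(path: str, data: bytes) -> str:
    """Update the file in place if the lock is free, otherwise replace it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone holds a lock: write a fresh file and rename it over
            # the old one. Existing readers keep their old copy open.
            tmp_path = path + ".tmp"
            with open(tmp_path, "wb") as tmp:
                tmp.write(data)
            os.replace(tmp_path, path)
            return "recreated"
        os.ftruncate(fd, 0)
        os.write(fd, data)
        return "updated"
    finally:
        os.close(fd)  # closing the descriptor also releases the flock


if __name__ == "__main__":
    index_path = os.path.join(tempfile.mkdtemp(), "dovecot.index")
    print(write_index(index_path, b"v1"))
    reader = os.open(index_path, os.O_RDONLY)
    fcntl.flock(reader, fcntl.LOCK_SH)       # a reader holds its read lock
    print(write_index(index_path, b"v2"))    # the writer does not wait
    os.close(reader)
```

The rename is atomic, so a process opening the file sees either the old contents or the new ones, never a half-written file.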

Plugins
- Dovecot plugins can hook into almost anything and modify Dovecot's behavior
  - Access Control Lists
  - Quota
  - Full text search indexes
  - Reading gzip-compressed mboxes/maildir files
- Can add new IMAP commands (although enhancing existing commands could use more work)
- Implement new mail storage backends (virtual, SQL, IMAP proxying)

Mailbox Formats
- mbox
  - Oldest format, widely supported
  - One mailbox = one file
  - Slow to delete messages from the middle
- Maildir
  - One file = one message
  - Fast to delete messages
  - Slow(er) to read through all messages
- dbox
  - Dovecot's extensible and high-performance mailbox format

Dbox Mailbox Format
- Either one file = one message
  - Lockless reads
  - Main difference to Maildir: file name doesn't change
- Or one file = multiple messages
  - Some locking necessary for reads
  - A new file is created when the old one grows above a configured size (e.g. 2 MB) or when the file is older than n days (useful for incremental backups)
  - Changing the file size used changes read/delete performance
  - Not fully implemented yet
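The rule for starting a new multi-message file is simple enough to write down directly. In this Python sketch the names `DboxFile` and `needs_new_file` are hypothetical, and the default limits (2 MB, 1 day) are example values rather than Dovecot settings.

```python
from dataclasses import dataclass

DAY = 86400


@dataclass
class DboxFile:
    size: int        # current file size in bytes
    created: float   # creation time as a unix timestamp


def needs_new_file(current: DboxFile, now: float,
                   max_size: int = 2 * 1024 * 1024,
                   max_age_days: int = 1) -> bool:
    """True if the next message should go into a freshly created file."""
    too_big = current.size > max_size
    too_old = now - current.created > max_age_days * DAY
    return too_big or too_old
```

The age limit is what helps incremental backups: once a file is older than n days it is never appended to again, so a backup tool only needs to copy files newer than its last run.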

Dbox Mailbox Format
- Primary metadata storage is Dovecot's index files
  - Metadata is backed up to dbox files about once a day, so if indexes are lost, the flags aren't all lost with them
- Messages' metadata is extensible with arbitrary key=value pairs. This will be useful in the future:
  - Separating attachments to a single instance storage
  - Storing messages compressed
- Extremely easy and fast migration from Maildir
  - Compatibility mode: rename cur/ to dbox dir, move files in new/ and metadata files

Multi-Master Replication
- Necessary?
- Not possible to implement reliably with low-level replication because of IMAP unique message IDs (UIDs)
  - IMAP UIDs are increasing 32-bit numbers
  - Global sync required or conflicts will happen
  - Conflicts are always possible with M-M replication, but fixing them is not possible with low-level replication

Multi-Master Replication Goals
- Synchronous operation: never lose even a single mail (if 1..n replicas die)
- Performance should be good in an all-active multi-master setup
- Desynchronizations happen: fix them and conflicts caused by syncing automatically and efficiently

Multi-Master Replication
- Saving mails (the most critical part)
- Expunging mails
- Updating message flags/keywords
- CONDSTORE extension: updating modification sequences (modseqs)
  - Per-message modseq increases on every change (flags etc.)
  - Modseqs are also very useful for replication
- Mailbox creates, renames, deletes, etc.
- Two very different kinds of data: potentially huge message bodies vs. small metadata
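Why modseqs help replication can be seen in a small Python model: every change stamps the message with a new, higher mailbox-wide counter value, so "what changed since my last sync?" becomes a comparison against one remembered number. The class and method names are hypothetical; only the counter behaviour follows CONDSTORE.

```python
from typing import Dict, List, Set


class Mailbox:
    def __init__(self) -> None:
        self.highest_modseq = 0
        self.flags: Dict[int, Set[str]] = {}
        self.modseq: Dict[int, int] = {}   # per-message modification sequence

    def _touch(self, uid: int) -> None:
        """Every change gives the message a new, strictly higher modseq."""
        self.highest_modseq += 1
        self.modseq[uid] = self.highest_modseq

    def save(self, uid: int) -> None:
        self.flags[uid] = set()
        self._touch(uid)

    def add_flag(self, uid: int, flag: str) -> None:
        self.flags[uid].add(flag)
        self._touch(uid)

    def changed_since(self, modseq: int) -> List[int]:
        """UIDs of messages modified after the given modseq."""
        return sorted(u for u, m in self.modseq.items() if m > modseq)
```

A replica that remembers the highest modseq it has seen can ask for `changed_since(that_value)` and receive only the messages whose metadata it is missing.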

Replication Parts
- 3 mostly separate parts:
  1. Incremental mailbox sync
  2. Fixing a (large) mailbox desync
  3. Syncing mailbox list (mailbox creates, deletes, renames)
- Implemented in different stages (1-3)
- Incremental mailbox sync is the most difficult to get working correctly and fast

Replication Master
- IMAP UIDs must be globally growing -> UIDs can be allocated only by a "mailbox master" server
- Master may move between servers (and at least initially it always will if a server wants to save a new mail)
- Master can also handle the CONDSTORE extension's STORE UNCHANGEDSINCE
- If the network dies between two servers, both may allocate the same UID -> a UID conflict that must be fixed later when the servers see each other again
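The UID conflict described in the last point is easy to reproduce in a toy Python model: two replicas that cannot reach each other both hand out the next UID. The sketch below only detects such conflicts; how they get fixed is not covered by the slide. `Replica`, `find_uid_conflicts` and the message identifiers are hypothetical.

```python
from typing import Dict, List


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.next_uid = 1
        self.mails: Dict[int, str] = {}   # UID -> some message identifier

    def save(self, message_id: str) -> int:
        """Allocate the next UID locally, as a partitioned server would."""
        uid = self.next_uid
        self.mails[uid] = message_id
        self.next_uid += 1
        return uid


def find_uid_conflicts(a: Replica, b: Replica) -> List[int]:
    """UIDs that both replicas assigned, but to different messages."""
    shared = a.mails.keys() & b.mails.keys()
    return sorted(uid for uid in shared if a.mails[uid] != b.mails[uid])
```

With a single mailbox master allocating UIDs, the two `save` calls would have gone through one counter and received different UIDs, so the conflict list stays empty.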

Replication Processes
- Simpler to have separate processes for separate tasks
- Better security: less code that has write access to users' mailboxes
- Worker processes for tasks that may wait on locks, so work still continues elsewhere