1 SIP Server Scalability
IRT Internal Seminar
Kundan Singh, Henning Schulzrinne and Jonathan Lennox
May 10, 2005
2 Agenda
- Why do we need scalability?
- Scaling the server: SIP express router (iptel.org), sipd (Columbia University); threads/processes/events
- Scaling using load sharing: DNS-based, identifier-based, two-stage architecture
- Conclusions
(27 slides)
3 Internet telephony (SIP: Session Initiation Protocol)
(Diagram: alice@yahoo.com calls bob@example.com. Bob's REGISTER stores his contact 192.1.2.4 in the example.com DB; Alice's INVITE from 129.1.2.3 is routed via DNS to the example.com proxy, which looks Bob up in the DB.)
4 Scalability Requirements
Depends on the role in the network architecture:
- Edge ISP server: 10,000 customers
- Enterprise server: 1,000 customers
- Carrier (3G): 10 million customers
(Diagram: PSTN and IP phones reaching the IP network through PSTN gateways (GW, MG), SIP/PSTN and SIP/MGC carrier networks, an ISP, a cybercafe, and a PBX over T1 PRI/BRI trunks.)
5 Scalability Requirements
Depends on the traffic type:
- Registration (uniform arrival): authentication, mobile users
- Call routing (Poisson arrival): stateful vs. stateless proxy, redirect, programmable scripts
- Beyond telephony (arrival pattern unknown): instant messaging, presence (including sensors), device control
- Stateful calls (Poisson arrival, exponential call duration): firewall, conference, voicemail
- Transport type: UDP/TCP/TLS (cost of security)
6 SIPstone: SIP server performance metrics
- Steady-state rate for successful registration, forwarding, and unsuccessful call attempts, measured using 15-minute test runs
- Measure: #requests/s under a given delay constraint
- Performance = f(#users, #DNS, UDP/TCP, g(request), L), where g = request type and arrival pdf (#requests/s) and L = logging on/off
- Test cases: register, outbound proxy, redirect, proxy480, proxy200
- Parameters: measurement interval, transaction response time, RPS (registrations/s), CPS (calls/s), transaction failure probability < 5%; delay budget: R1 < 500 ms, R2 < 2000 ms
- Shortcomings: does not consider forking, scripting, Via headers, packet size, different call rates, or SSL; is there a linear combination of results?
- Whitebox measurements: turnaround time
- Extends to SIMPLEstone
(Diagram: a loader and handler exercise the server with REGISTER and INVITE / 100 Trying / 180 Ringing / 200 OK / ACK / BYE against an SQL database; R1 and R2 are the measured response times.)
7 SIP server: what happens inside a proxy?
(Flowchart: recvfrom or accept/recv -> parse -> request or response. A request is matched against the transaction table; a REGISTER updates the registration DB, other requests trigger a DB lookup and are then redirected/rejected (build response), proxied (modify request, DNS lookup), or handled by the stateless path. A response is matched against the transaction table and modified before forwarding. Output via sendto, send, or sendmsg. Stages are marked as (blocking) I/O, critical section (lock), or critical section (r/w lock). A minimal loop is sketched below.)
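A minimal sketch of this receive-parse-dispatch loop; all names (sip_msg, parse_msg, handle_*) are illustrative stand-ins, not the actual sipd or SER APIs:

    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    enum msg_type { MSG_REQUEST, MSG_RESPONSE };
    struct sip_msg { enum msg_type type; char method[16]; };

    /* Stubs standing in for the real parser and handlers. */
    static int parse_msg(const char *buf, size_t len, struct sip_msg *m)
    { (void)buf; (void)len; m->type = MSG_REQUEST; strcpy(m->method, "INVITE"); return 0; }
    static void handle_register(struct sip_msg *m) { (void)m; /* update location DB */ }
    static void handle_request(struct sip_msg *m)  { (void)m; /* lookup DB; proxy, redirect, or reject */ }
    static void handle_response(struct sip_msg *m) { (void)m; /* match transaction; modify and forward */ }

    void proxy_loop(int sock)
    {
        char buf[4096];
        struct sockaddr_storage from;
        struct sip_msg msg;

        for (;;) {
            socklen_t fromlen = sizeof(from);
            ssize_t n = recvfrom(sock, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n <= 0)
                continue;                      /* transient error: keep serving */
            if (parse_msg(buf, (size_t)n, &msg) < 0)
                continue;                      /* drop unparsable messages */
            if (msg.type == MSG_RESPONSE)
                handle_response(&msg);
            else if (strcmp(msg.method, "REGISTER") == 0)
                handle_register(&msg);         /* REGISTER path: update DB */
            else
                handle_request(&msg);          /* proxy / redirect / stateless */
        }
    }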
8 Lessons Learnt (sipd): in-memory database
- Call routing involves one or more contact lookups, at roughly 10 ms per SQL query
- Cache (FastSQL): loading the entire database is easy; refresh it periodically; lookups then take < 1 ms [2002:Narayanan] (sketched below)
- Potentially useful for DNS lookups as well
- The SQL database stays authoritative behind the cache; configuration via a web interface
(Chart: turnaround time vs. RPS on a single-CPU Sun Ultra 10.)
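A sketch of the periodic-refresh cache idea, not sipd's actual code; load_all_contacts() and find_contact() are hypothetical stand-ins for the real queries:

    #include <pthread.h>
    #include <unistd.h>

    struct contact;                                  /* opaque contact record */
    extern struct contact *load_all_contacts(void);  /* full-table SQL load */
    extern void free_contacts(struct contact *tbl);
    extern struct contact *find_contact(struct contact *tbl, const char *user);

    static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;
    static struct contact *cache;                    /* in-memory snapshot */

    static void *refresher(void *arg)
    {
        (void)arg;
        for (;;) {
            struct contact *fresh = load_all_contacts();  /* amortizes the 10 ms queries */
            pthread_rwlock_wrlock(&cache_lock);
            struct contact *old = cache;
            cache = fresh;                           /* swap in the new snapshot */
            pthread_rwlock_unlock(&cache_lock);
            free_contacts(old);                      /* real code must copy or refcount
                                                        records still held by callers */
            sleep(60);                               /* periodic refresh */
        }
        return NULL;
    }

    struct contact *route_lookup(const char *user)
    {
        pthread_rwlock_rdlock(&cache_lock);          /* memory only: < 1 ms */
        struct contact *c = find_contact(cache, user);
        pthread_rwlock_unlock(&cache_lock);
        return c;
    }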
9 Lessons Learnt (sipd): thread-per-request does not scale
- One thread per message does not scale: too many threads are created over a short timescale, and each request holds (blocks) a thread
- Stateless: 2-4 threads per transaction; stateful: 30 s holding time
- Thread pool + queue: less thread overhead, more useful processing (sketched below)
- Pre-fork processes for SIP-CGI
- Overload management: fail gracefully, drop requests in preference to responses; still not enough if the holding time is high
(Diagram: incoming requests R1-R4 under thread-per-request vs. a fixed number of threads; throughput vs. load for a thread pool with overload control.)
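A minimal sketch of the thread-pool + bounded-queue model with overload control, under assumed names (process() stands in for the real per-message work). When the queue is full, enqueue() refuses the message so the listener can drop it, preferring to drop requests rather than responses:

    #include <pthread.h>

    #define QSIZE    1024      /* bounded queue: overload => drop, not grow */
    #define NWORKERS 8         /* fixed pool: no per-request thread creation */

    static void *queue[QSIZE];
    static int head, tail, count;
    static pthread_mutex_t qlock    = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

    extern void process(void *msg);          /* hypothetical message handler */

    int enqueue(void *msg)                   /* called by the socket listener */
    {
        int accepted = 0;
        pthread_mutex_lock(&qlock);
        if (count < QSIZE) {                 /* overload control: full => refuse */
            queue[tail] = msg;
            tail = (tail + 1) % QSIZE;
            count++;
            accepted = 1;
            pthread_cond_signal(&nonempty);
        }
        pthread_mutex_unlock(&qlock);
        return accepted;                     /* 0 => listener drops the message */
    }

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&qlock);
            while (count == 0)
                pthread_cond_wait(&nonempty, &qlock);
            void *msg = queue[head];
            head = (head + 1) % QSIZE;
            count--;
            pthread_mutex_unlock(&qlock);
            process(msg);                    /* threads are reused across messages */
        }
        return NULL;
    }

    void start_pool(void)
    {
        pthread_t t;
        for (int i = 0; i < NWORKERS; i++)
            pthread_create(&t, NULL, worker, NULL);
    }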
10 What is the best architecture?
- Event-based: reactive system
- Process pool: each pool process receives a message and processes it to completion (SER)
- Thread pool, three variants:
  1. a listener receives and hands over to a pool thread (sipd)
  2. each pool thread receives and processes to completion
  3. staged event-driven: each stage has its own thread pool
(Same proxy flowchart as slide 7.)
11 Stateless proxy: UDP, no DNS, six messages per call
(Same proxy flowchart as slide 7.)
12 Stateless proxy: UDP, no DNS, six messages per call (CPS)

Architecture   1x Pentium IV 3GHz,  4x Pentium 450MHz,    1x UltraSPARC-IIi,      2x UltraSPARC-II,
/ Hardware     1GB, Linux 2.4.20    512MB, Linux 2.4.20   300MHz, 64MB, Solaris   300MHz, 256MB, Solaris
Event-based    1650                 370                   150                     190
Thread/msg     1400                 TBD                   100                     TBD
Thread-pool1   1450                 600 (?)               110                     220 (?)
Thread-pool2   1600                 1150 (?)              152                     TBD
Process-pool   1700                 1400                  160                     350
13 Stateful proxy: UDP, no DNS, eight messages per call
- Event-based: a single thread acting as socket listener plus scheduler/timer
- Thread-per-message: pool_schedule => pthread_create
- Thread-pool1 (sipd): listener hands each message to a pool thread
- Thread-pool2: N event-based threads, each handling a fixed subset of requests via hash(Call-ID); receive and hand over to the owning thread (sketched below); running poll() in multiple threads behaves badly on multi-CPU machines
- Process pool: not finished yet
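A sketch of the thread-pool2 dispatch under assumed names (queue_to_worker() is hypothetical): the listener hashes the Call-ID so that every message of a transaction lands on the same event-based worker, and per-transaction state needs no shared lock:

    #define NTHREADS 4                       /* N event-based worker threads */

    extern void queue_to_worker(int worker, void *msg);  /* per-worker queue */

    static unsigned hash_callid(const char *callid)
    {
        unsigned h = 5381;                   /* djb2 string hash */
        while (*callid)
            h = h * 33 + (unsigned char)*callid++;
        return h;
    }

    void dispatch(void *msg, const char *callid)
    {
        /* Same Call-ID => same worker => no lock on transaction state. */
        queue_to_worker(hash_callid(callid) % NTHREADS, msg);
    }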
14 Stateful proxy: UDP, no DNS, eight messages per call (CPS)

Architecture   1x Pentium IV 3GHz,  4x Pentium 450MHz,    1x UltraSPARC-IIi,        2x UltraSPARC-II,
/ Hardware     1GB, Linux 2.4.20    512MB, Linux 2.4.20   360MHz, 256MB, Sol. 5.9   300MHz, 256MB, Sol. 5.8
Event-based    1200                 300                   160                       -
Thread/msg     650                  175                   90                        120
Thread-pool1   950                  340 (p=4)             120                       120 (p=4)
Thread-pool2   1100                 500 (p=4)             155                       200 (p=4)
Process-pool   -                    -                     -                         -
15 Lessons Learnt: what is the best architecture?
Stateless:
- CPU is the bottleneck; memory use is constant
- Process pool is the best
- Event-based is not good for multi-CPU; thread/msg and thread-pool1 are similar; thread-pool2 comes close to process-pool
Stateful:
- Memory can become the bottleneck
- Thread-pool2 is good, but does not reach an N-times speedup on N CPUs
- Process pool may be better (?)
16 Lessons Learnt (sipd): avoid blocking function calls
- DNS: 10-25 ms per lookup (29 queries observed); caching raised throughput from 110 to 900 CPS; internal non-blocking resolver vs. external blocking one
- Logger: lazy logging in a separate thread: while (1) { lock; write all; unlock; sleep; }
- Date formatter: strftime() took about 10% of REGISTER processing; update a cached date string once per second instead (sketched below)
- random32(): cache the gethostid() result (about 37 µs per call)
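A sketch of the date-formatter fix (the exact format string is an assumption): format the date at most once per second and reuse the cached string on every message in between:

    #include <time.h>

    /* Single-threaded sketch; the real server must guard the statics with a
     * lock or keep one copy per thread. */
    const char *cached_date(void)
    {
        static char   buf[64];
        static time_t last = 0;
        time_t now = time(NULL);
        if (now != last) {                   /* at most one strftime() per second */
            struct tm tm;
            gmtime_r(&now, &tm);
            strftime(buf, sizeof(buf), "%a, %d %b %Y %H:%M:%S GMT", &tm);
            last = now;
        }
        return buf;
    }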
17 Lessons Learnt (sipd): resource management
Socket management:
- Problems: OS descriptor limit (1024), "liveness" detection, retransmission
- One socket per transaction does not scale
- A global socket works while the downstream server is alive (soft state); this works for UDP but is hard for TCP/TLS, so apply connection reuse there
- Socket buffer size: 64 KB to 128 KB; trade-off: memory per socket vs. number of sockets
Memory management:
- Problems: too many malloc/free calls, leaks
- Memory pool: transaction-specific memory, freed once; also less memcpy (sketched below)
- About 30% performance gain: stateful 650 to 800 CPS, stateless 900 to 1200 CPS

Stateless processing time (µs) per message type:

Message           INV   180   200   ACK   BYE   200   REG   200
W/o mempool       155    67    67    95   139    62   237    70
W/ mempool        111    49    48    64   106    41   202    48
Improvement (%)    28    27    28    33    24    34    15    31
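A sketch of the transaction-scoped memory-pool (arena) idea, not sipd's actual allocator: all allocations for one transaction come from a private arena, and the whole arena is freed once when the transaction completes, replacing many malloc/free pairs:

    #include <stdlib.h>

    struct pool { char *next, *end; };

    struct pool *pool_create(size_t size)
    {
        struct pool *p = malloc(sizeof(*p) + size);
        if (!p)
            return NULL;
        p->next = (char *)(p + 1);           /* arena follows the header */
        p->end  = p->next + size;
        return p;
    }

    void *pool_alloc(struct pool *p, size_t n)
    {
        n = (n + 7) & ~(size_t)7;            /* keep 8-byte alignment */
        if (p->next + n > p->end)
            return NULL;                     /* sketch: no chaining of extra blocks */
        void *mem = p->next;
        p->next += n;
        return mem;
    }

    void pool_destroy(struct pool *p)        /* "free once" per transaction */
    {
        free(p);
    }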
18 Lessons Learnt (SER): optimizations
- Reduce copying and string operations: data lumps, counted strings (+5-10%)
- Reduce URI comparison to local: user part as a keyword, use r2 parameters
- Parser: lazy parsing (2-6x), incremental parsing; 32-bit header-name parser (2-3.5x) using padding for alignment, fast for the general (canonicalized) case
- Case compare: hash table, sixth-bit trick (sketched below)
- Database: the cache is divided into domains for locking
[2003: Jan Janak, "SIP proxy server effectiveness", Master's thesis, Czech Technical University]
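The "sixth bit" trick relies on ASCII upper- and lower-case letters differing only in bit 0x20. A sketch of a case-insensitive token compare built on it, safe when the tokens are known to be alphabetic, as header names are:

    #include <stddef.h>

    /* Compare n bytes ignoring ASCII case by forcing bit 0x20 on both sides.
     * Note: this also equates some punctuation pairs (e.g. '@' and '`'), so
     * use it only on tokens known to be letters. */
    int token_casecmp(const char *a, const char *b, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if ((a[i] | 0x20) != (b[i] | 0x20))
                return 1;                    /* differ */
        return 0;                            /* equal, ignoring case */
    }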
19 Lessons Learnt (SER): protocol bottlenecks and other scalability concerns
Protocol bottlenecks:
- Parsing: order of headers, host names vs. IP addresses, line folding, scattered headers (Via, Route)
- Authentication: reuse credentials in subsequent requests
- TCP: message length unknown until Content-Length is parsed
Other scalability concerns:
- Configuration: broken digest clients, wrong passwords, wrong expires values
- Overuse of features: use stateless mode instead of stateful if possible, record-route only when needed, avoid an outbound proxy if possible
20 Load Sharing: distribute load among multiple servers
- A single server has a maximum capacity limit
- Multiple servers: DNS-based, identifier-based, network address translation, same IP address
21 Load Sharing (DNS-based): redundant proxies and databases
- REGISTER: write to both D1 and D2
- INVITE: read from D1 or D2
- Database write/synchronization traffic becomes the bottleneck
(Diagram: proxies P1-P3 in front of replicated databases D1 and D2.)
22 Load Sharing (identifier-based): divide the user space
- Proxy and database on the same host: e.g., D1 serves a-h, D2 serves i-q, D3 serves r-z
- The first-stage proxy may get overloaded: use many of them
- Hashing: static vs. dynamic (sketched below)
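A sketch of the static variant (the server names are assumptions): the first-stage proxy hashes the user part of the Request-URI to pick the group that owns that slice of the user space; the a-h / i-q / r-z split above is the first-letter special case of this:

    #define NGROUPS 3

    /* Hypothetical second-stage group addresses. */
    static const char *group_host[NGROUPS] = {
        "a.example.com", "b.example.com", "c.example.com"
    };

    const char *route_user(const char *user)
    {
        unsigned h = 0;
        for (; *user && *user != '@'; user++)   /* hash the user part only */
            h = h * 31 + (unsigned char)*user;
        return group_host[h % NGROUPS];         /* static partition */
    }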
23 Load Sharing: comparison of the two designs
Let D = number of database servers, N = number of writes (REGISTERs), r = #reads/#writes = (INV+REG)/REG, T = write latency, and t = read latency / write latency. Total time per database server:

- DNS-based (replicated): ((t*r/D) + 1)*T*N = A/D + B
- Identifier-based (partitioned): ((t*r + 1)/D)*T*N = A/D + B/D

where A = t*r*T*N is the total read time and B = T*N the total write time. Partitioning trades lower reliability for higher scale; a worked example follows.
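As a worked example with assumed numbers: take r = 2 reads per write, t = 0.5, and D = 2 databases. The replicated (DNS-based) design then loads each database with ((0.5*2)/2 + 1)*T*N = 1.5*T*N, while the partitioned (identifier-based) design loads each with ((0.5*2 + 1)/2)*T*N = 1.0*T*N, since partitioning divides the writes as well as the reads; the price is that a failed partition takes its slice of users offline.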
24 Scalability (and Reliability): two-stage architecture for CINEMA
First stage: stateless servers s1, s2, s3 (with backup ex.backup.com). Second stage: master/slave clusters a1/a2 for a*@example.com and b1/b2 for b*@example.com, so sip:bob@example.com is proxied to sip:bob@b.example.com.

example.com    _sip._udp  SRV 0 40 s1.example.com
                          SRV 0 40 s2.example.com
                          SRV 0 20 s3.example.com
                          SRV 1 0  ex.backup.com
a.example.com  _sip._udp  SRV 0 0  a1.example.com
                          SRV 1 0  a2.example.com
b.example.com  _sip._udp  SRV 0 0  b1.example.com
                          SRV 1 0  b2.example.com

Request rate = f(#stateless servers, #groups). Bottleneck: CPU, memory, or bandwidth?
25 Load Sharing Result (UDP, stateless, no DNS, no mempool)

S (stage-1 stateless)  P (stage-2 proxies)  CPS
3                      3                    2800
2                      3                    2100
2                      2                    1800
1                      2                    1050
0                      1                     900
26 Lessons Learnt: load sharing
- Non-uniform distribution happens: identifier distribution (bad hash function) and call distribution => adjust dynamically
- Stateless proxy: S=1050, P=900 CPS; an S3P3 configuration => 10 million BHCA (busy hour call attempts)
- Stateful proxy: S=800, P=650 CPS
- Registration (no auth): S=2500, P=2400 RPS; an S3P3 configuration => 10 million subscribers (1-hour refresh)
- Memory pool and thread-pool2/event-based further increase capacity (approx. 1.8x)
27 Conclusions and future work
Server scalability: non-blocking calls, process/event/thread architecture, resource management, optimizations.
Load sharing: DNS-based, identifier-based, two-stage architecture.
Current and future work:
- Measure process-pool performance for the stateful proxy
- Optimize sipd: use thread-pool2/event-based (?); memory: use counted strings, clean up after the 200 response (?); CPU: use hash tables
- Presence, call-stateful, and TLS performance (Vishal and Eilon)
28 Backup slides
29 Telephone scalability (PSTN: Public Switched Telephone Network)
- Local telephone switch (class 5 switch, SSP): 10,000 customers, 20,000 calls/hour
- Database (SCP) for freephone, calling card, etc.: 10 million customers, 2 million lookups/hour
- Signaling router (STP) on the SS7 signaling network: 1 million customers, 1.5 million calls/hour
- Regional telephone switch (class 4 switch): 100,000 customers, 150,000 calls/hour
(The "bearer" network carries the voice; signaling travels over the SS7 network.)
30 SIP server: comparison with an HTTP server
- Signaling-bound (vs. data-bound): no file I/O (exceptions: scripts, logging); no caching, since DB read and write frequencies are comparable
- Transactions: stateful wait for responses; depends on external entities (DNS, SQL database)
- Transport: UDP in addition to TCP/TLS
- Goals: carrier-class scaling on commodity hardware; try not to customize/recompile the OS or implement (parts of) the server in the kernel (khttpd, AFPA)
31 Related work: scalability for (web) servers
- Existing work: connection dispatchers, content/session-based redirection, DNS-based load sharing
- HTTP vs. SIP: SIP uses UDP as well as TCP, signaling is not bandwidth-intensive, responses are not cached, and the DB read/write ratio is comparable
- SIP scalability bottlenecks: signaling (chapter 4), real-time media data, gateways
- Load-shedding options: 302 redirect to a less loaded server, REFER the session to another location, signal upstream to reduce load
32 Related work: 3GPP (release 5) IP Multimedia core network Subsystem uses SIP
- Proxy-CSCF (call session control function): first contact in the visited network; 911 lookup; dial plan
- Interrogating-CSCF: first contact in the operator's network; locates the S-CSCF on REGISTER
- Serving-CSCF: user policy and privileges, session control service, registrar
- Connection to the PSTN: MGCF and MGW
33 Server-based vs. peer-to-peer
- Reliability, failover latency: server-based is DNS-based and depends on client retry timeout, DB replication latency, and registration refresh interval; peer-to-peer uses DHT self-organization plus periodic registration refresh and depends on client timeout and registration refresh interval
- Scalability, number of users: server-based depends on the number of servers in the two stages; peer-to-peer depends on refresh rate, join/leave rate, and uptime
- Call setup latency: one or two steps vs. O(log N) steps
- Security: TLS, digest authentication, S/MIME; peer-to-peer additionally needs a reputation system and ways of working around spy nodes
- Maintenance, configuration: administrator-managed (DNS, database, middle-boxes) vs. automatic (one-time bootstrap node addresses)
- PSTN interoperability: gateways, TRIP, ENUM; peer-to-peer interacts with server-based infrastructure or co-locates a peer node with the gateway
34 Comparison of sipd and SER
- sipd: thread pool, events (reactive system), memory pool; Pentium IV 3 GHz, 1 GB => 1200 CPS, 2400 RPS (no auth)
- SER: process pool, custom memory management; Pentium III 850 MHz, 512 MB => 2000 CPS, 1800 RPS