Presentation is loading. Please wait.

Presentation is loading. Please wait.

VoIP Testing—A How-to Session for Performance and Functional Test Methodologies Chris Bajorek, Director, CT Labs.

Similar presentations


Presentation on theme: "VoIP Testing—A How-to Session for Performance and Functional Test Methodologies Chris Bajorek, Director, CT Labs."— Presentation transcript:

1

2

3 VoIP Testing—A How-to Session for Performance and Functional Test Methodologies Chris Bajorek, Director, CT Labs

4 3 Before We Start Every year we consult with many companies, helping them to perform many different types of VoIP- oriented tests This provides a unique industry perspective on the market readiness of a wide range of VoIP products I’m pleased to have this opportunity to share our test experiences with you today

5 4 VoIP Products by Market Area Residential (Voice over Broadband) –Analog terminal adapters, VoIP softphones, residential routers Enterprise –IP PBXs, IP Contact Centers, VoIP phones & softphones, firewalls/ALGs, intrusion prevention devices, media servers (conferencing, voice mail, IVR) Next-Gen Network Carriers and Service Providers –Session border controllers, softswitches, media servers, proxys, media gateways, VQ enhancement processors

6 5 Building VoIP Networks: IMS is here and it needs testing. Key elements of IMS: –Enables innovative new applications –High levels of network complexity –Modules from multiple vendors must peacefully coexist –High rate of carrier adoption –Global deployments –Standards based –Exploits strengths of IP+SIP

7 6 IMS Basics: Functions by Layer Services / Application Layer –Application servers, Media servers Control / Switching Layer –HSS (Home Subscriber Server) –CSCF (Call Session Control Function) –BGCF (Breakout Gateway Control Function) –MGCF (Media Gateway Control Function) –MRFC (Media Resource Function Control) –etc. Transport / Access Layer –IP/MPLS, PSTN/PLMN, Cellular, SONET/SDH, ATM, Satellite

8 7 Risks of Inadequate Testing From the CT Labs VoIP project files: –VoIP terminal adapters that act unreliable and “emulate” an occasional bad Internet connection –IP PBX’s that drop calls when subjected to only certain types of call loads –VoIP soft clients that distort the caller audio –High-end enterprise firewalls that grind to a standstill under certain denial-of-service attacks –Session border controllers that degrade voice quality at traffic levels below rated maximums

9 8 Test Automation- Reaping the Benefits of Shorter Test Cycles Test Automation- Why you should consider using it sooner, not later

10 9 Test Automation—The Benefits –Tightly controlled test environment All aspects of the test setup can be controlled and coordinated by testing scripts –Repeatable results Key to resolving issues that arise during testing Includes ability to exactly reproduce product settings and test conditions –Faster test execution Weeks of manual testing can literally be executed in Hours –Increased accuracy of results reporting –All of the above resulting in: Lower testing costs over product’s lifetime Greater product and delivered-service reliability Fewer field failures, fewer customer-reported issues

11 10 Challenges Using Live Callers in Tests –The exact timing and sequence of caller actions is not synchronized or repeatable –Ability to distinguish and describe nuances of results varies widely from person to person i.e. reliability of reported results can be low –Ability to correlate assessment of voice quality and anomalies across multiple listeners is typically poor Unless you just happen to know how to run ITU-T P.800 MOS tests –Call arrival profiles difficult to control when using large numbers of callers for “load tests” –In other words, don’t expect more than coarse results

12 11 Conference test, via automation Conf #1 Conf #3 Conf #2 2 4 3 23 4 76 5 3 2 H H H Ready Then go thru a defined sequence of talker / listener subtests while under real call loads… …while monitoring actual voice quality for each listener

13 12 Automation-based VoIP Testing Goals 1.Verify call-handling performance 2.Verify voice quality –With a wide variety of caller and noise environments 3.Verify performance under real-world traffic and network impairment conditions 4.Verify performance under malicious attack conditions 5.Verify service reliability –i.e. Availability of service under extended test run durations 6.Verify interoperability and feature interaction 7.Verify quality of access to enhanced services –Applications such as voice mail, conferencing, IVR, etc.

14 13 Real-world automation testing: The 3-phase approach Phase 1: Test with minimal stress in a sterile environment –i.e. no WAN impairments or network traffic, light call loads –This establishes an important performance baseline Phase 2: Test with realistic network traffic and call load conditions Phase 3: Test to rated device call loads

15 14 “Rules of Thumb” that simply do not work “I tested it with 50 calls and the CPU only went to 25%, so we know the device can scale to 200 calls” –Not quite. Our experience shows that in fact most VoIP devices exhibit performance thresholding effects that are not linear and very hard to predict. In other words, after a certain load or capacity limit is reached the device can fail catastrophically. –If you don’t test to full rated capacity, you are playing Russian Roulette with your customers.

16 15 “Rules of Thumb” that simply do not work “We don’t need to test voice quality because we are OEM’ing the software that does that part.” –Dangerous assumption. OEM software typically has many interface points and configuration options and is hardly in and of itself a guarantee of performance. The “glue” code around these objects can still cause voice quality issues.

17 16 Emulation of Network Impairments Perfectly clean networks are not the real world Real networks corrupt the flow of packets in the following time-varying ways: –Packet loss (especially burst loss), packet duplication, and out-of-order packets –Latency and jitter –Restricted bandwidth If you test while inducing these conditions, your product or service will be the cause of far fewer post-deployment issues You can perform both static and dynamic emulation of impairment conditions –Both have value depending on nature of the VoIP device –e.g. IP phone that renegotiates codec type or codec mode when network degrades in mid-call

18 17 Emulation of Network Impairments Firewall / SBC Network Emulator TDM Call Generator VoIP Call Generator DUT The Network Emulator can be controlled via test script for WAN condition changes dynamic

19 18 Adding “Internet Mix” Network Traffic The goal: see the DUT’s impact on VoIP calls when subjected to network traffic at rated capacity Product examples: –Firewalls, intrusion prevention devices, IP phones with integrated switch ports, session border controllers, etc What we do: Generate real session-based “Internet Mix” traffic and measure throughput performance of VoIP calls and IMIX traffic –e.g. http, ftp, P2P, SMTP, POP3, etc –Open source tool: “D-ITG” http://www.grid.unina.it/software/ITG/ http://www.grid.unina.it/software/ITG/ –Notable vendor: Shenick (www.shenick.com)

20 19 Voice Quality Assessment- Automated Testing Techniques Voice and Video Quality Assessment- Automated Testing Techniques

21 20 Voice Quality Test Techniques Automated VQ measurement techniques are designed to estimate the way humans perceive voice quality –MOS live listener tests done per ITU-T P.800 Active versus Passive VQ monitoring –Passive E-Model via packet inspection –Active end-to-end VQ measurement “to the audio wires” –Both techniques have their benefits 1101001 Active end-to-end PESQ VQ Passive E-model VQ

22 21 Active vs Passive VQ Testing Active voice quality testing –Involves evaluation of “received” audio signals as compared to known references i.e. you drive real 2-way calls through the VoIP network –PESQ P.862 (2001) High correlation with standard MOS-LQ subjective tests –Benefits: More accurate, uses mature standards (PESQ) for automated quality assessment –Negatives: Consumes VoIP network resources

23 22 Active vs Passive VQ Testing Passive voice quality testing –Involves passive evaluation of call-based packet flows ITU-T G.107 E-Model Can return estimated MOS-LQ and MOS-CQ scores (Listening versus Conversational) –Benefits: Can be embedded into products and test equipment with relatively low resource footprint –Negatives: Ignores (or models) VoIP endpoint-specific behaviors to network conditions. Vendor implementations can vary.

24 23 How PESQ works Computes a voice quality score by comparing degraded received audio with a “reference” speech prompt –Reference prompts are actual speech clips played during an active test call –Quality scores relate only to the time during the test call when the reference prompts were played with far-end audio being captured –The calculation is not just comparing the reference and degraded waveforms, it is using a human perceptual model to ultimately compute a quality score (1=bad to 4.5=excellent) 60 second test call 8 second reference prompt

25 24 What PESQ VQ Testing is designed for PESQ is a way to quickly and cost-effectively estimate the effects of one-way speech distortion and noise on speech quality PESQ is “endpoint-agnostic” – can be used for VoIP- to-VoIP, VoIP-to-PSTN calls, etc. Strengths –Provides excellent estimate of voice quality –Tests can be performed quickly –Tests are very repeatable

26 25 Passive versus Active VQ : A Real Example From actual CT Labs project In this example, the phone had quality issues that the passive test did not see Being aware of the difference in scoring techniques is critical when debugging reported VQ issues PESQ E-Model

27 26 Video Quality Test Techniques Automated Video quality measurement techniques estimate the way humans perceive picture quality –Live viewer tests done per ITU-T BT.500 Three classes of objective video quality algorithms –Full reference, partial reference, and zero reference Full reference techniques –PSNR (most used), VIM, SSIM. See ITU-T J.144. –Compute intensive, not useful for real time measurements –Software suite available at: http://www.compression.ru Zero reference techniques –Best suited for in-service monitoring –Standards activity continues –Encompasses quality tests for picture, audio, multimedia, and network’s ability to carry streams.

28 27 Load and Stress Testing

29 28 Load and Stress Testing What it is –Verifying the DUT’s performance at rated call and traffic loads –Verify those “theoretical” specs on the data sheet How many simultaneous sessions? It’s all relative –A full load & stress test on a 2-line VoIP terminal adapter will require 2 simultaneous calls –A full load & stress test on a carrier-grade session border controller may support upwards of 150,000 simultaneous SIP calls with media (or more) –The key is this: if you want to be assured of acceptable performance at your spec sheet limits, you cannot linearly scale a partial load test’s results

30 29 Load / Stress Testing: Helpful Hints Use call rates and call ramp profiles that emulate the actual call environment e.g. burst, ramp, etc. Monitor and log DUT platform resources during test –CPU, memory, disk I/O, network I/O can all provide clues as to why a test failed –Capture a periodic snapshot into logfile for post-test run analysis: Windows Perfmon, Linux (various utilities) Synchronize system clocks on DUT and test equipment devices before a test run –Allows failure events to be correlated from logs

31 30 Load / Stress Testing: Pitfalls 1.Temptation is to do high volumes of “simple” calls –Problem with this: it will not exercise internal resources in real world way –Example: Conference bridge load test The wrong way: calls with simple 1-dimensional “can you hear me?” test The right way: multiple conferences of varying sizes with real talker-listener exchanges 2.Not running tests long enough 3.Not testing during DUT housekeeping periods 4.Leaving verbose DUT logging enabled can consume significant resources

32 31 Test Automation Setups Functional Testing

33 32 Functional Testing What it is –Verify that the DUT can execute all features and functions correctly (positive stimulus/response testing) –Verify that DUT responds properly to negative stimuli Very often ignored, to the detriment of product stability in field How many simultaneous sessions to test? –Depends on device: one or a few as required to verify all features –Quick examples of functional Application servers: Conferencing –Verify all host and listener TUI commands and DUT responses VoIP endpoint devices: Terminal Adapters (TAs) –Verify all call features against softswitch/feature server environments Question: Does verifying voice quality belong in a functional test?

34 33 Functional Testing: A Few Hints Test script synchronization with DUT is key –DTMF or MF handshaking Typically involves “tagging” voice prompts with numeric sequences –Speech recognition –Delays Automation-based functional tests allow: –Much Faster test cycles TA functional test plan comparison: –150 test cases verified against 4 different softswitch platforms –Good idea: functional test suite can be turned into a performance test suite If the tests are designed on a flexible call generator platform Can mix call traffic from functional and load generator platforms

35 34 Test Automation Setups

36 35 Session Border Controller/Firewall Automation Setup Goals Verify call-handling performance and advertised specifications at real-world high density VoIP loads Verify Voice Quality under different Codec, frame packing, and other configuration settings Verify call-handling performance when subjected to different call rate profiles e.g. Burst, ramp, etc. Verify thru-SBC registration performance under burst registration conditions Verify ability to survive and handle legitimate VoIP call loads while under various types of DoS “attacks” Verify long-term call handling reliability

37 36 SBC / Firewall Automation Test Setup Firewall / SBC DUT Network Emulator VoIP Call Generator SIP Attack Generator VoIP Call Generator SIP Proxy Protected Network Unprotected Network + voice quality opt. Registration Generator

38 37 Terminal Adapter (TA) Real-World Network Model VoIP PSTN/Analog

39 38 Automated Feature Test Suite Goals Automate as much of the Terminal Adapter interoperability feature regression test as possible –i.e. Verify call features of TA devices against core VoIP service architectures Support input configuration files, event and error log files Support multiple TA devices and PSTN access lines in setup

40 39 Automation Feature Test Solution Softswitch + Media Server, etc VoIP PSTN/Analog TA/ Voice Gateway Off-net PSTN Caller Access PSTN / PBX LAN/ WAN Gatewa y On-net Subscriber Access TA/ Voice Gateway Router / Edge Device Feature Test Framework TA/ Voice Gateway TDM Call Generator VoIP Call Generator

41 40 Automation Feature Test Framework Details Supports 140+ feature tests –Including 2-way calls, 3-way calls, features including hold/park/transfer, 911/411, voice mail, + voice quality checking Test run results captured in easily analyzed logs Custom reports are generated Individual test case scripts easily changed

42 41 Setting up for IMS Tests Emulation of IMS devices in a QA lab setting will be critical… unless you plan to purchase, support, and maintain a wide variety of third party IMS devices in your lab, a costly and time- consuming proposition. IMS Product Tests IMS Function Tests Application ServerApp Server, OSA- SCS, IM-SSF, SCIM, CCCF Gateway Controller, Call Agents BGCF, MGCF, I-BCF Border Elements, Gateways A-BGF, T-MGF, SGF, IWF, I-BGF Subscriber Databases HSS, SLF Media ServersMRFC, MRFP Policy and Resource Function PDF, NASS, RACS Session ControllersS-CSCF, I-CSCF, P-CSCF

43 42 Setting up for IMS Tests TDM Endpoint Emulation (Vo)IP Endpoint Emulation WAN Emulator Network Analyzer IP Network Mixed Endpoint Emulation SS7/TDM Network VoIP Client IMS Device Emulation TDM User IMS Device Under Test NGN Device

44 43 VoIP Security Testing— Issues to consider

45 44 VoIP Vulnerabilities/Threats The bad news: VoIP systems are vulnerable –Platforms are vulnerable –VoIP-specific attacks are becoming more common The good news: The threat is still developing –VoIP handsets are still in minority “out there” –Vast majority of VoIP is company-internal Courtesy: Mark Collier, CTO SecureLogix VoIP networks share the same vulnerabilities that plague data networks, PLUS some specific additional threats

46 45 VoIP Product Vulnerabilities Network Stack(s) (IP, UDP, TCP, RTP,…) VoIP Protocols Voice Applications Services (Database, Web Server) Telephony Devices Network Devices Servers Physical infrastructure (power, wiring) Toll Fraud SPIT Slammer worm SQL attacks Protocol Attacks SIP Floods RTP Floods OS attacks (Windows worms, Viruses) Physical Hacking Major area of focus at CT Labs SYN Floods, etc. (many…)

47 46 DoS Attack Testing Generate SIP-specific attacks (send “fuzzed” and other types of SIP protocol packet floods) while also sending legitimate SIP calls Measure call performance (dropped, blocked, delayed calls), voice quality with security measures in place Test calls sent with media (real speech) to verify true voice quality via PESQ while under attack BA Reference Prompt Received Prompt RTP Router / switch Device(s) Under Test Router / switch Injected SIP / RTP attacks VoIP Call Generator

48 47 SIP-Specific Attacks to Launch –i.e. in addition to lower-layer well known DoS attacks Blast packets from these scenarios at up to line rates: –Malformed and Torture Test floods Using SIP packets from open source Protos test suite –INVITE, REGISTER, and Response floods –Spoofed variations for above i.e. Spoofing the IP address and port of legitimate devices, or spoofing the Via or AoR of legitimate users –RTP attacks Rogue / Random RTP Fraud and Floods

49 48 SIP-Specific Attacks: What to expect Run each variation for 10-15 minutes –In the presence of varying levels of legitimate VoIP traffic –Monitoring DUT resources (CPU, memory), call completion rates, and voice quality of completed calls It’s typical to see threshold failure effects –i.e. above certain levels of legitimate SIP calls + attack packets, service takes a major hit. Below that threshold normal calls may be handled fine. DUT often shows weakness within seconds of test start DUT may exhibit hard or soft crashes Voice quality may show early warning of catastrophic failure

50 49 Good resources on VoIP Security NIST – National Institute of Standards and Technology –Publication 800-58: “Security Considerations for VoIP Systems” (99 pgs, free) –http://csrc.nist.gov/publications/nistpubs VoIPSA – Voice over IP Security Alliance –Promoting education & awareness, research, testing methodologies & tools –Extensive membership: vendors, VoIP providers, researchers, security vendors, test tool vendors –www.voipsa.org PROTOS group - University of Oulu in Finland –Using protocol fuzzing to discover a wide variety of DoS and buffer overflow vulnerabilities –Have exposed HTTP, LDAP, SNMP, WAP, and VoIP vulnerabilities –www.ee.oulu.fi/research/ouspg/protos/index.htmlwww.ee.oulu.fi/research/ouspg/protos/index.html Mu Security –Manufacturers of a powerful protocol mutation tester (Mu-4000) –www.musecurity.com

51 Feel free to call if you have any questions Chris Bajorek chris@ct-labs.com 916-577-2110 (direct line)


Download ppt "VoIP Testing—A How-to Session for Performance and Functional Test Methodologies Chris Bajorek, Director, CT Labs."

Similar presentations


Ads by Google