Luis Lopez lulop@kurento.org WebRTC infrastructures in the large (with experiences from real deployments) IIT RTC Conference & Expo October 2015 Luis Lopez.

Name: Luis Lopez lulop@kurento.org WebRTC infrastructures in the large (with experiences from real deployments) IIT RTC Conference & Expo October 2015 Luis Lopez.
Uploaded: 2017-10-14T16:07:22+00:00
Duration: PTM16S57
Channel: Imogen Warren
Description: Luis Lopez lulop@kurento.org WebRTC infrastructures in the large (with experiences from real deployments) IIT RTC Conference & Expo October 2015 Luis Lopez.

Speaker
Luis Lopez: coordinator of Kurento.org; software developer, software trainer, software learner, FOSS enthusiast.
Kurento.org: FOSS project; WebRTC media server, WebRTC media APIs, WebRTC cloud infrastructure.

WebRTC infrastructures
Peer-to-peer WebRTC application (without media infrastructure): the WebRTC video stream goes from peer to peer.
WebRTC application based on media infrastructure: the WebRTC video stream goes through the media infrastructure.

Function of WebRTC infrastructures
Processing (VP8, H.264)
Group communications
Archiving

WebRTC infrastructures in the large
From the hundreds to the millions: the scalability problem
Diagram label: WebRTC cloud

WebRTC cloud models
Layers: SaaS, APIaaS, PaaS, IaaS (computing resources)
Towards SaaS: simple development, high hourly costs, low flexibility
Towards IaaS: complex development, low hourly costs, high flexibility
Annotation: no WebRTC-specific players here

WebRTC cloud architectures
SaaS: WebRTC application
APIaaS: WebRTC API
PaaS: WebRTC platform
IaaS: virtual infrastructure
Annotations: "No new science here"; "The science for the scalability problem is here"

WebRTC Vs traditional WWW Platforms: the three tiers
Signaling
Application server (container): Application 1 … Application N
Service layer: WebRTC media server, database server

Vertical scalability on monolithic WebRTC platforms
Monolithic platform: one application server instance (Application 1 … Application N) and one media server instance.
The bottleneck is the media server instance.
Figure: typical scalability curve for SFU media servers (quality of service versus number of WebRTC legs); ~500 to 1000 legs on commodity hardware.
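The per-server figure above gives a quick way to size a deployment. The following is a minimal sketch of that arithmetic; the function name and the optional headroom percentage are assumptions made for illustration and do not come from the talk.

```python
def media_servers_needed(total_legs: int, legs_per_server: int,
                         headroom_percent: int = 0) -> int:
    """Estimate how many media server instances are needed for total_legs.

    legs_per_server is the measured capacity of one instance (the slide
    quotes ~500 to 1000 on commodity hardware). headroom_percent is a
    hypothetical safety margin kept free on every instance.
    """
    if legs_per_server <= 0 or not 0 <= headroom_percent < 100:
        raise ValueError("invalid capacity or headroom")
    usable = legs_per_server * (100 - headroom_percent) // 100
    if usable == 0:
        raise ValueError("headroom leaves no usable capacity")
    # Integer ceiling division: round up to a whole number of instances.
    return -(-total_legs // usable)


if __name__ == "__main__":
    for capacity in (500, 1000):
        print(capacity, media_servers_needed(1_000_000, capacity))
```

Going from hundreds of legs to a million therefore means moving from one instance to thousands, which is why the rest of the talk is about horizontal scaling.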

Horizontal scalability of WebRTC Media Servers
Architecture, top to bottom: load balancer; N application servers; Media Resource Broker (MRB, RFC 6917); N media servers.

Media Resource Broker Functions
MS registration
  MS instances register on the MRB
MS brokering
  Query model
    AS instances query the MRB to locate an MS instance
    MRB is explicit for the AS
  In-line model
    MRB routes signaling (control requests)
    MRB is transparent for the AS
MRB does not hold state about MS instances
  MS instances are independent
  MS instances are equivalent
  We say it's stateless
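The query model can be sketched as follows. This is an illustrative toy, not the RFC 6917 protocol and not Kurento's implementation: the class names, the `control_url` field, and the round-robin policy are all assumptions. The broker keeps only the registration list and stores nothing about sessions or load, which is the sense in which it is stateless here.

```python
import itertools
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MediaServer:
    ms_id: str
    control_url: str  # hypothetical control endpoint of the MS instance


class QueryModelBroker:
    """Toy MRB, query model: AS instances ask it where to find an MS."""

    def __init__(self) -> None:
        self._servers: Dict[str, MediaServer] = {}
        self._turn = itertools.count()

    def register(self, ms: MediaServer) -> None:
        # MS registration: an instance announces itself to the broker.
        self._servers[ms.ms_id] = ms

    def unregister(self, ms_id: str) -> None:
        self._servers.pop(ms_id, None)

    def locate(self) -> MediaServer:
        # MS brokering: instances are independent and equivalent, so any
        # of them will do; round-robin spreads the requests.
        if not self._servers:
            raise LookupError("no media server registered")
        ordered = sorted(self._servers.values(), key=lambda ms: ms.ms_id)
        return ordered[next(self._turn) % len(ordered)]


if __name__ == "__main__":
    mrb = QueryModelBroker()
    for name in ("ms-a", "ms-b", "ms-c"):
        mrb.register(MediaServer(name, f"ws://{name}.example/control"))
    print([mrb.locate().ms_id for _ in range(4)])
```

In the in-line model the AS would not call `locate` at all; the broker would sit on the signaling path and forward control requests itself.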

Stateless MRB use cases
Independent MS instances
  B2B calls
  WebRTC GW
  Room servers
  Media recording
  Etc.
Diagram: application server instance, stateless MRB, and several media server instances, each holding its own calls.

Deploying in public and private clouds
Amazon Web Services EC2
  Most popular public cloud
OpenStack
  Popular public clouds (e.g. Rackspace)
  Popular for private clouds
Deployment
  Cloud deployment templates
    CloudFormation (Amazon)
    Heat (OpenStack)

Templates
Declarative language for
  Declaration of resources and relationships
    Images, computing nodes, networks, volumes, load balancers, autoscaling groups, etc.
  Deployment
    Instantiation of resources
  Runtime
    Provisioning
    Autoscaling
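As an illustration of what such a declarative template contains, the sketch below builds a very small CloudFormation-style document as a Python dictionary and renders it to JSON. The resource types and property names are standard CloudFormation ones; the logical names, the image id, the instance type and the sizes are placeholders chosen for this example, and a real stack would need more resources (networks, load balancer, scaling policies).

```python
import json
from typing import Any, Dict, List


def media_server_stack(image_id: str, instance_type: str, min_size: int,
                       max_size: int, zones: List[str]) -> Dict[str, Any]:
    """Declare a launch configuration and an autoscaling group that uses it."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative media server autoscaling group",
        "Resources": {
            # What to launch: the image and the size of each instance.
            "MediaServerLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                },
            },
            # How many to launch: the relationship to the launch
            # configuration is declared with a Ref, not with code.
            "MediaServerGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": zones,
                    "LaunchConfigurationName": {"Ref": "MediaServerLaunchConfig"},
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            },
        },
    }


if __name__ == "__main__":
    # "ami-00000000" is a placeholder, not a real image id.
    template = media_server_stack("ami-00000000", "c4.large", 2, 10, ["us-east-1a"])
    print(json.dumps(template, indent=2))
```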

Deploying in public clouds
Diagram labels:
  Build: source code; Chef + Packer; images (media server image, application server image, broker image) in AWS AMI / OpenStack Glance
  Stack definition template: launch configurations, autoscaling rules; CloudFormation / Heat
  Running stack on AWS EC2 / OpenStack Nova: Elastic Load Balancer, application server instances, broker instance, media server instances, autoscaling groups

Experiences deploying large WebRTC infrastructures in public clouds
Lessons learnt: fault-resilience is hard
  AS & MRB layers
    Are stateless => use distributed cache systems
  MS layer
    Is stateful => lots of problems
Diagram: application servers, Media Resource Broker, media servers.

Lessons learnt: avoid single points of failure
The wrong way (single point of failure): a single MRB on one computing node in front of the MS instances.
The right way (fault-tolerant MRB): an Elastic Load Balancer in front of several MRB instances, each on its own computing node, sharing a distributed cache, in front of the MS instances.

Lessons learnt: fault-recovery at the MS layer
Fault-tolerance on the MS layer
  Stateful problem
    MS instances hold specific resources that cannot be "serialized" to a distributed cache: specific sockets
    Machine failure => session failure
  Our proposed solution: reconstruct the session
    Detect failure
    Notify failure
    Reconnect
Diagram labels: failure detection, failure notification, session reconnection; components: application server instance, MRB, media server instances, calls.
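The detect / notify / reconnect sequence can be sketched as below. This is a minimal illustration under assumptions: failure is detected with a heartbeat timeout, and "reconnect" is reduced to re-assigning the session to another MS instance. All class and method names are hypothetical; in a real deployment the last step would mean re-creating the media session with the client.

```python
from typing import Callable, Dict, List


class FailureDetector:
    """Detects MS instances whose heartbeats stopped (assumed mechanism)."""

    def __init__(self, timeout_s: float, on_failure: Callable[[str], None]) -> None:
        self._timeout_s = timeout_s
        self._on_failure = on_failure
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, ms_id: str, now: float) -> None:
        self._last_seen[ms_id] = now

    def check(self, now: float) -> List[str]:
        # Step 1, detect: no heartbeat for longer than the timeout.
        failed = [ms for ms, seen in self._last_seen.items()
                  if now - seen > self._timeout_s]
        for ms_id in failed:
            del self._last_seen[ms_id]
            # Step 2, notify: tell whoever owns the sessions.
            self._on_failure(ms_id)
        return failed


class ApplicationServer:
    """Owns the session-to-MS mapping and rebuilds sessions on failure."""

    def __init__(self) -> None:
        self.sessions: Dict[str, str] = {}  # session id -> MS id

    def on_ms_failure(self, failed_ms: str,
                      pick_replacement: Callable[[], str]) -> List[str]:
        # Step 3, reconnect: move every affected session to another MS.
        moved = []
        for session_id, ms_id in list(self.sessions.items()):
            if ms_id == failed_ms:
                self.sessions[session_id] = pick_replacement()
                moved.append(session_id)
        return moved
```

The socket state of the failed instance is lost in any case; what survives is only the knowledge of which sessions existed, which is why the session has to be reconstructed rather than restored.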

Autoscaling

Lessons learnt: lack of optimal scale-out events and metrics
Lessons learnt: firing scale-out events
  Which metric?
    Bottleneck depends on applications: network, CPU, memory, etc.
  Our recommendation: define a synthetic metric (i.e. scaling points) and be conservative
Figure: typical scalability curve for SFU media servers (quality of service versus number of WebRTC legs), annotated with 50% CPU load and 40% memory.
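One possible reading of the "scaling points" recommendation is sketched below: each kind of leg is given a cost in points, and a scale-out event fires when the fleet has used a conservative fraction of its point capacity. The point values, the leg kinds and the 50% default threshold are invented for this example and are not taken from the talk.

```python
from typing import Dict, List

# Hypothetical cost of one leg of each kind, in scaling points.
POINTS_PER_LEG: Dict[str, int] = {"forwarded": 1, "recorded": 2, "transcoded": 8}


def scaling_points(legs_by_kind: Dict[str, int]) -> int:
    """Synthetic load of one media server: weighted count of its legs."""
    return sum(POINTS_PER_LEG[kind] * count for kind, count in legs_by_kind.items())


def should_scale_out(fleet: List[Dict[str, int]], capacity_points_per_server: int,
                     threshold_percent: int = 50) -> bool:
    """Fire a scale-out event once the fleet passes a conservative threshold.

    An empty fleet has zero capacity, so this returns True and asks for a
    first instance.
    """
    used = sum(scaling_points(legs) for legs in fleet)
    capacity = capacity_points_per_server * len(fleet)
    return used * 100 >= capacity * threshold_percent


if __name__ == "__main__":
    fleet = [{"forwarded": 100, "transcoded": 10}]
    print(scaling_points(fleet[0]), should_scale_out(fleet, 500))
```

The point of a synthetic metric is that the application, which knows what each leg costs, reports one number, instead of the cloud guessing from CPU or memory alone.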

Lessons learnt: scaling-in is harder than scaling-out
The options (none good)
  Expose # sessions as a metric
    Depends on cloud capabilities
    AS needs to be made cloud aware
  Session migration
    Renegotiations
  Retain period
    Sub-optimal utilization
    The simplest
Diagram: application server instance, MRB, MS1, MS2, MS3, MS4. Which one would you remove?
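The retain-period option can be sketched as follows. The policy shown (drain the instance with the fewest sessions, stop sending it new sessions, remove it once it is empty or the retain period has expired) is an assumption made for illustration, as are all the names.

```python
from typing import Dict, List, Optional


class ScaleInPlanner:
    """Toy scale-in with a retain period (hypothetical policy)."""

    def __init__(self, retain_s: float) -> None:
        self.retain_s = retain_s
        self.sessions: Dict[str, int] = {}    # MS id -> live session count
        self.draining: Dict[str, float] = {}  # MS id -> removal deadline

    def candidates_for_new_sessions(self) -> List[str]:
        # Draining instances receive no new sessions.
        return sorted(ms for ms in self.sessions if ms not in self.draining)

    def start_scale_in(self, now: float) -> Optional[str]:
        active = self.candidates_for_new_sessions()
        if len(active) <= 1:
            return None  # always keep at least one active instance
        # Pick the instance whose removal disturbs the fewest sessions.
        victim = min(active, key=lambda ms: (self.sessions[ms], ms))
        self.draining[victim] = now + self.retain_s
        return victim

    def removable(self, now: float) -> List[str]:
        # Safe to terminate when empty; forced when the period expires.
        return sorted(ms for ms, deadline in self.draining.items()
                      if self.sessions[ms] == 0 or now >= deadline)
```

While an instance drains it is paid for but underused, which is the sub-optimal utilization the slide mentions.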

Limits of the (stateless) MRB
One-to-many media stream: one source, many viewers

Stateful MRB
Diagram: application server instance, stateful MRB, media paths.

Stateful because the MRB must be aware of the media topology
Stateful information about MS relationships
  Request routing depends on topology
    Where to place a new viewer?
  Request routing depends on internal state
    CPU load
    QoS
    Memory
    Etc.
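To make the "where to place a new viewer?" question concrete, here is a minimal sketch of a topology-aware placement. It assumes the MS instances that distribute one stream form a tree, that capacity is counted in WebRTC legs, and that the broker prefers the node closest to the source; none of these choices is stated in the talk, and the names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    ms_id: str
    capacity_legs: int
    viewers: int = 0
    children: List["TreeNode"] = field(default_factory=list)

    def used_legs(self) -> int:
        # One incoming leg, one leg per child MS, one leg per viewer.
        return 1 + len(self.children) + self.viewers


def place_viewer(root: TreeNode) -> Optional[TreeNode]:
    """Breadth-first search for the shallowest node with a free leg.

    Returns the chosen node, or None when the tree is full and a new MS
    instance would have to be attached first.
    """
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.used_legs() < node.capacity_legs:
            node.viewers += 1
            return node
        queue.extend(node.children)
    return None
```

The routing decision needs both the tree (topology) and the per-node counters (internal state), which is exactly the information a stateless MRB does not keep.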

Experiences with stateful MRB in AWS EC2 & OpenStack
Lessons learned: beware of WebRTC internals
  Differentiated quality
    SVC is the solution, but it's not ready
    Plain SFU forwarding models are not an option
    RTCP feedback of viewers with bad connectivity destroys QoE
    Simulcast may be an option
    Suppress feedback of viewers with really bad connectivity
    Layered transcoding works nicely, but it's expensive
  Churn and the generation of key-frames
    Periodic key-frame generation is an option
    In VP8, expect a significant increase in BW consumption
    But it's again expensive
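The effect of suppressing the feedback of very bad viewers can be shown with a small calculation. The sketch assumes that each viewer reports a bandwidth estimate and that the bitrate requested from the source follows the lowest estimate that is still taken into account; the function, its parameters and the numbers are illustrative and do not describe any particular media server.

```python
from typing import Dict


def sender_target_kbps(viewer_estimates_kbps: Dict[str, int], floor_kbps: int,
                       source_max_kbps: int) -> int:
    """Bitrate to ask from the source, ignoring viewers below floor_kbps.

    Without the floor, the single worst viewer would drag the quality
    down for everybody.
    """
    usable = [kbps for kbps in viewer_estimates_kbps.values() if kbps >= floor_kbps]
    if not usable:
        # Every viewer is below the floor: do not go lower than the floor.
        return min(floor_kbps, source_max_kbps)
    return min(min(usable), source_max_kbps)


if __name__ == "__main__":
    estimates = {"alice": 1500, "bob": 90, "carol": 800}
    print(sender_target_kbps(estimates, floor_kbps=300, source_max_kbps=2000))
    print(sender_target_kbps(estimates, floor_kbps=0, source_max_kbps=2000))
```

With the floor the stream stays at 800 kbps for the two healthy viewers; with no floor all three would receive 90 kbps.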

Experiences with stateful MRB in AWS EC2 & OpenStack
Lessons learned: the cloud is evil
  Placement of incoming WebRTC legs
    New science required here. Ideas?
    Our solution: count the number of WebRTC legs (points mechanism)
    Ad-hoc, hard and error prone
  Fault-resilience
    Our solution: re-construct internal parts of the tree, but never leaves
    Requires client renegotiation
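One way to picture "re-construct internal parts of the tree, but never leaves" is the sketch below. It assumes the distribution tree is stored as a parent-to-children mapping and that the subtrees of a failed internal node are re-attached to that node's parent; a failed leaf is simply dropped, since nothing hangs below it to rebuild. The representation and the function are hypothetical, and capacity checks are left out.

```python
from typing import Dict, List


def repair_tree(children: Dict[str, List[str]], failed: str) -> List[str]:
    """Remove `failed` from the tree and return the nodes that were re-parented.

    children maps every MS id to the list of MS ids fed by it.
    """
    parent = next((p for p, kids in children.items() if failed in kids), None)
    if parent is None:
        raise ValueError("root or unknown node: cannot be repaired this way")
    orphans = list(children.get(failed, []))
    # Detach the failed node, then hang its subtrees under its parent.
    children[parent] = [k for k in children[parent] if k != failed] + orphans
    children.pop(failed, None)
    return orphans


if __name__ == "__main__":
    tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []}
    print(repair_tree(tree, "a"), tree)
```

Viewers connected directly to the failed instance are not covered by this repair; they have to renegotiate.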

Thanks Luis Lopez
