Integrated Services
- High-speed networks have enabled new applications, but these applications also need "deliver on time" assurances from the network
- Applications that are sensitive to the timeliness of data are called real-time applications: voice, video, industrial control, stock quotes, ...
- Timeliness guarantees must come from inside the network; end-hosts cannot correct for late packets the way they can correct for lost packets
- Need more than best-effort: IETF extensions to the best-effort model
QoS Controls
- Admission Control: can the requested QoS be met while honoring the QoS commitments already made to accepted calls? This requires the application to specify its traffic pattern (through a service interface)
- Traffic Shaping/Policing: make sure the connection obeys its specified traffic pattern once admitted
- Path Selection: select a path that is likely to satisfy the requested QoS
- Flow Setup: communicate the call's requirements to the intermediate routers on the selected path so that they reserve the necessary resources (buffers, bandwidth, etc.)
- Packet Scheduling: manage the packets in the router queue so that each receives the service that has been requested
Traffic Classes
- Network should match the offered service to source requirements
- Example: telnet requires low bandwidth and low delay
  - utility (level of satisfaction) increases as delay decreases
  - the network should provide a low-delay service; equivalently, telnet belongs to the low-delay traffic class
- Traffic classes encompass both user requirements and network service offerings
Traffic Classes (cont'd)
- A basic division: guaranteed (real-time) service and best-effort (elastic), like flying with a reservation versus standby
- Guaranteed service:
  - utility is zero unless the application gets a minimum level of service quality (bandwidth, delay, loss)
  - open-loop flow control with admission control (the goal is to lock sender and receiver into a common clocking regime)
  - e.g. telephony, remote sensing, interactive multiplayer games
- Best-effort service:
  - send and pray
  - closed-loop flow control with no admission control (willing to adapt to whatever QoS is available)
  - e.g. email, net news
IETF IntServ Traffic Classes
- Based on sensitivity to delay
- Guaranteed:
  - intolerant: typically non-adaptive, e.g. telephony/interactive voice with a fixed playback point
  - tolerant: typically adaptive, e.g. an adaptive-playback audio- or video-streaming application minimizing offset delay ==> less delay, but a higher loss rate
- Best-effort:
  - interactive burst (e.g. paging, messaging, email)
  - interactive bulk (e.g. ftp)
  - asynchronous bulk (e.g. net news, junk traffic)
IETF GS Subclasses
- Both subclasses require some bandwidth guarantee
- Tolerant GS:
  - nominal mean delay, but can tolerate "occasional" variation; what this means exactly is not specified!
  - called predictive or controlled-load service
  - through admission control, attempts to deliver traffic within the same bounds as an unloaded network (it really is this imprecise!)
- Intolerant GS:
  - needs a worst-case delay bound
  - called guaranteed service (the real deal!)
Scheduling Algorithms for GS
- Characterize the source by an "average" rate and a maximum burst size using a token bucket, also called a Linear Bounded Arrival Process (LBAP)
- A packet conforms if there are enough tokens in the bucket whenever it is generated; non-conforming packets are dropped or tagged
- Use WFQ to reserve bandwidth at the average rate
- Pros:
  - may use less bandwidth than reserving at the peak rate
  - can get an end-to-end delay guarantee (isolation)
- Cons:
  - for a low delay bound, may need to reserve at a high rate (delay-bandwidth coupling)
  - implementation complexity (timestamp calculation and priority queue)
  - can waste bandwidth; the worst case rarely happens!
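The token-bucket conformance test above can be sketched as follows (a minimal illustration; class and parameter names are invented for this sketch, not from any standard API):

```python
class TokenBucket:
    """Sketch of a token-bucket policer with rate r tokens/sec and depth b.
    A flow is (r, b)-conformant (LBAP) if over any interval of length t
    it sends at most r*t + b units of traffic."""

    def __init__(self, rate, depth):
        self.rate = rate          # token refill rate (tokens per second)
        self.depth = depth        # bucket size = maximum burst
        self.tokens = depth       # start with a full bucket
        self.last = 0.0           # time of the last update

    def conforms(self, now, packet_size):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_size <= self.tokens:
            self.tokens -= packet_size   # conforming: consume tokens
            return True
        return False                     # non-conforming: drop or tag

# A burst up to the bucket depth conforms; sustained overload does not.
tb = TokenBucket(rate=100.0, depth=500.0)
print(tb.conforms(0.0, 500))   # full-bucket burst -> True
print(tb.conforms(0.0, 1))     # bucket now empty -> False
print(tb.conforms(1.0, 100))   # one second of refill at rate 100 -> True
```

The bucket depth bounds the burst; the refill rate bounds the long-term average, which is exactly the pair of parameters WFQ then uses for the reservation.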
Why not use WFQ for Predictive Service?
- Goal: minimize actual measured delay bounds
- WFQ provides isolation: a burst by one source causes a sharp increase in the delay (jitter) seen by that source
- In FIFO, bursts are multiplexed, so a burst affects other sources but itself sees less delay (burst sharing ==> less jitter)
- The average delay is the same (cf. the conservation law), but 99.9th-percentile delays are much smaller under FIFO
- Isolate different classes using WFQ, and use FIFO within each class to benefit from sharing
- FIFO+ reduces the accumulation of jitter over multiple hops by serving packets in the order of their expected arrival times under average service
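The FIFO+ idea above can be sketched roughly as follows (a hedged illustration under assumed semantics: each packet carries the lateness it accumulated at previous hops relative to its class's average delay, and each hop serves packets in expected-arrival order; all names are invented):

```python
import heapq

class FifoPlusQueue:
    """Sketch of FIFO+: serve packets in order of expected arrival time.
    A packet's offset is the delay it accumulated upstream beyond the
    class average; late packets get priority, early ones wait."""

    def __init__(self, mean_hop_delay):
        self.mean_hop_delay = mean_hop_delay  # class-average delay here
        self.heap = []
        self.seq = 0  # tie-breaker: plain FIFO among equal keys

    def enqueue(self, arrival_time, offset):
        # Expected arrival = actual arrival minus accumulated lateness,
        # so a packet delayed upstream (offset > 0) sorts ahead of peers.
        expected = arrival_time - offset
        heapq.heappush(self.heap, (expected, self.seq, arrival_time, offset))
        self.seq += 1

    def dequeue(self, departure_time):
        expected, _, arrival_time, offset = heapq.heappop(self.heap)
        # Add this hop's deviation from the class mean to the offset.
        return offset + (departure_time - arrival_time) - self.mean_hop_delay

q = FifoPlusQueue(mean_hop_delay=2.0)
q.enqueue(arrival_time=10.0, offset=0.0)   # on-time packet
q.enqueue(arrival_time=10.5, offset=3.0)   # packet delayed 3 units upstream
# The upstream-delayed packet (expected 7.5) is served before the
# on-time one (expected 10.0), undoing jitter instead of compounding it.
```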
Unified Scheduling
- WFQ with weights assigned to each guaranteed flow and to the predictive+datagram pseudo-flow 0
- Multiple priorities within flow 0, with FIFO+ within each priority level
- Priority scheduling shifts jitter from higher priority to lower priority
- Put datagram traffic at the lowest priority level
- Each priority level has a target delay bound
- Keep the jitter shifted down from higher-priority classes small by choosing target delay bounds that are widely spaced (at least an order of magnitude apart), giving good isolation
Admission Control
- For guaranteed service, the source asks the network for a token-bucket rate, and uses the Parekh-Gallager (P-G) worst-case delay bound
- For predictive service, the source specifies its token-bucket parameters, and the flow is admitted if:
  - the sum of its rate and the measured guaranteed+predictive rate is below 0.9 of capacity (leaving, say, 10% for datagram traffic)
  - for each priority level at or below the flow's, the delay bound is not violated
  - measured quantities should be conservative, to control delay violations
- Network utilization can be increased in the presence of predictive flows, since measured bounds are smaller than worst-case bounds
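The two admission tests above can be sketched as one function (an illustrative sketch; the function name, the list-based delay interface, and the 0.9 target are assumptions, not a specified API):

```python
def admit_predictive(new_rate, measured_rate, capacity,
                     new_delay_bounds, target_bounds,
                     utilization_target=0.9):
    """Sketch of the measurement-based admission test.
    new_delay_bounds[k]: predicted delay at priority level k if the flow
    is admitted; target_bounds[k]: that level's target delay bound."""
    # Bandwidth test: leave headroom (e.g. 10%) for datagram traffic.
    if new_rate + measured_rate >= utilization_target * capacity:
        return False
    # Delay test: no priority level at or below the new flow may exceed
    # its target delay bound after admission.
    return all(d <= t for d, t in zip(new_delay_bounds, target_bounds))

# Example: a 1 Mb/s flow offered to a 10 Mb/s link carrying 7 Mb/s.
print(admit_predictive(1e6, 7e6, 10e6, [0.005, 0.04], [0.01, 0.1]))  # True
# A 3 Mb/s flow would push the link past the 90% target.
print(admit_predictive(3e6, 7e6, 10e6, [0.005, 0.04], [0.01, 0.1]))  # False
```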
Worst-Case Delay for Predictive
- Effect of a new predictive flow on same-priority traffic:
  - sum of the bucket sizes at that level (including the new bucket size) divided by the minimum capacity left over from higher levels
  - ==> the delay bound increase depends on the new bucket size
- Effect of a new predictive flow on lower-priority traffic:
  - the delay bound increase depends on both the new bucket size and the new rate, which decreases the leftover capacity ==> does the most harm!
- Effect of a new guaranteed flow on predictive traffic:
  - the delay bound increase depends on the new rate
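The per-level bound above can be made concrete with a small numeric sketch (variable names are illustrative): level k's worst-case delay is roughly its total bucket size divided by the capacity left over by the levels above it, so adding rate at a high level hurts every level below.

```python
def level_delay_bounds(capacity, bucket_sums, rate_sums):
    """Sketch: worst-case delay per priority level (level 0 = highest).
    bucket_sums[k]: sum of token-bucket depths (bits) at level k.
    rate_sums[k]:   sum of reserved rates (bits/s) at level k.
    Level k drains only at the capacity left by levels 0..k-1."""
    bounds = []
    leftover = capacity
    for b_k, r_k in zip(bucket_sums, rate_sums):
        bounds.append(b_k / leftover)  # burst drains at the leftover rate
        leftover -= r_k                # lower levels lose this level's rate
    return bounds

# 10 Mb/s link, two predictive levels:
print(level_delay_bounds(10e6, bucket_sums=[1e5, 2e5],
                         rate_sums=[4e6, 3e6]))
```

Here level 0 sees 1e5 / 10e6 = 0.01 s, while level 1 drains at only the remaining 6 Mb/s, so its bound is 2e5 / 6e6, about 0.033 s.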
Approximation
- Replace the worst-case parameters (bucket size and rate) of existing flows by measured delays and usage rates
- Measure the delay of every packet, and update the maximum-delay estimate every measurement window T unless:
  - a new flow is added (a new T is started)
  - the delay of a packet exceeds the current value, in which case back off
- Measure utilization every S < T, and update the maximum-utilization estimate every T unless:
  - a new flow is added (a new T is started)
  - the current value is exceeded
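A minimal sketch of this windowed estimator (class name and the exact back-off policy are assumptions made for illustration): the advertised estimate is replaced by the window's maximum when T expires, but jumps up immediately whenever a sample exceeds it.

```python
class MaxEstimator:
    """Sketch of the windowed maximum estimate used for measured delay
    (or, with utilization samples taken every S, measured usage)."""

    def __init__(self, T, initial=0.0):
        self.T = T
        self.estimate = initial   # value used by admission control
        self.window_max = 0.0     # maximum seen in the current window
        self.window_start = 0.0

    def sample(self, now, value):
        if now - self.window_start >= self.T:
            # Window expired: the estimate may now decrease.
            self.estimate = self.window_max
            self.window_max = 0.0
            self.window_start = now
        self.window_max = max(self.window_max, value)
        if value > self.estimate:
            self.estimate = value  # back off: raise the estimate at once
        return self.estimate

    def new_flow(self, now):
        # A new flow restarts the window, keeping the current estimate.
        self.window_start = now
        self.window_max = 0.0

est = MaxEstimator(T=10.0)
est.sample(0.0, 5.0)    # estimate jumps to 5.0 immediately
est.sample(3.0, 2.0)    # estimate stays at 5.0
est.sample(12.0, 1.0)   # window expires; previous window max was 5.0
est.sample(23.0, 1.0)   # only small samples last window: falls to 1.0
```

This captures the asymmetry in the slide: estimates rise instantly but can only fall at window boundaries.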
Tradeoffs
- The estimates are conservative, through the delay back-off and burst-driven utilization measurements
- Once estimates are increased, they stay high until T expires
- A larger T means fewer delay violations but lower network utilization
- The tradeoff is even worse with shorter flow lifetimes
Traffic Models
- Describe how users, or aggregates of users, typically behave
  - e.g. how long a user uses a modem
  - e.g. the average size of a file transfer
- Models change with network usage; we can only guess about the future
- Two sources of models:
  - measurements
  - educated guesses
Telephone Traffic Models
- How are calls placed?
  - call arrival model
  - studies show that the time between calls is drawn from an exponential distribution
  - the call arrival process is therefore Poisson
- How long are calls held?
  - usually modeled as exponential
  - however, measurement studies show holding times to be heavy-tailed (e.g. Pareto distributed)
  - this means a significant number of calls last a very long time
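The exponential-versus-heavy-tailed distinction can be seen in a short simulation sketch (parameter values are illustrative; both distributions are set to the same mean of 3.0):

```python
import random

def exponential_holding(mean):
    """Classical model: exponentially distributed holding time."""
    return random.expovariate(1.0 / mean)

def pareto_holding(shape, x_min):
    """Heavy-tailed model: Pareto holding time with minimum x_min.
    For shape alpha <= 2 the variance is infinite, so a few samples
    dominate the total (the 'very long calls' noted above)."""
    return x_min * random.paretovariate(shape)

random.seed(1)
n = 100_000
exp_calls = [exponential_holding(3.0) for _ in range(n)]
par_calls = [pareto_holding(1.5, 1.0) for _ in range(n)]  # mean = 3.0

def top1_share(xs):
    """Fraction of total call time contributed by the longest 1% of calls."""
    xs = sorted(xs, reverse=True)
    return sum(xs[: len(xs) // 100]) / sum(xs)

print(f"exponential top-1% share: {top1_share(exp_calls):.2f}")
print(f"pareto      top-1% share: {top1_share(par_calls):.2f}")
```

Under the exponential model the longest 1% of calls carry only a few percent of the total time; under the Pareto model they carry a far larger share, which is why the choice of model matters for capacity planning.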
Internet Traffic Modeling
- A few applications account for most of the traffic: WWW, FTP, telnet
- A common approach is to model applications (this ignores the distribution of destinations!):
  - time between application invocations
  - connection duration
  - number of bytes transferred
  - packet interarrival distribution
- There is little consensus on the models; e.g. some studies found the interarrival times of telnet and FTP sessions to follow an exponential distribution, others a generalization of it called the Weibull distribution
- But two important features recur
Internet Traffic Models: Features
- LAN connections differ from WAN connections:
  - higher bandwidth (more bytes per call)
  - longer holding times (free, and higher bandwidth!)
- Many parameters are heavy-tailed, e.g. the number of bytes per call and the call duration:
  - a few calls are responsible for most of the traffic, and these calls must be well managed
  - even aggregates of many calls are not smooth (self-similar or long-range-dependent (LRD) traffic) and can have long bursts
- New models appear all the time, to account for a rapidly changing traffic mix
Benefits of Predictive Service
- Depending on traffic burstiness, utilization gains range from a factor of two to an order of magnitude; the gain is higher with more bursty sources
- A larger ratio of average flow lifetime to the measurement window T yields higher utilization but a less reliable delay bound
- LRD traffic can be handled effectively with measurement-based admission control, as long as there is enough room to accommodate bursts (e.g. by lowering the utilization target or increasing T to reduce delay violations)
- General notes on admission control:
  - flows traversing longer paths have a higher chance of being rejected, as do more demanding flows; one may want to implement some policy to counter this
  - one may implement dynamic estimation of T, to account for burstiness