4. Behind Traditional TCP: protocols for high-throughput and high-latency networks
PA159: Net-Centric Computing I.
Eva Hladká
Slides by: Petr Holub, Tomáš Rebok
Faculty of Informatics, Masaryk University
Autumn 2011

Lecture overview
1. Traditional TCP and its issues
2. Improving the traditional TCP
   • Multi-stream TCP
   • Web100
3. Conservative Extensions to TCP
   • GridDT
   • Scalable TCP, High-Speed TCP, H-TCP, BIC-TCP, CUBIC-TCP
4. TCP Extensions with IP Support
   • QuickStart, E-TCP, FAST
5. Approaches Different from TCP
   • tsunami
   • RBUDP
   • XCP
   • SCTP, DCCP, STP, Reliable UDP, XTP
6. Conclusions
7. Literature

Traditional TCP and its issues

Protocols for reliable data transmission
Protocols for reliable data transmission have to:
• ensure the reliability of the transfer
   • retransmission of lost packets
   • FEC might be usefully employed
• provide protection from congestion of the network and the receiver
Behavior evaluation:
• aggressiveness - ability to utilise available bandwidth
• responsiveness - ability to recover from a packet loss
• fairness - getting a fair portion of network throughput when more streams/participants use the network

Problem statement
• network links with high capacity and high latency
   • iGrid 2005: San Diego ↔ Brno, RTT = 205 ms
   • SC|05: Seattle ↔ Brno, RTT = 174 ms
• traditional TCP is not suitable for such an environment:
   • 10 Gb/s, RTT = 100 ms, 1500 B MTU ⇒ sending/outstanding window of 83,333 packets (10 Gb/s · 0.1 s / 12,000 bits per packet) ⇒ at most a single packet may be lost per 1:36 hour
      1. terribly slow
      2. if errors are more frequent, the maximum throughput cannot be reached
• How could a better network utilization be achieved?
• How could a reasonable co-existence with traditional TCP be ensured?
• How could a gradual deployment of a new protocol be ensured?

Traditional TCP I.
• flow control vs. congestion control
[Figure: sender - network - receiver diagrams for three cases: no control, flow control (rwnd), congestion control (cwnd)]

Traditional TCP II.
• Flow control
   • an explicit feedback from receiver(s) using rwnd
   • deterministic
• Congestion control
   • an approximate sender's estimation of available throughput (using cwnd)
• the final window used: ownd
   $ownd = \min\{rwnd, cwnd\}$
• The bandwidth bw could be computed as:
   $bw = \frac{8 \cdot MSS \cdot ownd}{RTT}$   (1)

Traditional TCP II.
[Figure: flow control with a sliding window over sent and received packets; packets are marked as acknowledged, to be sent, or outside the window]

Traditional TCP - Tahoe and Reno
Congestion control:
• traditionally based on the AIMD (Additive Increase Multiplicative Decrease) approach
Tahoe [1]
• cwnd = cwnd + 1 ... per RTT without loss (above ssthresh)
• ssthresh = 0.5 cwnd, cwnd = 1 ... per every loss
Reno [2] adds
• fast retransmission
   • a TCP receiver sends an immediate duplicate ACK when an out-of-order segment arrives
   • all segments after the dropped one trigger duplicate ACKs
   • a loss is indicated by 3 duplicate ACKs (i.e., four successive identical ACKs without intervening packets)
   • once they are received, TCP performs a fast retransmission without waiting for the retransmission timer to expire
• fast recovery - the slow-start phase is not used any more
   ssthresh = cwnd = 0.5 cwnd

Traditional TCP - Tahoe I.
[Figure: Tahoe state diagram]
• Connection opening: cwnd = 1 segment, enter Slow Start
• Slow Start: exponential increase for cwnd until cwnd = SSTHRESH, then enter Congestion Avoidance
• Congestion Avoidance: additive increase for cwnd
• Retransmission timeout (in either state): SSTHRESH := cwnd/2, cwnd := 1 segment, return to Slow Start
• Exponential increase for cwnd: for every useful acknowledgment received, cwnd := cwnd + (1 segment size)
• Additive increase for cwnd: for every useful acknowledgment received, cwnd := cwnd + (segment size)*(segment size) / cwnd; it takes a full window to increment the window size by one.

Traditional TCP - Tahoe II.
[Figure: Tahoe - cwnd and ssthresh over time [RTT]]

Traditional TCP - Reno I.
[Figure: Reno state diagram]
• Connection opening: cwnd = 1 segment, enter Slow Start
• Slow Start: exponential increase for cwnd until cwnd = SSTHRESH, then enter Congestion Avoidance
• Congestion Avoidance: additive increase for cwnd
• Retransmission timeout (in any state): SSTHRESH := cwnd/2, cwnd := 1 segment, return to Slow Start
• 3 duplicate ACKs received (in Slow Start or Congestion Avoidance): enter Fast Recovery
• Fast Recovery: exponential increase beyond cwnd; when the expected ACK is received, cwnd := cwnd/2 and Congestion Avoidance continues

Traditional TCP - Reno II.
[Figure: Reno - cwnd and ssthresh over time [RTT]]
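
The difference between the Tahoe and Reno reactions to a loss can be illustrated with a small per-RTT simulation. This is only a sketch under simplifying assumptions that are not part of the slides: the window is updated once per RTT instead of per ACK, every loss in the Reno case is detected by duplicate ACKs (no timeouts), and the function name `simulate` as well as the floor of 2 segments for ssthresh (taken from RFC2581 [2]) are choices made for this example.

```python
def simulate(variant, rtts, loss_at, ssthresh=16.0):
    """Return the congestion window (in segments) seen at each RTT.

    variant  -- "tahoe" or "reno"
    rtts     -- number of RTTs to simulate
    loss_at  -- set of RTT indices at which a single loss is detected
    ssthresh -- initial slow-start threshold (illustrative value)
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError("variant must be 'tahoe' or 'reno'")
    cwnd = 1.0
    trace = []
    for t in range(rtts):
        trace.append(cwnd)
        if t in loss_at:
            # Multiplicative decrease of the threshold on every loss.
            ssthresh = max(cwnd / 2.0, 2.0)
            # Tahoe restarts from slow start, Reno continues from ssthresh.
            cwnd = 1.0 if variant == "tahoe" else ssthresh
        elif cwnd < ssthresh:
            # Slow start: the window doubles every RTT.
            cwnd = min(cwnd * 2.0, ssthresh)
        else:
            # Congestion avoidance: one segment per RTT.
            cwnd += 1.0
    return trace


if __name__ == "__main__":
    print("Tahoe:", simulate("tahoe", 8, {5}, ssthresh=8.0))
    print("Reno: ", simulate("reno", 8, {5}, ssthresh=8.0))
```

With a single loss, Tahoe falls back to one segment and has to slow-start again, while Reno resumes additive increase from half of the previous window.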

TCP Vegas
• Vegas - a concept of congestion control [3]
   • when a network is congested, the RTT becomes higher
   • RTT is monitored during the transmission
   • when an RTT increase is detected, the congestion window size is linearly reduced
• a possibility to measure the available network bandwidth using inter-packet spacing/dispersion

Traditional TCP
• a reaction to packet loss - retransmission
   • Tahoe: the whole actual window ownd
   • Reno: a single segment in the Fast Retransmission mode
   • NewReno: more segments in the Fast Retransmission mode
   • Selective Acknowledgement (SACK): just the lost packets
• fundamental question: How could a sufficient size of cwnd be achieved (under real conditions) in a network with high capacity and high RTT? ... without preventing the "common" users from using the network?

Traditional TCP - Response Function
• Response Function represents a relation between bw and a steady-state packet loss rate p
   • $cwnd_{average} \approx \frac{1.2}{\sqrt{p}}$ (for MSS-sized segments)
   • using (1): $bw \approx \frac{9.6 \cdot MSS}{RTT \cdot \sqrt{p}}$
• the responsiveness of traditional TCP
   • assuming that the packet has been lost when $cwnd = bw \cdot RTT$ (with bw expressed in bytes per second), the recovery time is
     $\rho = \frac{bw \cdot RTT^2}{2 \cdot MSS}$

Traditional TCP - Responsiveness
[Figure: TCP responsiveness for link capacities C = 622 Mbit/s, C = 2.5 Gbit/s, and C = 10 Gbit/s]

Traditional TCP - Fairness I.
• a fairness in a point of equilibrium
• the fairness is considered for
   • streams with different RTT
   • streams with different MTU
• The speed of convergence to the point of equilibrium DOES matter!

Traditional TCP - Fairness II.
• cwnd += MSS, cwnd *= 0.5 (30 steps)
[Figure: simulated convergence towards the point of equilibrium, both axes from 0 to 100]
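
The response function and the responsiveness formula can be checked numerically against the 10 Gb/s example from the problem statement. The sketch below assumes the usual steady-state approximation cwnd ≈ 1.2/√p and equation (1); all function names are hypothetical and bandwidth is given in bit/s, so the recovery time contains an extra factor of 8 compared with the byte-based formula.

```python
import math


def avg_cwnd(p):
    """Steady-state average window in segments for loss rate p."""
    return 1.2 / math.sqrt(p)


def bandwidth_bps(p, rtt_s, mss_bytes):
    """Response function: equation (1) with the average window."""
    return 8 * mss_bytes * avg_cwnd(p) / rtt_s


def window_for(bw_bps, rtt_s, mss_bytes):
    """Window (in segments) needed to fill a link of the given capacity."""
    return bw_bps * rtt_s / (8 * mss_bytes)


def seconds_between_losses(bw_bps, rtt_s, mss_bytes):
    """Longest tolerable loss frequency for sustaining bw_bps."""
    p = (1.2 / window_for(bw_bps, rtt_s, mss_bytes)) ** 2
    packets_per_second = bw_bps / (8 * mss_bytes)
    return 1.0 / (p * packets_per_second)


def recovery_time_s(bw_bps, rtt_s, mss_bytes):
    """Time to grow from cwnd/2 back to cwnd at one segment per RTT."""
    return window_for(bw_bps, rtt_s, mss_bytes) / 2 * rtt_s


if __name__ == "__main__":
    bw, rtt, mss = 10e9, 0.1, 1500
    print("window [segments]:", round(window_for(bw, rtt, mss)))
    print("loss interval [h]:", seconds_between_losses(bw, rtt, mss) / 3600)
    print("recovery time [min]:", recovery_time_s(bw, rtt, mss) / 60)
```

For 10 Gb/s, RTT = 100 ms and 1500 B packets this gives a window of about 83,333 segments, a tolerable loss interval of roughly 1.6 hours (1:36), and a recovery time of about 69 minutes after a single loss.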

Improving the traditional TCP

Multi-stream TCP
• assumes multiple TCP streams transferring a single data flow
• in fact, improves TCP's performance/behavior just in cases of isolated packet losses
   • a loss of more packets usually affects more TCP streams
• usually available because of a simple implementation
   • bbftp, GridFTP, Internet Backplane Protocol, ...
• drawbacks:
   • more complicated than traditional TCP (more threads are necessary)
   • the startup is accelerated linearly only
   • leads to a synchronous overloading of queues and caches in the routers

TCP implementation tuning I.
• cooperation with HW
   • Rx/Tx TCP Checksum Offloading
   • ordinarily available
• zero copy
   • accessing the network usually leads to several data copies: user-land ↔ kernel ↔ network card
   • page flipping - user-land ↔ kernel data movement
   • support for, e.g., sendfile()
   • implementations for Linux, FreeBSD, Solaris, ...

TCP implementation tuning II.
• Web100 [4, 5]
   • a software that implements instruments in the Linux TCP/IP stack - TCP Kernel Instrumentation Set (TCP-KIS)
      • more than 125 "puls/rods"
      • information available via /proc
   • distributed in two pieces:
      • a kernel patch adding the instruments
      • a suite of "userland" libraries and tools for accessing the kernel instrumentation (command-line, GUI)
   • the Web100 software allows:
      • monitoring (extended statistics)
      • instruments' tuning
      • support for auto-tuning

Conservative Extensions to TCP

GridDT
• a collection of ad-hoc modifications :(
• correction of ssthresh
• faster slow start
• AIMD's modification for congestion control:
   • cwnd = cwnd + a ... per RTT without packet loss
   • cwnd = b · cwnd ... per packet loss
• just the sender's side has to be modified

GridDT - example
[Figure: testbed topology with Host #1 and Host #2 at each end, a GbE switch, routers, and POS 2.5 Gb/s links via Starlight (CHI)]
TCP Reno performance (see slide #8):
• First stream CVA ↔ Sunnyvale: RTT = 181 ms; avg. throughput over a period of 7000 s = 202 Mb/s
• Second stream CVA ↔ CHI: RTT = 117 ms; avg. throughput over a period of 7000 s = 514 Mb/s
• Links utilization 71.6%
GridDT tuning in order to improve fairness between two TCP streams with different RTT:
• First stream CVA ↔ Sunnyvale: RTT = 181 ms, additive increment A = 7; average throughput = 330 Mb/s
• Second stream CVA ↔ CHI: RTT = 117 ms, additive increment B = 3; average throughput = 388 Mb/s
• Links utilization 71.8%
• note that A/B = 7/3 ≈ 2.33 is close to (181/117)² ≈ 2.39, the squared ratio of the two RTTs

Scalable TCP
• proposed by Tom Kelly [1]
• congestion control is not AIMD any more:
   • cwnd = cwnd + 0.01 cwnd ... per RTT without packet loss
     (cwnd = cwnd + 0.01 ... per ACK)
   • cwnd = 0.875 cwnd ... per packet loss
   ⇒ Multiplicative Increase Multiplicative Decrease (MIMD)
• for smaller window sizes and/or higher loss rates in the network, Scalable TCP switches into AIMD mode

Scalable TCP
[Figure: cwnd over time (RTT) for both protocols. Packet loss recovery times for the traditional TCP (left) are proportional to cwnd and RTT. A Scalable TCP connection (right) has packet loss recovery times that are proportional to connection's RTT only. (Note: link capacity c < C)]

Scalable TCP - fairness I.
[Figure: two concurrent Scalable TCP streams, Scalable control switched on when >30 Mb/s, twice the number of steps in comparison with previous simulations]

Scalable TCP - fairness II.
[Figure: Scalable TCP and traditional TCP streams, Scalable control switched on when >30 Mb/s, twice the number of steps]
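
The claim that Scalable TCP recovers in a time proportional to the RTT only, while traditional TCP needs a time proportional to cwnd and RTT, can be verified by counting RTTs. The sketch below is an illustration rather than an implementation of either protocol; it uses the per-RTT update rules from the slides, and the function names are hypothetical.

```python
def rtts_to_recover(cwnd, increase, decrease):
    """Count loss-free RTTs needed to get back to the pre-loss window."""
    window = decrease(cwnd)
    rtts = 0
    while window < cwnd:
        window = increase(window)
        rtts += 1
    return rtts


def reno_recovery(cwnd):
    # AIMD: halve on loss, then add one segment per RTT.
    return rtts_to_recover(cwnd, lambda w: w + 1.0, lambda w: 0.5 * w)


def scalable_recovery(cwnd):
    # MIMD: reduce to 0.875 cwnd on loss, then grow by 1 % per RTT.
    return rtts_to_recover(cwnd, lambda w: 1.01 * w, lambda w: 0.875 * w)


if __name__ == "__main__":
    for cwnd in (1000.0, 10000.0, 80000.0):
        print(cwnd, reno_recovery(cwnd), scalable_recovery(cwnd))
```

The AIMD recovery grows linearly with the window (cwnd/2 RTTs), whereas the MIMD recovery stays at 14 RTTs for any window size, because 0.875 · 1.01^n ≥ 1 first holds for n = 14.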

Scalable TCP - Response curve
[Figure: response curve plotted against the loss rate (0.0001 to 0.1); the curve of Standard TCP is labeled for comparison]

High-Speed TCP (HSTCP)
• Sally Floyd, RFC3649, [2]
• congestion control AIMD/MIMD:
   • $cwnd = cwnd + a(cwnd)$ ... per RTT without loss
     ($cwnd = cwnd + \frac{a(cwnd)}{cwnd}$ ... per ACK)
   • $cwnd = (1 - b(cwnd)) \cdot cwnd$ ... per packet loss
• emulates the behavior of traditional TCP for small window sizes and/or higher packet loss rates in the network

High-Speed TCP (HSTCP)
• proposed MIMD parametrization (log denotes the natural logarithm; 3.64 ≈ log 38):
   $b(cwnd) = -0.4 \cdot \frac{\log(cwnd) - 3.64}{7.69} + 0.5$
   $a(cwnd) = \frac{2 \cdot cwnd^2 \cdot b(cwnd)}{12.8 \cdot (2 - b(cwnd)) \cdot cwnd^{1.2}}$

High-Speed TCP (HSTCP)
• a parametrization equivalent to the Scalable TCP is possible ⇒ Linear HSTCP
• a comparison with the Multi-stream TCP:
   $N(cwnd) \approx 0.23 \cdot cwnd^{0.4}$
   • N(cwnd) - the number of parallel TCP connections emulated by the HighSpeed TCP response function with congestion window cwnd
Neither Scalable TCP nor HSTCP deals with the slow-start phase in any sophisticated way.

HSTCP - Response curve
[Figure: response curves for Regular TCP, HighSpeed TCP, and Scalable TCP, loss rates from 1e-10 to 0.1 on a logarithmic axis]
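
The HSTCP parametrization can be evaluated directly. The following sketch follows the constants published in RFC3649 (Low_Window = 38, High_Window = 83000, High_Decrease = 0.1, p(w) = 0.078 / w^1.2), from which the rounded values 3.64, 7.69 and 12.8 on the slide are derived; after a loss the window becomes (1 - b(w)) · w. The function names are hypothetical, and clamping to the standard TCP values below Low_Window is an assumption made for this example.

```python
import math

LOW_WINDOW = 38.0       # below this window HSTCP behaves like standard TCP
HIGH_WINDOW = 83000.0   # window at which the decrease factor reaches 0.1
HIGH_DECREASE = 0.1


def b(w):
    """Decrease factor: the window after a loss is (1 - b(w)) * w."""
    if w <= LOW_WINDOW:
        return 0.5
    scale = (math.log(w) - math.log(LOW_WINDOW)) / (
        math.log(HIGH_WINDOW) - math.log(LOW_WINDOW))
    return (HIGH_DECREASE - 0.5) * scale + 0.5


def a(w):
    """Additive increase in segments per RTT."""
    if w <= LOW_WINDOW:
        return 1.0
    p = 0.078 / w ** 1.2
    return w * w * p * 2.0 * b(w) / (2.0 - b(w))


def emulated_streams(w):
    """Number of parallel TCP connections emulated at window w."""
    return 0.23 * w ** 0.4


if __name__ == "__main__":
    for w in (38, 1000, 10000, 83000):
        print(w, round(a(w), 1), round(b(w), 2), round(emulated_streams(w), 1))
```

At the upper end of the range (w = 83000) the increase is roughly 70 segments per RTT and the decrease factor is 0.1, which corresponds to about 21 emulated parallel TCP streams.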

H-TCP I.
• created by researchers at the Hamilton Institute in Ireland
• a simple change to the cwnd increase function
• increases its aggressiveness (in particular, the rate of additive increase) as the time since the previous loss (backoff) increases
   • the increase rate a is a function of the elapsed time since the last backoff
• the AIMD mechanism is used
• preserves many of the key properties of standard TCP: fairness, responsiveness, relationship to buffering

H-TCP II.
• $\Delta$ ... time elapsed from the last congestion experienced
• $\Delta_L$ ... for $\Delta \le \Delta_L$ the TCP's growth is used
• $\Delta_B$ ... the bandwidth threshold, above which the TCP fall is used (for significant bandwidth changes the 0.5 fall is used)
• $T_{min}$, $T_{max}$ ... the minimal resp. maximal RTTs measured
• $B(k)$ ... maximum throughput measurement for the last interval without packet loss

H-TCP III.
• $cwnd = cwnd + \frac{2(1 - \beta)\, a(\Delta)}{cwnd}$ ... per ACK
• $cwnd = b(B) \cdot cwnd$ ... per loss
where $\beta$ is the current decrease factor $b(B)$ and
   $a(\Delta) = 1$ for $\Delta \le \Delta_L$; $a(\Delta) = \max\{a'(\Delta)\, T_{min};\ 1\}$ for $\Delta > \Delta_L$
   $b(B) = 0.5$ if $\left|\frac{B(k+1) - B(k)}{B(k)}\right| > \Delta_B$; $b(B) = \min\{\frac{T_{min}}{T_{max}};\ 0.8\}$ in the other case
   $a'(\Delta) = 1 + 10(\Delta - \Delta_L) + 0.5(\Delta - \Delta_L)^2$ ... quadratic increment function

BIC-TCP
• the default algorithm in Linux kernels 2.6.8 through 2.6.18 (CUBIC has been the default since 2.6.19)
• uses a binary-search algorithm for the cwnd update [3]
• 4 phases:
   (1) a reaction to a packet loss
   (2) additive increase
   (3) binary search
   (4) maximum probing
[Figure: cwnd over time with the additive increase, binary search, and maximum probing phases marked]
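
Returning to the H-TCP functions from the H-TCP III. slide, the increase and decrease rules can be written down as a few small functions. This is a sketch, not the reference implementation: the default values Δ_L = 1 s and Δ_B = 0.2 are assumptions made for this example, and all function and parameter names are hypothetical.

```python
def quadratic_increment(delta, delta_l=1.0):
    """a'(delta) = 1 + 10 (delta - delta_l) + 0.5 (delta - delta_l)^2."""
    d = delta - delta_l
    return 1.0 + 10.0 * d + 0.5 * d * d


def increase_rate(delta, t_min, delta_l=1.0):
    """a(delta): standard TCP growth shortly after a backoff, faster later."""
    if delta <= delta_l:
        return 1.0
    return max(quadratic_increment(delta, delta_l) * t_min, 1.0)


def decrease_factor(b_prev, b_new, t_min, t_max, delta_b=0.2):
    """b(B): 0.5 after a significant throughput change, adaptive otherwise."""
    if abs(b_new - b_prev) / b_prev > delta_b:
        return 0.5
    return min(t_min / t_max, 0.8)


def on_ack(cwnd, delta, beta, t_min, delta_l=1.0):
    """Per-ACK window update."""
    return cwnd + 2.0 * (1.0 - beta) * increase_rate(delta, t_min, delta_l) / cwnd


def on_loss(cwnd, beta):
    """Per-loss window update with decrease factor beta = b(B)."""
    return beta * cwnd


if __name__ == "__main__":
    for delta in (0.5, 2.0, 5.0, 10.0):
        print(delta, increase_rate(delta, t_min=0.1))
```

The increase rate stays at 1 during the first Δ_L seconds after a backoff and then grows quadratically with the elapsed time, which is what makes the protocol more aggressive on long loss-free paths.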

BIC-TCP
(1) Packet loss
• BIC-TCP starts from the TCP slow start
• when a loss is detected, it uses multiplicative decrease (as standard TCP) and sets the windows just before and after the loss event as:
   • previous window size → $W_{max}$ (the size of cwnd before the loss)
   • reduced window size → $W_{min}$ (the size of cwnd after the loss)
• ⇒ because the loss occurred when $cwnd \le W_{max}$, the point of equilibrium of cwnd will be searched in the range $(W_{min}; W_{max})$
(2) Additive increase
• starting the search from $cwnd = \frac{W_{min} + W_{max}}{2}$ might be too challenging for the network
• thus, when $\frac{W_{min} + W_{max}}{2} > W_{min} + S_{max}$, the additive increase takes place → $cwnd = W_{min} + S_{max}$
   • the window linearly increases by $S_{max}$ every RTT

BIC-TCP
(3) Binary search
• once the target ($cwnd = \frac{W_{min} + W_{max}}{2}$) is reached, $W_{min} = cwnd$
• otherwise (a packet loss happened) $W_{max} = cwnd$
• and the searching continues to the new target (using the additive increase, if necessary) until the change of cwnd is less than the $S_{min}$ constant
   • here, $cwnd = W_{max}$ is set
The points (2) and (3) lead to a linear (additive) increase, which turns into a logarithmic one (binary search).
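
Phases (2) and (3) can be sketched as a per-RTT loop. This is a simplified model under assumptions that go beyond the slides: no further loss occurs, the window is updated once per RTT rather than per ACK, and the function name `bic_trace` and the parameter values in the usage example are illustrative only.

```python
def bic_trace(w_min, w_max, s_max, s_min, rtts):
    """Loss-free window growth from w_min towards w_max.

    Each RTT the target is the midpoint of (w_min, w_max). The step is
    capped by s_max (additive increase); once it falls below s_min the
    search is finished and cwnd is set to w_max.
    """
    cwnd = w_min
    trace = [cwnd]
    for _ in range(rtts):
        if cwnd >= w_max:
            break
        w_min = cwnd                        # no loss: the target was reached
        step = (w_min + w_max) / 2.0 - cwnd
        if step > s_max:
            step = s_max                    # phase (2): additive increase
        if step < s_min:
            cwnd = w_max                    # binary search has converged
        else:
            cwnd += step                    # phase (3): jump to the midpoint
        trace.append(cwnd)
    return trace


if __name__ == "__main__":
    print(bic_trace(w_min=100, w_max=200, s_max=16, s_min=1, rtts=20))
```

In the example the window first grows by 16 segments per RTT (100, 116, ..., 180) and then approaches 200 in halving steps (190, 195, 197.5, 198.75) before it is set to W_max, which shows the linear growth turning into a logarithmic one.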

BIC-TCP
(4) Maximum probing
• inverse process to points (3) and (2)
• first, the inverse binary search takes place (until the cwnd growth is greater than $S_{max}$)
• once the cwnd growth is greater than $S_{max}$, the linear growth (by a reasonably large fixed increment) takes place
• first exponential growth, then linear growth
Assumed benefits:
• traditional TCP "friendliness"
   • during the "plateau" (3), the TCP flows are able to grow
   • AIMD behavior (even though faster) during (2) and (4) phases
• more stable window size ⇒ better network utilization
   • most of the time, the BIC-TCP should spend in the "plateau" (3)

CUBIC-TCP
• even though it is pretty scalable, fair, and stable, BIC's growth function is considered to be still too aggressive for TCP
   • especially under short RTTs or in low-speed networks
• CUBIC-TCP
   • a new release of BIC, which uses a cubic function
   • for the purpose of simplicity in protocol analysis, the number of phases was further reduced
   $W_{cubic} = C (T - K)^3 + W_{max}$
   where C is a scaling constant, T is the time elapsed since the last loss event, $W_{max}$ is the window size before the loss event, $K = \sqrt[3]{\frac{W_{max}\,\beta}{C}}$, and $\beta$ is a constant decrease factor
[Figure: CWND over time around W_max; steady-state behavior followed by probing]
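
The cubic growth function can be evaluated directly from the formula above. The sketch below assumes C = 0.4 and β = 0.2 as illustrative constants; the slides do not fix these values, and the function names are hypothetical.

```python
def cubic_k(w_max, beta=0.2, c=0.4):
    """Time K at which the window returns to w_max after a loss."""
    return (w_max * beta / c) ** (1.0 / 3.0)


def cubic_window(t, w_max, beta=0.2, c=0.4):
    """W_cubic = C (T - K)^3 + W_max, with t measured since the last loss."""
    k = cubic_k(w_max, beta, c)
    return c * (t - k) ** 3 + w_max


if __name__ == "__main__":
    w_max = 100.0
    k = cubic_k(w_max)
    for t in (0.0, k / 2, k, 1.5 * k, 2 * k):
        print(round(t, 2), round(cubic_window(t, w_max), 2))
```

Right after the loss (T = 0) the function gives (1 - β) · W_max, it flattens out around W_max at T = K (the steady-state plateau), and it accelerates again for T > K (probing for more bandwidth).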

TCP Extensions with IP Support

Quickstart (QS)/Limited Slowstart I.
• there is a strong assumption that the slow-start phase cannot be improved without an interaction with lower network layers
• a proposal: a 4-byte option in the IP header, which comprises the QS TTL and Initial Rate fields
• a sender that wants to use QS sets the QS TTL to an arbitrary (but high enough) value and the Initial Rate to the rate at which it wants to start sending, and sends the SYN packet

Quickstart (QS)/Limited Slowstart II.
• each router on the path that supports QS decreases the QS TTL by one and decreases the Initial Rate, if necessary
• the receiver sends the QS TTL and Initial Rate back to the sender in the SYN/ACK packet
• the sender knows whether all the routers on the path support QS (by comparing the QS TTL and the TTL)
• the sender sets the appropriate cwnd and starts using its congestion control mechanism (e.g., AIMD)
• Requires changes in the IP layer! :-(

E-TCP I.
• Explicit Congestion Notification (ECN)
   • a component of Active Queue Management (AQM)
   • a bit, which is set by routers when congestion of a link/buffer/queue is approaching
   • the ECN flag has to be mirrored by the receiver
   • TCP should react to the ECN bit being set in the same way as to a packet loss
   • requires the routers' administrators to configure the AQM/ECN :-(

E-TCP II.
• E-TCP
   • proposes to mirror the ECN bit just once (for the first time only)
   • freezes the cwnd when an ACK with the ECN bit set is received from the receiver
   • requires introducing small (synthetic) losses into the network in order to perform the multiplicative decrease needed for fairness
   • requires a change in the receivers' reaction to the ECN bit :-(

FAST
• Fast AQM Scalable TCP (FAST) [5]
• uses end-to-end delay, ECN and packet losses for congestion detection/avoidance
• if too few packets are queued in the routers (detected by RTT monitoring), the sending rate is increased
• differences from the TCP Vegas:
   • TCP Vegas makes fixed size adjustments to the rate, independent of how far the current rate is from the target rate
   • FAST TCP makes larger steps when the system is further from equilibrium and smaller steps near equilibrium
   • if the ECN is available in the network, FAST TCP can be extended to use ECN marking to replace/supplement queueing delay and packet loss as the congestion measure
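
The difference between the fixed Vegas step and the distance-dependent FAST step can be sketched as follows. The FAST update used here is the periodic window update published in the FAST TCP paper [5]; it is not given on the slides. The parameter values (alpha, beta, gamma) and the function names are illustrative assumptions, and both functions model one update per RTT without packet loss.

```python
def vegas_update(w, base_rtt, rtt, alpha=2.0, beta=4.0):
    """Vegas: move the window by at most one segment per RTT."""
    queued = w * (1.0 - base_rtt / rtt)   # estimate of own packets in queues
    if queued < alpha:
        return w + 1.0
    if queued > beta:
        return w - 1.0
    return w


def fast_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """FAST: step size proportional to the distance from equilibrium."""
    target = base_rtt / rtt * w + alpha   # window that keeps alpha packets queued
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)


if __name__ == "__main__":
    base_rtt = 0.100
    # Far from equilibrium (empty queues) and close to it (RTT inflated to 125 ms).
    print(fast_update(100.0, base_rtt, 0.100) - 100.0)
    print(fast_update(990.0, base_rtt, 0.125) - 990.0)
    print(vegas_update(100.0, base_rtt, 0.100) - 100.0)
```

With empty queues FAST doubles a window of 100 segments in one update, while near its equilibrium (about 1000 segments for these values) the step shrinks to roughly one segment; Vegas changes the window by one segment in both situations.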

Approaches Different from TCP

tsunami
• TCP connection for out-of-band control channel
   • connection parameters negotiation
   • requirements for retransmissions - uses NACKs instead of ACKs
   • connection termination negotiation
• UDP channel for data transmission
   • MIMD congestion control
   • highly configurable/customizable
      • MIMD parameters, losses threshold, maximum size of the queue for retransmissions, the interval of sending the retransmissions' requests, etc.

Reliable Blast UDP - RBUDP
• similar to tsunami - out-of-band TCP channel for control, UDP for data transmission
• proposed for disk-to-disk transmissions, resp. the transmissions where the complete transmitted data could be saved in the sender's memory
• sends data at a user-defined rate
   • app_perf (a clone of iperf) is used for an estimation of networks'/receivers' capacity

Reliable Blast UDP - RBUDP
[Figure 1. The Time Sequence Diagram of RBUDP, showing UDP data traffic and TCP signaling traffic. Source: E. He, J. Leigh, O. Yu, T. A. DeFanti, "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer," IEEE Cluster Computing 2002, Chicago, Illinois, Sept. 2002]
A  start of the transmission (using pre-defined rate)
B  end of the transmission
C  sending the DONE signal via the control channel; the receiver responds with a mask of the data that has arrived
D  re-sending of missing data
E-F-G  end of transmission
The steps C and D repeat until all the data are delivered.

eXplicit Control Protocol - XCP
• uses feedback from routers per packet
[Figure: the congestion header of each packet carries the Round Trip Time, Congestion Window, and Feedback fields; routers along the path update the Feedback field (e.g., Feedback = + 0.1 packet)]
• Congestion Window = Congestion Window + Feedback

Different approaches I.
• SCTP
   • multi-stream, multi-homed transport (end node might have several IP addresses)
   • message-oriented like UDP, ensures reliable, in-sequence transport of messages with congestion control like TCP
   • http://www.sctp.org/
• DCCP
   • non-reliable protocol (like UDP) with a congestion control compatible with TCP
   • http://www.ietf.org/html.charters/dccp-charter.html
   • http://www.icir.org/kohler/dcp/

Different approaches II.
• STP
   • based on CTS/RTS
   • a simple protocol designed for a simple implementation in HW
   • without any sophisticated congestion control mechanism
   • http://lwn.net/2001/features/OLS/pdf/pdf/stlinux.pdf
• Reliable UDP
   • ensures reliable and in-order delivery (up to the maximum number of retransmissions)
   • RFC908 and RFC1151
   • originally proposed for IP telephony
   • connection parameters can be set per-connection
   • http://www.javvin.com/protocolRUDP.html
• XTP (Xpress Transfer Protocol), ...

Conclusions I.
• Current state:
   • multi-stream TCP is intensively used (e.g., Grid applications)
   • looking for a way which will allow safe (i.e., backward compatible) development/deployment of post-TCP protocols
   • aggressive protocols are used on private/dedicated networks/circuits (e.g., λ-networks CzechLight/CESNET2, SURFnet, CA*net 4, ...)
   • implementation of SCTP under FreeBSD 7.0
   • implementation of DCCP under Linux

Conclusions II.
• interaction with L3 (IP)
• interaction with data link layer
   • variable delay and throughput in wireless networks
   • optical burst switching
• specific per-flow states in routers:
   • e.g., per-flow setting for packet loss generation (→ E-TCP)
   • may help short-term flows with high capacity demands (macro-bursts)
   • problem with scalability and cost :-(

Literature (cited in the sections on traditional TCP and Web100)
[1] Jacobson V. "Congestion Avoidance and Control", Proceedings of ACM SIGCOMM'88 (Stanford, CA, Aug. 1988), pp. 314-329. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z
[2] Allman M., Paxson V., Stevens W. "TCP Congestion Control", RFC2581, Apr. 1999. http://www.rfc-editor.org/rfc/rfc2581.txt
[3] Brakmo L., Peterson L. "TCP Vegas: End to End Congestion Avoidance on a Global Internet", IEEE Journal on Selected Areas in Communications, Vol. 13, No. 8, pp. 1465-1480, Oct. 1995. ftp://ftp.cs.arizona.edu/xkernel/Papers/jsac.ps
[4] http://www.web100.org
[5] Hacker T. J., Athey B. D., Sommerfield J. "Experiences Using Web100 for End-To-End Network Performance Tuning" http://www.web100.org/docs/ExperiencesUsingWeb100forHostTuning.pdf

Literature (cited in the sections on TCP extensions and approaches different from TCP)
[1] Kelly T. "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", PFLDnet2003, http://datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf, http://www-lce.eng.cam.ac.uk/~ctk21/scalable/
[2] Floyd S. "HighSpeed TCP for Large Congestion Windows", 2003, http://www.potaroo.net/ietf/all-ids/draft-floyd-tcp-highspeed-03.txt
[3] BIC-TCP, http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/
[4] Floyd S., Allman M., Jain A., Sarolahti P. "Quick-Start for TCP and IP", 2006, http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-quickstart-02.txt
[5] Jin C., Wei D., Low S. H., Buhrmaster G., Bunn J., Choe D. H., Cottrell R. L. A., Doyle J. C., Newman H., Paganini F., Ravot S., Singh S. "FAST - Fast AQM Scalable TCP." http://netlab.caltech.edu/FAST/ http://netlab.caltech.edu/pub/papers/FAST-infocom2004.pdf
[6] tsunami, http://www.anml.iu.edu/anmlresearch.html
[7] He E., Leigh J., Yu O., DeFanti T. A. "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer", IEEE Cluster Computing 2002, Chicago, Illinois, Sept. 2002.

Further materials
• Workshops PFLDnet 2003-2010
   • http://datatag.web.cern.ch/datatag/pfldnet2003/program.html
   • http://www-didc.lbl.gov/PFLDnet2004/
   • http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
   • http://www.hpcc.jp/pfldnet2006/
   • http://wil.cs.caltech.edu/pfldnet2007/
• prof. Sally Floyd's pages:
   • http://www.icir.org/floyd/papers.html
• RFC3426 - "General Architectural and Policy Considerations"
• http://www.hamilton.ie/net/eval/results_HI2005.pdf