4. Behind Traditional TCP: protocols for high-throughput and high-latency networks
PA191: Advanced Computer Networking
Eva Hladká
Slides by: Petr Holub, Tomáš Rebok
Faculty of Informatics, Masaryk University
Autumn 2016
Eva Hladká (FI MU), 4. Behind Traditional TCP, Autumn 2016

Lecture overview
1. Traditional TCP and its issues
2. Improving the traditional TCP
  • Multi-stream TCP
  • Web100
3. Conservative Extensions to TCP
  • GridDT
  • Scalable TCP, High-Speed TCP, H-TCP, BIC-TCP, CUBIC-TCP
4. TCP Extensions with IP Support
  • QuickStart, E-TCP, FAST
5. Approaches Different from TCP
  • tsunami
  • RBUDP
  • XCP
  • SCTP, DCCP, STP, Reliable UDP, XTP
6. Conclusions
7. Literature

Protocols for reliable data transmission
Protocols for reliable data transmission have to:
• ensure the reliability of the transfer
  • retransmission of lost packets
  • FEC might be usefully employed
• provide protection from congestion
  • of the network and of the receiver
Behavior evaluation:
• aggressiveness - ability to utilise available bandwidth
• responsiveness - ability to recover from a packet loss
• fairness - getting a fair share of network throughput when multiple streams/participants use the network
Problem statement
• network links with high capacity and high latency
  • iGrid 2005: San Diego - Brno, RTT = 205 ms
  • SC|05: Seattle - Brno, RTT = 174 ms
• traditional TCP is not suitable for such an environment:
  • 10 Gb/s, RTT = 100 ms, 1500 B MTU
    ⇒ sending/outstanding window of 83,333 packets
    ⇒ to sustain this window, at most one packet may be lost per 1 h 36 min
  • terribly slow
  • if errors are more frequent, the maximum throughput cannot be reached
• How can better network utilization be achieved?
• How can a reasonable co-existence with traditional TCP be ensured?
• How can a gradual deployment of a new protocol be ensured?

Traditional TCP
• flow control vs. congestion control
[Figure: sender, network, and receiver under three regimes: no control, flow control (rwnd), congestion control (cwnd).]

Traditional TCP I.
[Figure: Transmission rate adjustment. Flow control is for receivers (small-capacity receiver); congestion control is for the network (internal congestion, large-capacity receiver). From Computer Networks, A. Tanenbaum.]
• Congestion collapse was first observed in 1986 by V. Jacobson.
• Congestion control was added to TCP (TCP Tahoe) in 1988.

Traditional TCP I.
• Flow control
  • an explicit feedback from the receiver(s) using rwnd
  • deterministic
• Congestion control
  • an approximate sender's estimation of the available throughput (using cwnd)
• the final window used: ownd
  $ownd = \min\{rwnd, cwnd\}$
• The bandwidth bw can be computed as:
  $bw = \frac{8 \cdot MSS \cdot ownd}{RTT}$ (1)

Traditional TCP II. Flow control
[Figure: TCP headers of a sent and a received packet (Source Port, Dest. Port, Sequence Number, Acknowledgment, HL/Flags, Window, Checksum, Urgent Pointer, Options) and the sliding window with the regions: acknowledged, to be sent, outside window.]

Traditional TCP - Tahoe and Reno
Congestion control:
• traditionally based on the AIMD (Additive Increase, Multiplicative Decrease) approach
Tahoe [1]
• cwnd = cwnd + 1 ... per RTT without loss (above ssthresh)
• ssthresh = 0.5 cwnd, cwnd = 1 ... per every loss
Reno [2] adds
• fast retransmission
  • a TCP receiver sends an immediate duplicate ACK when an out-of-order segment arrives
  • all segments after the dropped one trigger duplicate ACKs
  • a loss is indicated by 3 duplicate ACKs (i.e., four successive identical ACKs without intervening packets)
  • once they are received, TCP performs a fast retransmission without waiting for the retransmission timer to expire
• fast recovery - the slow-start phase is not used any more
  • ssthresh = cwnd = 0.5 cwnd

Traditional TCP - Tahoe I.
[Figure: Tahoe state diagram.]
• Connection opening: cwnd = 1 segment
• Slow Start: exponential increase of cwnd until cwnd = SSTHRESH
• Congestion Avoidance: additive increase of cwnd, entered when cwnd = SSTHRESH
• Retransmission timeout (in either state): SSTHRESH := cwnd/2, cwnd := 1 segment
• Exponential increase of cwnd: for every useful acknowledgment received, cwnd := cwnd + (1 segment size)
• Additive increase of cwnd: for every useful acknowledgment received, cwnd := cwnd + (segment size) * (segment size) / cwnd; it takes a full window to increment the window size by one.

Traditional TCP - Tahoe II.
[Figure: cwnd and ssthresh over time, 0 to 50 RTT.]
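The Tahoe and Reno rules above can be traced round by round. The following is a minimal sketch, not the simulation used for the slides: it counts cwnd in whole segments, advances one RTT per step, treats every loss as one detected by duplicate ACKs in the Reno case, and assumes an initial ssthresh of 32 segments. The function name `simulate` and all parameter names are illustrative.

```python
def simulate(variant, loss_rounds, rounds=50, initial_ssthresh=32):
    """Trace (cwnd, ssthresh) per RTT for 'tahoe' or 'reno'; units are segments."""
    if variant not in ("tahoe", "reno"):
        raise ValueError("variant must be 'tahoe' or 'reno'")
    cwnd, ssthresh = 1, initial_ssthresh
    trace = []
    for rtt in range(rounds):
        trace.append((cwnd, ssthresh))
        if rtt in loss_rounds:
            # Multiplicative decrease: ssthresh = 0.5 * cwnd
            # (the floor of 2 segments is an assumption of this sketch).
            ssthresh = max(cwnd // 2, 2)
            # Tahoe restarts slow start; Reno (fast recovery) continues from ssthresh.
            cwnd = 1 if variant == "tahoe" else ssthresh
        elif cwnd < ssthresh:
            # Slow start: +1 segment per ACK doubles cwnd every RTT.
            cwnd = min(2 * cwnd, ssthresh)
        else:
            # Congestion avoidance: +1 segment per RTT.
            cwnd += 1
    return trace


if __name__ == "__main__":
    for name in ("tahoe", "reno"):
        windows = [cwnd for cwnd, _ in simulate(name, loss_rounds={10, 30})]
        print(name, windows[:15])
```

After a loss, the Tahoe trace drops to one segment and climbs through slow start again, while the Reno trace resumes additive increase from half the previous window.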
Traditional TCP - Reno I.
[Figure: Reno state diagram.]
• Connection opening: cwnd = 1 segment
• Slow Start: exponential increase of cwnd until cwnd = SSTHRESH
• Congestion Avoidance: additive increase of cwnd, entered when cwnd = SSTHRESH
• Fast Recovery: entered when 3 duplicate ACKs are received (from Slow Start or Congestion Avoidance); exponential increase beyond cwnd; when the expected ACK is received, cwnd := cwnd/2
• Retransmission timeout (in any state): SSTHRESH := cwnd/2, cwnd := 1 segment

Traditional TCP - Reno II.
[Figure: cwnd and ssthresh over time, 0 to 50 RTT.]

TCP Vegas
Vegas: a concept of congestion control [3]
• when a network is congested, the RTT becomes higher
• RTT is monitored during the transmission
• when an RTT increase is detected, the congestion window size is linearly reduced
• a possibility to measure the available network bandwidth using inter-packet spacing/dispersion

Traditional TCP
• a reaction to packet loss: retransmission
  • Tahoe: the whole current window ownd
  • Reno: a single segment in the Fast Retransmission mode
  • NewReno: more segments in the Fast Retransmission mode
  • Selective Acknowledgement (SACK): just the lost packets
• fundamental question: How can a sufficient cwnd size be achieved (under real conditions) in a network with high capacity and high RTT?
  ... without preventing the "common" users from using the network?

Traditional TCP - Response Function
• the Response Function represents the relation between bw and the steady-state packet loss rate p
  • $cwnd_{average} \approx \frac{1.2}{\sqrt{p}}$ (for MSS-sized segments)
  • using (1): $bw \approx \frac{8 \cdot MSS}{RTT} \cdot \frac{1.2}{\sqrt{p}}$
• the responsiveness of traditional TCP
  • assuming that the packet has been lost when $cwnd = bw \cdot RTT$:
    $\rho = \frac{bw \cdot RTT^2}{2 \cdot MSS}$
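The response function also explains the numbers in the problem statement (10 Gb/s, RTT = 100 ms, 1500 B packets). The sketch below is a worked calculation under stated assumptions: it takes the constant 1.2 from cwnd_average ≈ 1.2/√p, expresses bw in bytes per second inside the responsiveness formula so that it matches MSS in bytes, and uses an assumed MSS of 1460 B for the responsiveness example. All function names are illustrative.

```python
def window_packets(bw_bps, rtt_s, packet_bytes):
    """Window (in packets) needed to fill the pipe: bw * RTT / (8 * packet size)."""
    return bw_bps * rtt_s / (8 * packet_bytes)


def loss_rate_for_window(window, c=1.2):
    """Invert cwnd_average ~ c / sqrt(p) to get the tolerable loss rate p."""
    return (c / window) ** 2


def seconds_between_losses(bw_bps, rtt_s, packet_bytes, c=1.2):
    """Average time between losses that still sustains the full window."""
    p = loss_rate_for_window(window_packets(bw_bps, rtt_s, packet_bytes), c)
    packets_per_second = bw_bps / (8 * packet_bytes)
    return 1.0 / (p * packets_per_second)


def responsiveness_s(bw_bps, rtt_s, mss_bytes):
    """rho = bw * RTT^2 / (2 * MSS), with bw converted to bytes per second."""
    return (bw_bps / 8) * rtt_s ** 2 / (2 * mss_bytes)


if __name__ == "__main__":
    print("window [packets]:", window_packets(10e9, 0.1, 1500))
    print("time between losses [min]:", seconds_between_losses(10e9, 0.1, 1500) / 60)
    print("responsiveness [s]:", responsiveness_s(10e9, 0.2, 1460))
```

With these inputs the sketch gives a window of about 83,333 packets and roughly 96 minutes between losses, in line with the problem statement; the responsiveness at 10 Gbit/s and RTT = 200 ms comes out at about 17,000 s.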
Traditional TCP - Responsiveness
[Figure: TCP responsiveness as a function of RTT (0 to 200 ms) for C = 622 Mbit/s, C = 2.5 Gbit/s, and C = 10 Gbit/s; the vertical axis reaches 18000.]

Traditional TCP - Fairness I.
• fairness at the point of equilibrium
• fairness is considered for
  • streams with different RTT
  • streams with different MTU
• The speed of convergence to the point of equilibrium DOES matter!
[Figure: fairness convergence plot for cwnd += MSS, cwnd *= 0.5 (30 steps); both axes range from 0 to 100.]
[Figure: fairness convergence plot for cwnd += MSS, cwnd *= 0.83 (30 steps); both axes range from 0 to 100.]

Improving the traditional TCP

Multi-stream TCP
• assumes multiple TCP streams transferring a single data flow
• in fact, improves TCP's performance/behavior only in cases of isolated packet losses
  • a loss of more packets usually affects more TCP streams
• usually available because of its simple implementation
  • bbftp, GridFTP, Internet Backplane Protocol, ...
• drawbacks:
  • more complicated than traditional TCP (more threads are necessary)
  • the startup is accelerated only linearly
  • leads to synchronous overloading of queues and caches in the routers
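The fairness plots (cwnd += MSS with cwnd *= 0.5 or 0.83) and the aggressiveness of multi-stream TCP can both be explored with a small synchronized-loss model. This is a sketch under simplifying assumptions, not the simulation behind the slides: two flows share one bottleneck of capacity 100, both see a loss whenever their sum exceeds the capacity, and a flow made of N parallel streams is approximated by an additive increase of N. The name `aimd_trajectory` and its parameters are illustrative.

```python
def aimd_trajectory(x, y, capacity, beta, steps, inc_x=1.0, inc_y=1.0):
    """Return the list of (x, y) rates of two AIMD flows sharing one bottleneck."""
    points = [(x, y)]
    for _ in range(steps):
        if x + y > capacity:
            # Synchronized loss: both flows apply the multiplicative decrease.
            x, y = beta * x, beta * y
        else:
            # No loss: both flows apply their additive increase.
            x, y = x + inc_x, y + inc_y
        points.append((x, y))
    return points


if __name__ == "__main__":
    for beta in (0.5, 0.83):
        x, y = aimd_trajectory(10.0, 80.0, 100.0, beta, steps=30)[-1]
        print(f"beta={beta}: x={x:.2f}, y={y:.2f}, gap={abs(x - y):.2f}")
    # A flow that behaves like four parallel streams (assumption: inc_x=4).
    x, y = aimd_trajectory(10.0, 80.0, 100.0, 0.5, steps=2000, inc_x=4.0)[-1]
    print(f"four streams vs one: ratio={x / y:.3f}")
```

In this model the additive steps leave the gap between the flows unchanged and each decrease shrinks it by the factor beta, so a gentler decrease (0.83) converges more slowly. With unequal additive increases, the rates converge to the ratio of the increases, which is one way to see why a multi-stream flow takes a larger share.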
TCP implementation tuning I.
• cooperation with HW
  • Rx/Tx TCP Checksum Offloading
    • ordinarily available
• zero copy
  • accessing the network usually leads to several data copies between user-land, kernel, and network card
  • page flipping - user-land [...]

[...] Sunnyvale: RTT = 181 ms; avg. throughput over a period of 7000 s = 202 Mb/s
Second stream GVA<->CHI: RTT = 117 ms; avg. throughput over a period of 7000 s = 514 Mb/s
Link utilization: 71.6%
GridDT tuning in order to improve fairness between two TCP streams with different RTT:
First stream GVA<->Sunnyvale: RTT = 181 ms, additive increment A = 7; average throughput = 330 Mb/s
Second stream GVA<->CHI: RTT = 117 ms, additive increment B = 3; average throughput = 388 Mb/s
Link utilization: 71.8%
[Figure: Throughput of two streams with different RTT sharing a 1 Gbps bottleneck; x-axis: time (s), 0 to 6000; series: A = 7, RTT = 181 ms, and B = 3, RTT = 117 ms, each with its average over the life of the connection.]

Conservative Extensions to TCP

Scalable TCP
• proposed by Tom Kelly [1]
• congestion control is not AIMD any more:
  • cwnd = cwnd + 0.01 cwnd ... per RTT without packet loss
    cwnd = cwnd + 0.01 ... per ACK
  • cwnd = 0.875 cwnd ... per packet loss
  ⇒ Multiplicative Increase Multiplicative Decrease (MIMD)
• for smaller window sizes and/or higher loss rates in the network, Scalable TCP switches into AIMD mode

Scalable TCP
[Figure: two panels of window size over time (RTT).]
Figure: Packet loss recovery times for the traditional TCP (left) are proportional to cwnd and RTT.
A Scalable TCP connection (right) has packet loss recovery times that are proportional to the connection's RTT only. (Note: link capacity c < C)

Scalable TCP - fairness I.
• two concurrent Scalable TCP streams, Scalable control switched on when >30 Mb/s, twice the number of steps in comparison with the previous simulations
[Figure: fairness convergence plot; both axes range from 0 to 100.]

Scalable TCP - fairness II.
• Scalable TCP and traditional TCP streams, Scalable control switched on when >30 Mb/s, twice the number of steps

Scalable TCP - Response curve
[Figure: response curve; x-axis: loss rate, 0.0001 to 0.1.]

High-Speed TCP (HSTCP)
• Sally Floyd, RFC3649, [2]
• congestion control AIMD/MIMD:
  • cwnd = cwnd + a(cwnd) ... per RTT without loss
    cwnd = cwnd + a(cwnd)/cwnd ... per ACK
  • cwnd = (1 - b(cwnd)) cwnd ... per packet loss
• emulates the behavior of traditional TCP for small window sizes and/or higher packet loss rates in the network

High-Speed TCP (HSTCP)
• proposed MIMD parametrization (log is the natural logarithm):
  $b(cwnd) = -0.4 \cdot \frac{\log(cwnd) - 3.64}{7.69} + 0.5$
  $a(cwnd) = \frac{2 \, cwnd^2 \, b(cwnd)}{12.8 \, (2 - b(cwnd)) \, cwnd^{1.2}}$
[Figure: cwnd (up to 80000) over time, 0 to 2000 RTT.]
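The Scalable TCP and HSTCP update rules above can be evaluated numerically to see what they imply for large windows. This is a minimal sketch, not a reference implementation: the constants 3.64, 7.69, and 12.8 come from the parametrization above, while the low-window threshold of 38 segments (below which a = 1 and b = 0.5, i.e., traditional TCP) and the decrease rule cwnd = (1 - b(cwnd)) cwnd follow RFC 3649. The function names are illustrative.

```python
import math

LOW_WINDOW = 38  # segments; below this HSTCP behaves like traditional TCP (RFC 3649)


def hstcp_b(cwnd):
    """Decrease parameter b(cwnd): 0.5 for small windows, about 0.1 near 83000."""
    if cwnd <= LOW_WINDOW:
        return 0.5
    return -0.4 * (math.log(cwnd) - 3.64) / 7.69 + 0.5


def hstcp_a(cwnd):
    """Additive increase a(cwnd), in segments per RTT."""
    if cwnd <= LOW_WINDOW:
        return 1.0
    b = hstcp_b(cwnd)
    return 2 * cwnd ** 2 * b / (12.8 * (2 - b) * cwnd ** 1.2)


def hstcp_on_loss(cwnd):
    """Window after a loss: cwnd = (1 - b(cwnd)) * cwnd."""
    return (1 - hstcp_b(cwnd)) * cwnd


def scalable_recovery_rtts(increase=0.01, decrease=0.875):
    """RTTs Scalable TCP needs to regain its pre-loss window (independent of cwnd)."""
    return math.log(1 / decrease) / math.log(1 + increase)


if __name__ == "__main__":
    for w in (38, 1000, 83000):
        print(w, round(hstcp_a(w), 2), round(hstcp_b(w), 3))
    print("Scalable TCP recovery [RTT]:", round(scalable_recovery_rtts(), 1))
```

For comparison, traditional TCP needs about cwnd/2 RTTs to recover from a single loss, so its recovery time grows with the window, whereas the Scalable TCP figure stays constant.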
High-Speed TCP (HSTCP)
• a parametrization equivalent to Scalable TCP is possible: Linear HSTCP
• a comparison with Multi-stream TCP:
  $N(cwnd) \approx 0.23 \cdot cwnd^{0.4}$
  • N(cwnd) - the number of parallel TCP connections emulated by the HighSpeed TCP response function with congestion window cwnd
• Neither Scalable TCP nor HSTCP deals with the slow-start phase in a sophisticated way.

HSTCP - Response curve
[Figure: response curves of Regular TCP, HighSpeed TCP, and Scalable TCP; x-axis: loss rate P, 1e-10 to 0.1; y-axis: logarithmic, 10 to 100000.]

H-TCP I.
• created by researchers at the Hamilton Institute in Ireland
• a simple change to the cwnd increase function
• increases its aggressiveness (in particular, the rate of additive increase) as the time since the previous loss (backoff) increases
  • the increase rate a is a function of the elapsed time since the last backoff
• the AIMD mechanism is used
• preserves many of the key properties of standard TCP: fairness, responsiveness, relationship to buffering

H-TCP II.
• $\Delta$ ... time elapsed since the last congestion event
• $\Delta_L$ ... for $\Delta \le \Delta_L$ the TCP growth is used
• $\Delta_B$ ... the bandwidth threshold, above which the TCP fall is used (for significant bandwidth changes the 0.5 fall is used)
• $T_{min}$, $T_{max}$ ... the minimal and maximal RTTs measured
• $B(k)$ ... maximum throughput measurement for the last interval without packet loss

H-TCP III.
• $cwnd = cwnd + \frac{2(1 - b)\,a(\Delta)}{cwnd}$ ... per ACK
• $cwnd = b(B) \cdot cwnd$ ... per loss
  $a(\Delta) = \begin{cases} 1 & \Delta \le \Delta_L \\ \max\{a'(\Delta)\,T_{min},\ 1\} & \Delta > \Delta_L \end{cases}$
  $b(B) = \begin{cases} 0.5 & \left|\frac{B(k+1) - B(k)}{B(k)}\right| > \Delta_B \\ \min\{T_{min}/T_{max},\ 0.8\} & \text{in the other case} \end{cases}$
  $a'(\Delta) = 1 + 10(\Delta - \Delta_L) + 0.5(\Delta - \Delta_L)^2$ ... quadratic increment function

BIC-TCP
• the default algorithm in Linux kernels 2.6.8 through 2.6.18 (CUBIC became the default in 2.6.19)
• uses a binary-search algorithm for the cwnd update [3]
• 4 phases:
  (1) a reaction to a packet loss
  (2) additive increase
  (3) binary search
  (4) maximum probing

BIC-TCP
(1) Packet loss
• BIC-TCP starts from the TCP slow start
• when a loss is detected, it uses multiplicative decrease (as standard TCP) and records the windows just before and after the loss event:
  • previous window size → $W_{max}$ (the size of cwnd before the loss)
  • reduced window size → $W_{min}$ (the size of cwnd after the loss)
• ⇒ because the loss occurred when $cwnd \le W_{max}$, the point of equilibrium of cwnd will be searched for in the range $(W_{min}; W_{max})$
(2) Additive increase
• starting the search from $cwnd = \frac{W_{min} + W_{max}}{2}$ might be too challenging for the network
• thus, when $\frac{W_{min} + W_{max}}{2} > W_{min} + S_{max}$, the additive increase takes place → $cwnd = W_{min} + S_{max}$
• the window increases linearly by $S_{max}$ every RTT
BIC-TCP
(3) Binary search
• once the target ($cwnd = \frac{W_{min} + W_{max}}{2}$) is reached, $W_{min} = cwnd$ is set
• otherwise (a packet loss happened), $W_{max} = cwnd$ is set
• the search continues towards the new target (using the additive increase, if necessary) until the change of cwnd is less than the $S_{min}$ constant
  • at that point, $cwnd = W_{max}$ is set
• Phases (2) and (3) lead to a linear (additive) increase, which turns into a logarithmic one (binary search).

BIC-TCP
(4) Maximum probing
• the inverse process to phases (3) and (2)
• first, the inverse binary search takes place (until the cwnd growth is greater than $S_{max}$)
• once the cwnd growth is greater than $S_{max}$, the linear growth (by a reasonably large fixed increment) takes place
• first exponential growth, then linear growth
Assumed benefits:
• traditional TCP "friendliness"
  • during the "plateau" (3), the TCP flows are able to grow
  • AIMD behavior (even though faster) during phases (2) and (4)
• more stable window size ⇒ better network utilization
  • BIC-TCP should spend most of the time in the "plateau" (3)
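Phases (1) to (3) of BIC-TCP can be sketched as a loss-free climb from the reduced window back to W_max. This is a simplified illustration, not the Linux implementation: it advances one RTT per step, assumes no further loss occurs, omits maximum probing (phase 4), and uses hypothetical parameter values (decrease factor 0.5 "as standard TCP", S_max = 32, S_min = 1). The name `bic_growth` is illustrative.

```python
def bic_growth(w_max, decrease=0.5, s_max=32.0, s_min=1.0):
    """Loss-free cwnd trace from the multiplicative decrease back up to w_max."""
    if s_min <= 0:
        raise ValueError("s_min must be positive, otherwise the search never ends")
    w_min = decrease * w_max               # (1) window right after the loss
    trace = [w_min]
    while True:
        target = (w_min + w_max) / 2       # midpoint of the search range
        step = min(target - w_min, s_max)  # (2) cap the jump at s_max per RTT
        if step < s_min:                   # (3) search converged: settle at w_max
            trace.append(w_max)
            return trace
        w_min += step                      # no loss: the reached window becomes W_min
        trace.append(w_min)


if __name__ == "__main__":
    print(bic_growth(1000.0))
```

For W_max = 1000 the trace first rises linearly by S_max per RTT and then takes ever smaller binary-search steps as it approaches W_max, which is the "plateau" mentioned above.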
CUBIC-TCP
• even though it is quite scalable, fair, and stable, BIC's growth function is still considered too aggressive for TCP
  • especially under short RTTs or in low-speed networks
• CUBIC-TCP
  • a new release of BIC, which uses a cubic function
  • for the purpose of simplicity in protocol analysis, the number of phases was further reduced
  $W_{cubic} = C(T - K)^3 + W_{max}$
  where C is a scaling constant, T is the time elapsed since the last loss event, $W_{max}$ is the window size before the loss event, $K = \sqrt[3]{W_{max}\beta / C}$, and $\beta$ is a constant decrease factor
[Figure: CUBIC window growth function; axis label: CWND.]

TCP Extensions with IP Support

Quickstart (QS)/Limited Slowstart I.
• there is a strong assumption that the slow-start phase cannot be improved without interaction with lower network layers
• a proposal: a 4-byte option in the IP header, which comprises the QS TTL and Initial Rate fields
• a sender that wants to use QS sets the QS TTL to an arbitrary (but high enough) value and the Initial Rate to the rate at which it wants to start sending, and sends the SYN packet

Quickstart (QS)/Limited Slowstart II.
• each router on the path that supports QS decreases the QS TTL by one and decreases the Initial Rate, if necessary
• the receiver sends the QS TTL and Initial Rate back to the sender in the SYN/ACK packet
• the sender knows whether all the routers on the path support QS (by comparing the QS TTL and the TTL)
• the sender sets the appropriate cwnd and starts using its congestion control mechanism (e.g., AIMD)
• Requires changes in the IP layer! :-(

E-TCP I.
• Explicit Congestion Notification (ECN)
  • a component of Active Queue Management (AQM)
  • a bit that is set by routers when congestion of a link/buffer/queue is approaching
  • the ECN flag has to be mirrored by the receiver
  • TCP should react to the ECN bit being set in the same way as to a packet loss
  • requires the routers' administrators to configure AQM/ECN :-(

E-TCP II.
• E-TCP
  • proposes to mirror the ECN bit just once (for the first time only)
  • freezes the cwnd when an ACK with the ECN bit set is received from the receiver
  • requires introducing small (synthetic) losses into the network in order to perform multiplicative decrease for the sake of fairness
  • requires a change in the receivers' reaction to the ECN bit :-(
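Returning to Quick-Start: the way the sender detects a router that does not support QS can be sketched by tracking the difference between the IP TTL and the QS TTL along the path. This is a toy model loosely based on the TTL comparison described above, not an implementation of the IP option: routers are represented by hypothetical `(supports_qs, available_rate)` tuples, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class QuickStartRequest:
    ip_ttl: int   # regular IP TTL, decremented by every router
    qs_ttl: int   # QS TTL, decremented only by QS-aware routers
    rate: float   # requested initial rate (units are up to the caller)


def traverse(request, routers):
    """Pass the request through routers given as (supports_qs, available_rate)."""
    for supports_qs, available_rate in routers:
        request.ip_ttl -= 1
        if supports_qs:
            request.qs_ttl = (request.qs_ttl - 1) % 256
            # A QS-aware router may lower the requested rate, never raise it.
            request.rate = min(request.rate, available_rate)
    return request


def approved_rate(ip_ttl, qs_ttl, rate, routers):
    """Return the granted rate, or None if some router on the path ignored QS."""
    sent_diff = (ip_ttl - qs_ttl) % 256
    arrived = traverse(QuickStartRequest(ip_ttl, qs_ttl, rate), routers)
    arrived_diff = (arrived.ip_ttl - arrived.qs_ttl) % 256
    return arrived.rate if arrived_diff == sent_diff else None


if __name__ == "__main__":
    print(approved_rate(64, 200, 80.0, [(True, 100.0), (True, 50.0)]))
    print(approved_rate(64, 200, 80.0, [(True, 100.0), (False, 10.0)]))
```

If every router decrements both TTLs, their difference is unchanged and the (possibly reduced) rate can be used; a single router that decrements only the IP TTL changes the difference, and the sender falls back to its normal slow start.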
FAST
• Fast AQM Scalable TCP (FAST) [5]
• uses end-to-end delay, ECN, and packet losses for congestion detection/avoidance
• if too few packets are queued in the routers (detected by RTT monitoring), the sending rate is increased
• differences from TCP Vegas:
  • TCP Vegas makes fixed-size adjustments to the rate, independent of how far the current rate is from the target rate
  • FAST TCP makes larger steps when the system is further from equilibrium and smaller steps near equilibrium
• if ECN is available in the network, FAST TCP can be extended to use ECN marking to replace/supplement queueing delay and packet loss as the congestion measure

Approaches Different from TCP

tsunami
• TCP connection for an out-of-band control channel
  • connection parameters negotiation
  • requests for retransmissions - uses NACKs instead of ACKs
  • connection termination negotiation
• UDP channel for data transmission
  • MIMD congestion control
• highly configurable/customizable
  • MIMD parameters, loss threshold, maximum size of the retransmission queue, the interval of sending the retransmission requests, etc.
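The MIMD congestion control with a loss threshold mentioned for tsunami can be illustrated with a generic rate controller. This is a sketch of the general technique only, not tsunami's actual code: the function name, the threshold of 1% loss, and the speed-up and slow-down factors are all hypothetical values chosen for illustration.

```python
def mimd_rate(rate, loss_rate, threshold=0.01, speedup=1.2, slowdown=0.8,
              max_rate=None):
    """Return the next sending rate: multiply up below the loss threshold,
    multiply down above it, and never exceed max_rate if one is given."""
    factor = speedup if loss_rate <= threshold else slowdown
    new_rate = rate * factor
    if max_rate is not None:
        new_rate = min(new_rate, max_rate)
    return new_rate


if __name__ == "__main__":
    rate = 100.0  # e.g. Mb/s
    # Hypothetical per-interval loss measurements reported over the control channel.
    for loss in (0.0, 0.0, 0.05, 0.0, 0.02, 0.0):
        rate = mimd_rate(rate, loss, max_rate=1000.0)
        print(f"loss={loss:.2f} -> rate={rate:.1f}")
```

Because both the increase and the decrease are multiplicative, the rate adapts at a pace that does not depend on its current magnitude, which is what makes such schemes attractive on dedicated high-capacity links and aggressive on shared ones.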
Reliable Blast UDP - RBUDP
• similar to tsunami - out-of-band TCP channel for control, UDP for data transmission
• proposed for disk-to-disk transmissions, or for transmissions where the complete transmitted data can be kept in the sender's memory
• sends data at a user-defined rate
  • app_perf (a clone of iperf) is used for estimating the capacity of the network/receiver

Reliable Blast UDP - RBUDP
[Figure 1. The Time Sequence Diagram of RBUDP: sender and receiver, UDP data traffic, TCP signaling traffic.]
Source: E. He, J. Leigh, O. Yu, T. A. DeFanti, "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer," IEEE Cluster Computing 2002, Chicago, Illinois, Sept. 2002.
A start of the transmission (using the pre-defined rate)
B end of the transmission
C sending the DONE signal via the control channel; the receiver responds with a mask of the data that has arrived
D re-sending of the missing data
E-F-G end of the transmission
The steps C and D repeat until all the data are delivered.

eXplicit Control Protocol - XCP
• uses per-packet feedback from routers
[Figure: packets carry a congestion header with the fields Round Trip Time, Congestion Window, and Feedback; one router sets Feedback = +0.1 packet.]
[Figure: a later router changes the header to Feedback = -0.3 packet.]
• the sender then sets Congestion Window = Congestion Window + Feedback

Different approaches I.
• SCTP
  • multi-stream, multi-homed transport (an end node might have several IP addresses)
  • message-oriented like UDP; ensures reliable, in-sequence transport of messages with congestion control like TCP
  • http://www.sctp.org/
• DCCP
  • non-reliable protocol (like UDP) with a congestion control compatible with TCP
  • http://www.ietf.org/html.charters/dccp-charter.html
  • http://www.icir.org/kohler/dcp/

Different approaches II.
• STP
  • based on CTS/RTS
  • a simple protocol designed for a simple implementation in HW
  • without any sophisticated congestion control mechanism
  • http://lwn.net/2001/features/OLS/pdf/pdf/stlinux.pdf
• Reliable UDP
  • ensures reliable and in-order delivery (up to the maximum number of retransmissions)
  • RFC908 and RFC1151
  • originally proposed for IP telephony
  • connection parameters can be set per-connection
  • http://www.javvin.com/protocolRUDP.html
• XTP (Xpress Transfer Protocol), ...

Conclusions I.
• Current state:
  • multi-stream TCP is intensively used (e.g., Grid applications)
  • looking for a way which will allow safe (i.e., backward compatible) development/deployment of post-TCP protocols
  • aggressive protocols are used on private/dedicated networks/circuits (e.g., λ-networks CzechLight/CESNET2, SurfNet, CA*net 4, ...)
  • implementation of SCTP under FreeBSD 7.0
  • implementation of DCCP under Linux

Conclusions II.
• interaction with L3 (IP)
• interaction with the data link layer
  • variable delay and throughput in wireless networks
  • optical burst switching
• specific per-flow states in routers: e.g., per-flow setting for packet loss generation (→ E-TCP)
  • may help short-term flows with high capacity demands (macro-bursts)
  • problem with scalability and cost :-(

Literature
[1] Jacobson V. "Congestion Avoidance and Control", Proceedings of ACM SIGCOMM'88 (Stanford, CA, Aug. 1988), pp. 314-329. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z
[2] Allman M., Paxson V., Stevens W. "TCP Congestion Control", RFC2581, Apr. 1999. http://www.rfc-editor.org/rfc/rfc2581.txt
[3] Brakmo L., Peterson L. "TCP Vegas: End to End Congestion Avoidance on a Global Internet", IEEE Journal on Selected Areas in Communications, Vol. 13, No. 8, pp. 1465-1480, Oct. 1995. ftp://ftp.cs.arizona.edu/xkernel/Papers/jsac.ps
[4] http://www.web100.org
[5] Hacker T. J., Athey B. D., Sommerfield J. "Experiences Using Web100 for End-To-End Network Performance Tuning" http://www.web100.org/docs/ExperiencesUsingWeb100forHostTuning.pdf

Literature
[1] Kelly T. "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", PFLDnet 2003, http://datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf, http://www-lce.eng.cam.ac.uk/~ctk21/scalable/
[2] Floyd S. "HighSpeed TCP for Large Congestion Windows", 2003, http://www.potaroo.net/ietf/all-ids/draft-floyd-tcp-highspeed-03.txt
[3] BIC-TCP, http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/
[4] Floyd S., Allman M., Jain A., Sarolahti P. "Quick-Start for TCP and IP", 2006, http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-quickstart-02.txt
[5] Jin C., Wei D., Low S. H., Buhrmaster G., Bunn J., Choe D. H., Cottrell R. L. A., Doyle J. C., Newman H., Paganini F., Ravot S., Singh S. "FAST - Fast AQM Scalable TCP." http://netlab.caltech.edu/FAST/ http://netlab.caltech.edu/pub/papers/FAST-infocom2004.pdf
[6] tsunami, http://www.anml.iu.edu/anmlresearch.html
[7] He E., Leigh J., Yu O., DeFanti T. A. "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer", IEEE Cluster Computing 2002, Chicago, Illinois, Sept. 2002.

Literature
• Workshops PFLDnet 2003-2010
  • http://datatag.web.cern.ch/datatag/pfldnet2003/program.html
  • http://www-didc.lbl.gov/PFLDnet2004/
  • http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
  • http://www.hpcc.jp/pfldnet2006/
  • http://wil.cs.caltech.edu/pfldnet2007/
• prof. Sally Floyd's pages:
  • http://www.icir.org/floyd/papers.html
• RFC3426 - "General Architectural and Policy Considerations"
• http://www.hamilton.ie/net/eval/results_HI2005.pdf