Data Transfer Planning Jakub Stoklasa PV177 Laboratory of Advanced Network Technologies April 14, 2010 • CoUniverse • Problem Description • Entities and Notation • Pre-processing part • Constraint Model • Experimental Testing Proposal • Simplified Problem and its Evaluation April 14,2010 Jakub Stoklasa 2 • Framework for building and self-organization of ad-hoc collaborative environments developed by Miloš Liška and Petr Holub (FI MUNI) • Continuous adaptation on changing conditions based on built-in monitoring - re-planning from scratch on change • Support for media streams with bitrate comparable to capacity of network links (e.g. HD video) - sophisticated scheduler needed • Visualization of the environment for the users to make it understandable • Uses a constraint based scheduler implemented in Java using a CHOCO solver library • My work extends the original scheduler and adds some new features April 14,2010 Jakub Stoklasa 3 CoUniverse - GLIF2007 conference , ř^> 't:-Z ■*#!£ ■=1 —- \ /C~ \ \\ April 14, 2010 Jakub Stoklasa Problem Description Network Organization Media Applications Media Application Distributor Applications on Nodes Restrictions Stream Links Media Streams Planning Problem Network Topology Examples April 14, 2010 Jakub Stoklasa • Network represented as a graph G = (V,E) • Vertices -> Nodes • Edges -> I//ife • Sites - geographical or logical (virtual) collocation of nodes - used to specify source for applications consuming data • Subnetworks - separated parts of the network • Interfaces - used to connect nodes within particular subnetworks - they describe a physical network infrastructure • Nodes - configured with data processing applications - applications define capabilities of the node • Links - full-mesh network topology between interfaces (of two different nodes) belonging to the same subnetwork April 14,2010 Jakub Stoklasa 6 Network Organization II subnet(ij) = subnet(ik) = subnet(in) = netx subnet(im) = subnetQj) = net2 node2 - "gateway" between netx and net2 Nodes, interfaces and links example April 14, 2010 Jakub Stoklasa Network Organization III Links are comprehended as end-to-end links between node interfaces and thus they do not reflect the structure of real physical network topology Each Link in our model may be built using a number of physical network links, switches and routers Physical network infrastructure vs. Link April 14, 2010 Jakub Stoklasa • Running on nodes • Capable of producing and/or consuming data • Communicate using streams of particular types • Stream - abstract entity - defined by its Producer and its Stream Type • Stream Type- data (video) format (e.g. HDTV, HDV MPEG2) - bandwidth, quality • Media Application Producer I Consumer - capable of producing / consuming one or more stream types - e.g. UltraGrid, VideoLAN Client (VLC), Polycom device • Media Application Distributor - special application for data distribution - receives a stream and proliferates it to other applications April 14,2010 Jakub Stoklasa 9 Media Application Distributor Application used for data distribution Consumes exactly one stream and is able to proliferate this (possibly transcoded) stream to more than one consumer (or other distributor/s) Transcoding type distributor (e.g. Active Element) - stream type of the input stream can be transcoded to some other stream type (dependent on distributor's capability) - stream producer is always preserved! Reflector type distributor - no transcoding capabilities - only exact copies of the input stream can be distributed - used in previous version of constraint based scheduler April 14, 2010 Jakub Stoklasa 10 • There are some restrictions on applications running on nodes for the input network: • Each node has to run either just one d e D mid no other application •Or i producers and/or y consumers where (/ +j) > 0, / > 0,y > 0 • Two applications (consumers) processing a stream from the same source cannot run on the same node • For example, two instances of UltraGrid consumers use fixed port number for addressing, thus cannot be listening for incoming media stream on a single node at the same time April 14,2010 Jakub Stoklasa 11 Stream Links Abstraction of a fact that a stream is being transmitted over particular network link Basic entity to be scheduled in the proposed model Representation of stream link in proposed model: sl(l,p, t), where / is a network link, p is a producer and t is a stream type April 14, 2010 Jakub Stoklasa 12 Media Streams Planning Problem The goal is to find such a set of media stream distribution trees (a forest) that all consumers are covered by producer/s from requested sites while satisfying all other conditions (distributors transcoding capabilities, links/interfaces capacities etc.) We want to optimize latency and/or quality of the solution stream producer type Sl apPp A s2 apPp B April 14, 2010 Media streams distribution tree example Jakub Stoklasa 13 Network Topologies Examples Used for testing the simplified media streams planning problem (a) 1 :n topology with a single distributor having sufficient capacity (b) 1 :n topology with several distributors creating a distribution network (c) m:n full-mesh topology with a number of distributors (a) April 14, 2010 00 Jakub Stoklasa (c) 14 Entities and Notation I Nodes (V) Vve V: siteiy) - particular site the node belongs to interfacesiv) - a set of interfaces belonging to the node v Interfaces (7) Vz g h subnet(i) -just one subnet the interface belongs to capacityI(i) - capacity of the interface Links (E) - directed link e = (i, j) where i, j e I are the terminal interfaces - such e exists iff subnet{i) = subnet(j) April 14, 2010 Jakub Stoklasa 15 Entities and Notation II Links (cont.) \/e g E: beginl(e) - beginning interface of the link e endl{e) - ending interface of the link e begin(e) - beginning node of the link e end(e) - ending node of the link e Note: one interface can be shared by more links! capacity{e) - capacity of the link e - determined by its interfaces (i.e. mm(capacityl(i), capacitylij))) or further by the network monitoring latency{e) - latency of the link e - determined solely by the network monitoring April 14, 2010 Jakub Stoklasa 16 Entities and Notation III Stream Types (T) We T: bandwidth(t) - bandwidth needed to transfer a stream of type t qualityit) - quality of a stream of type t Producers (P) Vp g P: nodeip) - parent node of the producer/? stream_types(p) - a set of stream types the producer/? is able to produce April 14, 2010 Jakub Stoklasa 17 Entities and Notation IV Consumers (Q Vc g C: node(c) - parent node of the consumer c stream_types(c) - a set of stream types the consumer c is able to consume requested_site(c) - a site from which the consumer c wants to receive data and where appropriate producer is sought Distributors (D) VdeD: node(d) - parent node of the distributor d transcode _ pairs (d) = {(tin,tout) in' out T} April 14, 2010 Jakub Stoklasa 18 Entities and Notation V Additional notation • "Belongs to node " notation VxePuCuflu/we write x e v meaning that producer/consumer/distributor/interface x belongs to the node v Sites (57) consumer s{sí) = { c | c e C a requested _site(c) = si } Distributors transcode_in(d) = {tin \ (tm,tout) e transcode_pairs(d) a tout e ľ) transcode_out(d) = {tout \ (tin,tout) e transcode_pairs{d) a tin e ľ) transcode(d, tin) = {tout \ (tm,tout) e transcode _ pairs (d) a tout e ľ) April 14, 2010 Jakub Stoklasa 19 Entities and Notation VI » Producers possible types(p) - a set of all types that streams of producer/? can acquire in the given network configuration = stream typesip) + their possible transcoding Distributors & Producers indeg(d,p) = ^ i sl(l,p,t) (IgE)a impossible _types(p)r\ (node (d)=end (/)) transcode _in{d) outdegid, p) = i i sl(Up,t) (IgE)a impossible _types(p)r\ (node(d)=begin(l)) transcode _out{d) April 14, 2010 Jakub Stoklasa 20 Pre-processing part I Eliminate inactive consumers (i.e. those where requested_site(c) = null) Eliminate producers from non-requested sites Generate a possible types(p) set for each producer/? Replace multi-input-type distributors by a set oi virtual distributors (single-input type) Motivation example: d: transcode_pairs{d) = {(A, B), (A, C), (B, C)} -^©^ -^©^ - we are not able to distinguish whether the type C was transcoded from type A or type B April 14, 2010 Jakub Stoklasa 21 Pre-processing part II Solution: we replace the original distributor by a set of virtual distributors on the particular node d: transcode_pairs(d) = {(A, B), (A, C), (B, C)} dx\ transcode_pairs{dx) = {(A, B), (A, C)} d2: transcode_pairs(d2) = {(B, C)} number of virtual distributors = transcode_in(d) restriction on just one distributor on a node does not apply any more (it is restriction on the input network) only one of the virtual distributors can be active Network links elimination - helps to reduce number of network links and consequently the number of domain variables, thus making the problem smaller April 14, 2010 Jakub Stoklasa 22 Network Links Elimination Eliminate edges that cannot be used for data transfer in our problem We will obtain significantly smaller number of stream links Stream link sl(l, p, f) will not be created for eliminated link / We want to find a set of edges = £\(£CUicUipUU elim where Zcap = {/ e EI capacity {I) < min({ bandwidth(t) 11 e possible _types (p) Ape P})} Lc=\l eE\ p č begin(l) a d £ begin(l)} Lp ={/ e E\c <£ end{l) a d <£ end (I)} Zsite = {/ e EI site(begin(l)) = site(end(l))} • In the following text we still denote a set of links as E for sake of brevity but we treat it as i?elim April 14, 2010 Jakub Stoklasa 23 Constraint Model Domain Variables Capacity and Bandwidth Links to Node Links from Node Distributors Cycle Elimination Direct Links Optimization Constraint Satisfaction Problem Search Strategies April 14, 2010 Jakub Stoklasa • We want to place requests (streams) to resources (network) • Stream = producer + type • Stream Links X={ sl(l,p,t) | / e E,p e P,t e possible _types(p) } • Boolean domain (D= {0, 1}) sl{l, p,t) = 0 — stream from producer p of type t is not transmitted over link / sl(l,p, t) = 1 - stream from producer/? of type t is transmitted over link / April 14,2010 Jakub Stoklasa 25 Capacity and Bandwidth Capacity of each interface i must be sufficient to transfer all streams that are transmitted over links using the interface \/i e I: ^ ^ ^ sl{l,p,t) x bandwidth (t) < capacity! (i) leE: peP tepossible _types(p) (i=beginl(l)) A(i=endl(l)) (1) Capacity of each link / must be sufficient to transfer all streams that are transmitted over the link V/ g E: ^ ^ sl(l,p,t) x bandwidth(t) < capacity (I) peP t e possible _ types (p) (2) Each link / must have sufficient capacity to transmit the stream of type t (redundant constraint) V/ g E\fp g P\ft g possible_types(p) (bandwidth(t) > capacity(l)): sl(l,p,t) = 0 (3) April 14, 2010 Jakub Stoklasa 26 Links to Node Each consumer c receives data by just one link carrying a stream of type it is able to consume and which contains data produced by a producer/? from the requested site \/c^C: Y, Z E sl(l,p,t) = 1 (/e£) a (ceend(l)) (peP) a tepossible _types(p)c\ (site (node (p))= stream _ type s (c) requested _ site (c)) (4) If there is neither an appropriate consumer nor an appropriate distributor at the ending node of the link /, this link cannot be used for transmitting the particular stream V/ e E \fp e P\ft e possible_types(p) ((—\3c e C ((c e end(l)) a (site(node(p)) = requested_site(c)) a (t e stream_types(c)))) a (—i3d e D ((d e end(l)) a (t e transcode_in(d)))): sl(l,p,t) = 0 (5) April 14, 2010 Jakub Stoklasa 27 Links from Node I Each producer p sends data by at most one link out of all beginning at the node it is placed on \/peP: Y, £ sl(l,p,t) < 1 (6) (leE)A(pebegin(l)) t e stream _types{p) At least one producer from each requested site has to send data to the respective consumer/s (possibly distributed by distributors) \/si e SI 3c e C (requested _site(c) = si): III sl(l,P,t)>l (7) (leE)A(pebegin(l)) (peP)A(site(node(p))=si) testream _types(p) April 14, 2010 Jakub Stoklasa 28 Links from Node II If there is neither an appropriate producer nor an appropriate distributor at the beginning node of the link /, this link cannot be used for transmitting the particular stream V7 e E\/p e P\/t e possible _types{p) ((/? £ begin{l)) v (t č stream _types(p))) a (-i3d e D ((d e beginQ)) a (t e transcode_out(d)))): sl(sj) = 0 April 14, 2010 Jakub Stoklasa 29 Distributors I Only one out of all (virtual!) distributors sharing a common parent node can be active VveV3d\ď'eD(ďevAď'ev): £ ^ indeg(d,p)P) = ° then outdeg(d,p) = 0 if indeg(d,p) = 1 then outdeg(d,p) > 1 \/d eDVpeP: if outdeg(d,p) = 0 then indeg(d,p) = 0 • Constraint for the first part of rules for indeg(d, p) \fd e D \fp e P : indeg(d,p) x outdeg(d,p) — outdeg(d,p) (11) • Constraint for the second part of rules for outdeg{d, p) \/d e D \/p e P : indeg{d,p) + outdeg(d,p) ^ 1 (12) April 14,2010 Jakub Stoklasa 31 Cycle Elimination To avoid cycles among the nodes with distributors, the cycle elimination constraint has to be used for each possible producer n = number of nodes with distributors (i.e. number of distributors before generating the virtual distributors) UC1U1C gCllClČUlllg HIC V111UČU UlMllUUlUl^ VpePVleE\/k(2k) (/): Oyi =begin(l))A(vj2 =end(l)) A(lesl(l,p,t)) For each possible producer and for each k smaller or equal than n. this constraint ensures that cycles among k distributor nodes are prohibited April 14, 2010 Jakub Stoklasa 32 Direct Links If there is more than one consumer for a particular site, data should be sent using some distributor and not directly from possible producer to respective consumers (redundant constraint) \/si e 57 V/ e E\/p e P\/c e consumers (si) \/t e possible _ types(p) (( consumers(si) > 1) a (site(node(p)) = si) (14) a (p e begin(l)) a (c e end (I))): s/(/, pj) = 0 Problem: this constraint can eliminate some feasible solutions April 14, 2010 Jakub Stoklasa 33 Optimization Latency minimization minimize ^^ ^ si'(/, p,t) x latency'(/) (15) leE p&P impossible_types{p) Quality maximization maximize 2^,2^ 2j sl(l, p, t) x quality (t) (16) leE peP tepossible_types{p) April 14, 2010 Jakub Stoklasa 34 Constraint Satisfaction Problem I A Set Of domain variables —X = { sl(l,p,t) \leE,peP,te possible _types(p) } A domain of the variables — D= {0, 1} A set of essential constraints - C = {(1),(2),(4)-(13)} A set of all constraints (including the redundant ones) - C = {(1)-(14)} A set of constraints for minimization problem - Cmin / C = {C/C+u(15)} A set of constraints for maximization problem - Cmax / C = {C/C+u(16)} A set of constraints for optimization problem - Cmulti / C+multi = {C/C+ u (15), (16)} *+ min ' ^ min max' ^ max April 14, 2010 Jakub Stoklasa 35 Constraint Satisfaction Problem II Consequently, we can define these corresponding CSPs: P = (X,D,Q P+ = (X, D, C+) ^min — v*-? ^A ^min/ ' * min — v^-? ^A ^ min/ ^max — v*-? ^A ^max/ ' -* max — v^-? ^A ^ max/ Mnulti ~~ v*-? ^' ^ multi/ ' ^ multi ~~ V*-? ^? ^ multi/ Each solution of described problems defines a forest where one tree in this forest corresponds to the data distribution of a set of streams from one producer to consumers We can have more distribution trees for one requested site (more than one producer can be active) April 14, 2010 Jakub Stoklasa 36 • Value ordering boolean variables - increasing (default), decreasing • Variable ordering static: leftmost - simple linearization of sl(l,p, t) array over / first (outer loop) and then over p and its t (inner loop) rightmost - simple linearization of sl(l,p, t) array over/? and its t first (outer loop) and then over / (inner loop) DFS - depth first search traversal from each possible producer BFS - breadth first search traversal from each possible producer dynamic: degree - based on the maximum number of constraints related with each variable April 14,2010 Jakub Stoklasa 37 Experimental Testing Proposal Input instances configuration: • Several different topologies (1 :n, m:n, ...) • Consumers capable of receiving more than one stream type to be able to evaluate the maximization of the quality feature • More sophisticated link latency values if possible to better evaluate the minimization of the latency feature Experimental tests evaluation: • Usage of different value and variable orderings • Times needed to find a solution for different input instances and different types of CSP (optimization, all solutions, usage of the redundant constraints, ...) • Time to find only a first solution - appropriate for siginificantly large problems where finding optimal solution can take a long time April 14, 2010 Jakub Stoklasa 38 • Solved by the original scheduler implemented by Miloš Liška and Petr Holub • Precomputed matching of consumers and producers • Only one producer from requested site can be active • Selection of the producer is not unambiguous - there can be more suitable producers in the requested site, in such case the producer is chosen as a first match • X= {sl(l,p) | / g E, p g P} - significantly smaller problem • producer {c) -just one producer for the consumer c • consumer s(p) - a set of consumers of the producer/? • Only reflector type distributors • Only latency minimization as an objective function April 14,2010 Jakub Stoklasa 39 Evaluation of the Simplified P. I Parameters of different topologies: topology- \\SIj D II II VII ll^ll \\E\\ HR i- II II^ehm II M V || ll^elim 1 unass(ŕťeiim) 11(11)11 \\o\ || F° l:n-s-2 5 1 40 10 20 6 0 3 22 l:n-s-4 11 1 220 44 176 13 0 1 77 l:n-s-8 23 1 1,012 184 1,472 29 0 1 165 l:n-s-16 47 1 4,324 752 12,032 61 0 1 341 l:n-s-32 95 1 17,860 3,040 97,280 125 0 1 693 l:n-r-2 5 1 40 10 20 6 0 3 22 l:n-r-3 9 2 144 36 108 22 3 6 55 l:n-r-4 13 3 312 78 312 57 16 39 77 l:n-r-5 17 1 544 136 680 116 55 292 99 l:n-r-6 21 5 840 210 1,260 205 156 2505 121 l:n-r-7 25 6 1,200 300 2,100 300 399 24,306 143 m:n-2 6 2 60 18 36 14 2 17 22 m:n-3 12 3 264 60 180 45 12 6 99 m:n-A 20 1 760 140 560 112 44 24 176 m:n-5 30 5 1,740 270 1,350 225 130 120 275 m:n-6 42 6 3,444 462 2,772 396 342 720 396 April 14, 2010 Jakub Stoklasa 40 Evaluation of the Simplified P. II 1 :n-s topology [ms] ||57 D 2 1 x 16 32 first min 2.0±0.4 2.0±0.4 6.4±0.5 5.6±0.5 40.4±0.5 40.6±0.5 664±ß 652±6 15,600±200 15,400±300 l:n-r topology [ms] SID 2 3 4 5 6 1 first 2.0±0.4 4.0Í0.4 min 2.0Í0.4 4.8±0.4 m:n topology [ms] 9.6±0.5 13.6±0.5 20.6±0.5 39.2±0.5 40.0±0.4 240.8±0.4 81±2 3,050±70 \SID II 2 3 4 5 6 first min 2.0±CU 2.0Í0.4 6.0Í0.4 6.4Í0.5 18±1 20.2±0.4 44.2±0.4 65.0±0.4 107.4±0.S 313±3 April 14, 2010 Jakub Stoklasa 41 Evaluation of the Simplified P. Ill Computational results for different variable and value ordering heuristics for selected topologies l:n-s-32 dec [ms] inc [ms] l:n-r-5 dec [ms] inc [ms m:n~r-A dec [ms] inc [ms leftmost 15,500±200 14,700±200 62,350±50 57,000±100 219,300±300 198,300±600 stream 15,400±200 14;700±200 510±20 475±1 88.4Í0.5 S4.SÍ0.4 dfs 16,400±80 16,200±300 304.6±0.8 283.4Í0.8 51±0 48±0 degree 15,100±140 15,000±300 42.6Í0.8 38.8±0.4 21.6±0.5 20.6±0.5 All experimental tests results presented here have been taken from the Data Transfer Planning with Tree Placement for Collaborative Environments article written by Petr Holub, Miloš Liška and Hana Rudová. I thank for being able to use them for this presentation. April 14, 2010 Jakub Stoklasa 42