Data Transfer Planning
Jakub Stoklasa
PV177 Laboratory of Advanced Network Technologies
April 14, 2010
• CoUniverse
• Problem Description
• Entities and Notation
• Pre-processing part
• Constraint Model
• Experimental Testing Proposal
• Simplified Problem and its Evaluation
April 14,2010                                             Jakub Stoklasa                                                                2
• Framework for building and self-organization of ad-hoc collaborative environments developed by Miloš Liška and Petr Holub (FI MUNI)
• Continuous adaptation on changing conditions based on built-in monitoring - re-planning from scratch on change
• Support for media streams with bitrate comparable to capacity of network links (e.g. HD video) - sophisticated scheduler needed
• Visualization of the environment for the users to make it understandable
• Uses a constraint based scheduler implemented in Java using a CHOCO solver library
• My work extends the original scheduler and adds some new features
April 14,2010                                             Jakub Stoklasa                                                                3
CoUniverse - GLIF2007 conference
, ř^> 't:-Z ■*#!£
■=1  —- \   /C~ \
\\
April 14, 2010
Jakub Stoklasa
Problem Description
Network Organization
Media Applications
Media Application Distributor
Applications on Nodes Restrictions
Stream Links
Media Streams Planning Problem
Network Topology Examples
April 14, 2010
Jakub Stoklasa
• Network represented as a graph G = (V,E)
• Vertices -> Nodes
• Edges -> I//ife
• Sites - geographical or logical (virtual) collocation of nodes
- used to specify source for applications consuming data
• Subnetworks - separated parts of the network
• Interfaces - used to connect nodes within particular subnetworks
- they describe a physical network infrastructure
• Nodes - configured with data processing applications
- applications define capabilities of the node
• Links - full-mesh network topology between interfaces (of two
different nodes) belonging to the same subnetwork
April 14,2010                                             Jakub Stoklasa                                                                6
Network Organization II
subnet(ij) = subnet(ik) = subnet(in) = netx
subnet(im) = subnetQj) = net2
node2 - "gateway" between netx and net2
Nodes, interfaces and links example
April 14, 2010
Jakub Stoklasa
Network Organization III
Links are comprehended as end-to-end links between node interfaces and thus they do not reflect the structure of real physical network topology
Each Link in our model may be built using a number of physical network links, switches and routers
Physical network infrastructure vs. Link
April 14, 2010
Jakub Stoklasa
• Running on nodes
• Capable of producing and/or consuming data
• Communicate using streams of particular types
• Stream - abstract entity
- defined by its Producer and its Stream Type
• Stream Type- data (video) format (e.g. HDTV, HDV MPEG2)
- bandwidth, quality
• Media Application Producer I Consumer
- capable of producing / consuming one or more stream types
- e.g. UltraGrid, VideoLAN Client (VLC), Polycom device
• Media Application Distributor
- special application for data distribution
- receives a stream and proliferates it to other applications
April 14,2010                                             Jakub Stoklasa                                                                9
Media Application Distributor
Application used for data distribution Consumes exactly one stream and is able to proliferate this (possibly transcoded) stream to more than one consumer (or other distributor/s)
Transcoding type distributor (e.g. Active Element)
- stream type of the input stream can be transcoded to some other stream type (dependent on distributor's capability)
- stream producer is always preserved! Reflector type distributor
- no transcoding capabilities
- only exact copies of the input stream can be distributed
- used in previous version of constraint based scheduler
April 14, 2010
Jakub Stoklasa
10
• There are some restrictions on applications running on nodes for the input network:
• Each node has to run either
just one d e D mid no other application •Or
i producers and/or y consumers where (/ +j) > 0, / > 0,y > 0
• Two applications (consumers) processing a stream from the same source cannot run on the same node
• For example, two instances of UltraGrid consumers use fixed port number for addressing, thus cannot be listening for incoming media stream on a single node at the same time
April 14,2010                                             Jakub Stoklasa                                                               11
Stream Links
Abstraction of a fact that a stream is being transmitted over particular network link
Basic entity to be scheduled in the proposed model Representation of stream link in proposed model: sl(l,p, t), where / is a network link, p is a producer and t is a stream type
April 14, 2010
Jakub Stoklasa
12
Media Streams Planning Problem
The goal is to find such a set of media stream distribution trees (a forest) that all consumers are covered by producer/s from requested sites while satisfying all other conditions (distributors transcoding capabilities, links/interfaces capacities etc.) We want to optimize latency and/or quality of the solution
stream	producer	type
Sl	apPp	A
s2	apPp	B
April 14, 2010
Media streams distribution tree example
Jakub Stoklasa
13
Network Topologies Examples
Used for testing the simplified media streams planning problem
(a)  1 :n topology with a single distributor having sufficient capacity
(b)  1 :n topology with several distributors creating a distribution network
(c) m:n full-mesh topology with a number of distributors
(a) April 14, 2010
00
Jakub Stoklasa
(c)
14
Entities and Notation I
Nodes (V)
Vve V:
siteiy) - particular site the node belongs to
interfacesiv) - a set of interfaces belonging to the node v
Interfaces (7)
Vz g h
subnet(i) -just one subnet the interface belongs to
capacityI(i) - capacity of the interface
Links (E)
- directed link e = (i, j) where i, j e I are the terminal interfaces
- such e exists iff subnet{i) = subnet(j)
April 14, 2010
Jakub Stoklasa
15
Entities and Notation II
Links (cont.)
\/e g E:
beginl(e) - beginning interface of the link e
endl{e) - ending interface of the link e
begin(e) - beginning node of the link e
end(e) - ending node of the link e
Note: one interface can be shared by more links!
capacity{e) - capacity of the link e
- determined by its interfaces (i.e. mm(capacityl(i), capacitylij))) or further by the network monitoring
latency{e) - latency of the link e
- determined solely by the network monitoring
April 14, 2010
Jakub Stoklasa
16
Entities and Notation III
Stream Types (T)
We T:
bandwidth(t) - bandwidth needed to transfer a stream of type t
qualityit) - quality of a stream of type t
Producers (P) Vp g P:
nodeip) - parent node of the producer/? stream_types(p) - a set of stream types the producer/? is able
to produce
April 14, 2010
Jakub Stoklasa
17
Entities and Notation IV
Consumers (Q
Vc g C:
node(c) - parent node of the consumer c
stream_types(c) - a set of stream types the consumer c is able
to consume requested_site(c) - a site from which the consumer c wants to
receive data and where appropriate producer
is sought Distributors (D)
VdeD:
node(d) - parent node of the distributor d
transcode _ pairs (d) = {(tin,tout)
in'   out
T}
April 14, 2010
Jakub Stoklasa
18
Entities and Notation V
Additional notation
• "Belongs to node " notation
VxePuCuflu/we write x e v meaning that producer/consumer/distributor/interface x belongs to the node v
Sites (57)
consumer s{sí) = { c | c e C a requested _site(c) = si }
Distributors
transcode_in(d) = {tin \ (tm,tout) e transcode_pairs(d) a tout e ľ) transcode_out(d) = {tout \ (tin,tout) e transcode_pairs{d) a tin e ľ) transcode(d, tin) = {tout \ (tm,tout) e transcode _ pairs (d) a tout e ľ)
April 14, 2010
Jakub Stoklasa
19
Entities and Notation VI
» Producers possible types(p)
- a set of all types that streams of producer/? can
acquire in the given network configuration = stream typesip) + their possible transcoding
Distributors & Producers indeg(d,p)   =        ^
i
sl(l,p,t)
(IgE)a             impossible _types(p)r\
(node (d)=end (/))         transcode _in{d)
outdegid, p) =
i
i
sl(Up,t)
(IgE)a             impossible _types(p)r\
(node(d)=begin(l))       transcode _out{d)
April 14, 2010
Jakub Stoklasa
20
Pre-processing part I
Eliminate inactive consumers (i.e. those where requested_site(c) = null) Eliminate producers from non-requested sites Generate a possible types(p) set for each producer/?
Replace multi-input-type distributors by a set oi virtual distributors (single-input type) Motivation example:
d: transcode_pairs{d) = {(A, B), (A, C), (B, C)}
-^©^
-^©^
- we are not able to distinguish whether the type C was transcoded from type A or type B
April 14, 2010
Jakub Stoklasa
21
Pre-processing part II
Solution: we replace the original distributor by a set of virtual distributors on the particular node
d: transcode_pairs(d) = {(A, B), (A, C), (B, C)} dx\ transcode_pairs{dx) = {(A, B), (A, C)} d2: transcode_pairs(d2) = {(B, C)}
number of virtual distributors = transcode_in(d) restriction on just one distributor on a node does not apply any more (it is restriction on the input network) only one of the virtual distributors can be active
Network links elimination
- helps to reduce number of network links and consequently the
number of domain variables, thus making the problem smaller
April 14, 2010
Jakub Stoklasa
22
Network Links Elimination
Eliminate edges that cannot be used for data transfer in our problem We will obtain significantly smaller number of stream links Stream link sl(l, p, f) will not be created for eliminated link / We want to find a set of edges
= £\(£CUicUipUU
elim
where
Zcap = {/ e EI capacity {I) < min({ bandwidth(t) 11 e possible _types (p) Ape P})}
Lc=\l eE\ p č begin(l) a d £ begin(l)} Lp ={/ e E\c <£ end{l) a d <£ end (I)} Zsite = {/ e EI site(begin(l)) = site(end(l))}
• In the following text we still denote a set of links as E for sake of brevity but we treat it as i?elim
April 14, 2010
Jakub Stoklasa
23
Constraint Model
Domain Variables
Capacity and Bandwidth
Links to Node
Links from Node
Distributors
Cycle Elimination
Direct Links
Optimization
Constraint Satisfaction Problem
Search Strategies
April 14, 2010
Jakub Stoklasa
• We want to place requests (streams) to resources (network)
• Stream = producer + type
• Stream Links
X={ sl(l,p,t) | / e E,p e P,t e possible _types(p) }
• Boolean domain (D= {0, 1})
sl{l, p,t) = 0 — stream from producer p of type t is not transmitted
over link / sl(l,p, t) = 1 - stream from producer/? of type t is transmitted
over link /
April 14,2010                                             Jakub Stoklasa                                                              25
Capacity and Bandwidth
Capacity of each interface i must be sufficient to transfer all streams that are transmitted over links using the interface
\/i e I:     ^      ^          ^    sl{l,p,t) x bandwidth (t) < capacity! (i)
leE:         peP  tepossible _types(p)
(i=beginl(l)) A(i=endl(l))
(1)
Capacity of each link / must be sufficient to transfer all streams that are transmitted over the link
V/ g E:    ^           ^       sl(l,p,t) x bandwidth(t) < capacity (I)
peP    t e possible _ types (p)
(2)
Each link / must have sufficient capacity to transmit the stream of type t (redundant constraint)
V/ g E\fp g P\ft g possible_types(p) (bandwidth(t) > capacity(l)):    sl(l,p,t) = 0
(3)
April 14, 2010
Jakub Stoklasa
26
Links to Node
Each consumer c receives data by just one link carrying a stream of type it is able to consume and which contains data produced by a producer/? from the requested site
\/c^C:    Y,           Z            E       sl(l,p,t)    =   1
(/e£)
a (ceend(l))
(peP) a         tepossible _types(p)c\
(site (node (p))=            stream _ type s (c)
requested _ site (c))
(4)
If there is neither an appropriate consumer nor an appropriate distributor at the ending node of the link /, this link cannot be used for transmitting the particular stream
V/ e E \fp e P\ft e possible_types(p) ((—\3c e C ((c e end(l)) a (site(node(p)) =
requested_site(c)) a (t e stream_types(c)))) a (—i3d e D ((d e end(l)) a (t e transcode_in(d)))): sl(l,p,t) = 0
(5)
April 14, 2010
Jakub Stoklasa
27
Links from Node I
Each producer p sends data by at most one link out of all beginning at the node it is placed on
\/peP:        Y,
£        sl(l,p,t) < 1
(6)
(leE)A(pebegin(l))  t e stream _types{p)
At least one producer from each requested site has to send data to the respective consumer/s (possibly distributed by distributors)
\/si e SI 3c e C (requested _site(c) = si):
III       sl(l,P,t)>l    (7)
(leE)A(pebegin(l))       (peP)A(site(node(p))=si)       testream _types(p)
April 14, 2010
Jakub Stoklasa
28
Links from Node II
If there is neither an appropriate producer nor an appropriate distributor at the beginning node of the link /, this link cannot be used for transmitting the particular stream
V7 e E\/p e P\/t e possible _types{p) ((/? £ begin{l)) v (t č stream _types(p))) a (-i3d e D ((d e beginQ)) a (t e transcode_out(d)))):    sl(sj) = 0
April 14, 2010
Jakub Stoklasa
29
Distributors I
Only one out of all (virtual!) distributors sharing a common parent node can be active
VveV3d\ď'eD(ďevAď'ev):     £      ^ indeg(d,p)<l    (9)
- applied only if number of distributors on a node is more than one
- in case of one distributor (original) this constraint is not used as it would match the following constraint
Each distributor d can be used for distribution of at most one input stream (i.e. it receives the data by one link at most)
\/d e D : ^ indeg(d,p) < 1
peP
(10)
April 14, 2010
Jakub Stoklasa
30
• Data distribution constraints (distributors have to satisfy the following rules)
VdeDVpeP:     if indeg(d>P) = ° then outdeg(d,p) = 0
if indeg(d,p) = 1 then outdeg(d,p) > 1
\/d eDVpeP:    if outdeg(d,p) = 0 then indeg(d,p) = 0
• Constraint for the first part of rules for indeg(d, p)
\fd e D \fp e P : indeg(d,p) x outdeg(d,p) — outdeg(d,p)     (11)
• Constraint for the second part of rules for outdeg{d, p)
\/d e D \/p e P : indeg{d,p) + outdeg(d,p) ^ 1                     (12)
April 14,2010                                             Jakub Stoklasa                                                               31
Cycle Elimination
To avoid cycles among the nodes with distributors, the cycle elimination constraint has to be used for each possible producer n = number of nodes with distributors (i.e. number of distributors before generating the virtual distributors)
UC1U1C gCllClČUlllg  HIC  V111UČU UlMllUUlUl^
VpePVleE\/k(2<k<n) V/ (1 < / < Q)): 2                  2   sl(l,p,t)<(k-
(13)
1e possible _types(p)              jl,j2eC(n>k) (/):
Oyi =begin(l))A(vj2 =end(l)) A(lesl(l,p,t))
For each possible producer and for each k smaller or equal than n. this constraint ensures that cycles among k distributor nodes are prohibited
April 14, 2010
Jakub Stoklasa
32
Direct Links
If there is more than one consumer for a particular site, data should be sent using some distributor and not directly from possible producer to respective consumers (redundant constraint)
\/si e 57 V/ e E\/p e P\/c e consumers (si) \/t e possible _ types(p) (( consumers(si) > 1) a (site(node(p)) = si)                             (14)
a (p e begin(l)) a (c e end (I))):    s/(/, pj) = 0
Problem: this constraint can eliminate some feasible solutions
April 14, 2010
Jakub Stoklasa
33
Optimization
Latency minimization
minimize ^^          ^         si'(/, p,t) x latency'(/)         (15)
leE p&P     impossible_types{p)
Quality maximization
maximize 2^,2^          2j         sl(l, p, t) x quality (t)         (16)
leE peP     tepossible_types{p)
April 14, 2010
Jakub Stoklasa
34
Constraint Satisfaction Problem I
A Set Of domain variables —X = { sl(l,p,t) \leE,peP,te possible _types(p) }
A domain of the variables — D= {0, 1}
A set of essential constraints - C
= {(1),(2),(4)-(13)}
A set of all constraints (including the redundant ones) - C
= {(1)-(14)}
A set of constraints for minimization problem - Cmin / C = {C/C+u(15)}
A set of constraints for maximization problem - Cmax / C = {C/C+u(16)}
A set of constraints for optimization problem - Cmulti / C+multi = {C/C+ u (15), (16)}
*+
min '   ^   min
max'   ^   max
April 14, 2010
Jakub Stoklasa
35
Constraint Satisfaction Problem II
Consequently, we can define these corresponding CSPs:
P = (X,D,Q P+ = (X, D, C+)
^min — v*-? ^A ^min/ ' *    min — v^-? ^A ^  min/ ^max — v*-? ^A ^max/ ' -*    max — v^-? ^A ^  max/ Mnulti ~~ v*-? ^' ^ multi/ ' ^  multi ~~ V*-? ^? ^  multi/
Each solution of described problems defines a forest where one tree in this forest corresponds to the data distribution of a set of streams from one producer to consumers
We can have more distribution trees for one requested site (more than one producer can be active)
April 14, 2010
Jakub Stoklasa
36
• Value ordering boolean variables - increasing (default), decreasing
• Variable ordering static: leftmost - simple linearization of sl(l,p, t) array over / first (outer loop) and then over p and its t (inner loop) rightmost - simple linearization of sl(l,p, t) array over/? and
its t first (outer loop) and then over / (inner loop) DFS - depth first search traversal from each possible producer BFS - breadth first search traversal from each possible producer dynamic: degree - based on the maximum number of constraints related with each variable
April 14,2010                                             Jakub Stoklasa                                                               37
Experimental Testing Proposal
Input instances configuration:
• Several different topologies (1 :n, m:n, ...)
• Consumers capable of receiving more than one stream type to be able to evaluate the maximization of the quality feature
• More sophisticated link latency values if possible to better evaluate the minimization of the latency feature
Experimental tests evaluation:
• Usage of different value and variable orderings
• Times needed to find a solution for different input instances and different types of CSP (optimization, all solutions, usage of the redundant constraints, ...)
• Time to find only a first solution - appropriate for siginificantly large problems where finding optimal solution can take a long time
April 14, 2010
Jakub Stoklasa
38
•  Solved by the original scheduler implemented by Miloš Liška and Petr Holub
• Precomputed matching of consumers and producers
• Only one producer from requested site can be active
•  Selection of the producer is not unambiguous - there can be more suitable producers in the requested site, in such case the producer is chosen as a first match
• X= {sl(l,p) | / g E, p g P} - significantly smaller problem
• producer {c) -just one producer for the consumer c
• consumer s(p) - a set of consumers of the producer/?
• Only reflector type distributors
• Only latency minimization as an objective function
April 14,2010                                             Jakub Stoklasa                                                               39
Evaluation of the Simplified P. I
Parameters of different topologies:
									
topology- \\SIj	D II     II VII	ll^ll	\\E\\	HR i-    II II^ehm II	M V           || ll^elim 1	unass(ŕťeiim)	11(11)11	\\o\   ||	F°
l:n-s-2	5	1	40	10	20	6	0	3	22
l:n-s-4	11	1	220	44	176	13	0	1	77
l:n-s-8	23	1	1,012	184	1,472	29	0	1	165
l:n-s-16	47	1	4,324	752	12,032	61	0	1	341
l:n-s-32	95	1	17,860	3,040	97,280	125	0	1	693
l:n-r-2	5	1	40	10	20	6	0	3	22
l:n-r-3	9	2	144	36	108	22	3	6	55
l:n-r-4	13	3	312	78	312	57	16	39	77
l:n-r-5	17	1	544	136	680	116	55	292	99
l:n-r-6	21	5	840	210	1,260	205	156	2505	121
l:n-r-7	25	6	1,200	300	2,100	300	399	24,306	143
m:n-2	6	2	60	18	36	14	2	17	22
m:n-3	12	3	264	60	180	45	12	6	99
m:n-A	20	1	760	140	560	112	44	24	176
m:n-5	30	5	1,740	270	1,350	225	130	120	275
m:n-6	42	6	3,444	462	2,772	396	342	720	396
April 14, 2010
Jakub Stoklasa
40
Evaluation of the Simplified P. II
1 :n-s topology [ms]
||57
D
2
1
x
16
32
first min
2.0±0.4 2.0±0.4
6.4±0.5 5.6±0.5
40.4±0.5 40.6±0.5
664±ß 652±6
15,600±200 15,400±300
l:n-r topology [ms]					
SID                     2                   3		4	5	6	1
first                 2.0±0.4       4.0Í0.4 min                 2.0Í0.4       4.8±0.4 m:n topology [ms]		9.6±0.5 13.6±0.5	20.6±0.5 39.2±0.5	40.0±0.4 240.8±0.4	81±2 3,050±70
\SID II                  2	3		4	5	6
first min
2.0±CU 2.0Í0.4
6.0Í0.4 6.4Í0.5
18±1 20.2±0.4
44.2±0.4 65.0±0.4
107.4±0.S 313±3
April 14, 2010
Jakub Stoklasa
41
Evaluation of the Simplified P. Ill
Computational results for different variable and value ordering heuristics for selected topologies
l:n-s-32
dec [ms]         inc [ms]
l:n-r-5
dec [ms]         inc [ms
m:n~r-A
dec [ms]            inc [ms
leftmost   15,500±200   14,700±200   62,350±50   57,000±100   219,300±300   198,300±600
stream      15,400±200   14;700±200      510±20          475±1           88.4Í0.5          S4.SÍ0.4
dfs             16,400±80    16,200±300    304.6±0.8     283.4Í0.8           51±0                48±0
degree       15,100±140   15,000±300    42.6Í0.8       38.8±0.4         21.6±0.5          20.6±0.5
All experimental tests results presented here have been taken from the Data Transfer Planning with Tree Placement for Collaborative Environments article written by Petr Holub, Miloš Liška and Hana Rudová. I thank for being able to use them for this presentation.
April 14, 2010
Jakub Stoklasa
42