7. Peer-to-peer (P2P) networks II.
PA159: Net-Centric Computing I.
Eva Hladká, Faculty of Informatics, Masaryk University, Autumn 2010

Lecture Overview
1) Routing in P2P Networks
   o Introduction, Motivation
   o Routing in Unstructured P2P Networks
   o Routing in Structured P2P Networks
   o Routing in Hybrid P2P Networks
2) Information Sources

Routing in P2P Networks: Introduction, Motivation
o routing of messages/requests is one of the key operations in P2P systems
o to locate desired resources, each peer should be able to forward queries to a subset of neighbor peers that are closer to the destination than any other peer
  o => the design of routing protocols is one of the most widely researched issues
o the key differences between the various schemes lie in the amount of information (metadata) maintained at each peer and in how this information is organized
o no metadata => there is no way to locate information other than flooding/broadcasting the request through the network

Routing in P2P Networks: The Lookup Problem
o lookup via a single central node: simple, but O(n) state information has to be maintained on the central node, and the network suffers from a single point of failure
o lookup via flooding the query: robust, but in the worst case O(n) messages have to be transmitted per lookup
Routing in P2P Networks: The Lookup Problem - Routed DHT Queries (Chord, CAN, Pastry, Tapestry, ...)

Routing in P2P Networks: Evaluation Metrics
The effectiveness/efficiency of a routing scheme can be evaluated by several metrics:
o Storage
  o each peer may need to set aside some storage space for maintaining metadata (used for searching)
  o storing more metadata => it is more costly to keep these data up to date
o Efficiency
  o a system is efficient if it can locate the resources quickly
  o the metric of efficiency is the response time (can be measured by the average query path length)
o Usability
  o reflects the ease of use and the types of queries that can be supported
  o e.g., depending on the metadata maintained, one system may support complex queries, while another one can perform an exact match only
o Coverage
  o refers to whether the search space contains the answers
  o a scheme with a higher coverage is more useful
o Scalability
  o important: it makes the routing scheme useful in large-scale environments
  o a measure of scalability: e.g., the number of messages that need to be routed in order to locate information
Routing in Unstructured P2P Networks
o each peer typically stores its own data objects and maintains its own set of links to neighbor nodes
  o when a node wants to join the system, it simply contacts an existing node and copies the links of that node to form its own links
  o (the links are later maintained independently of the contacted node)
o => no peer has global knowledge of data placement
  o flooding-based techniques have to be used for queries
  o to alleviate the problem of flooding the system with query messages, a Time-to-Live (TTL) value is usually attached to each query
o the challenge is how to optimize query processing within the limited number of search steps allowed by the TTL
o several routing strategies have been proposed:
  o Breadth-First Search (BFS) - e.g., Gnutella
  o Depth-First Search (DFS) - e.g., FreeNet
  o Heuristic-Based Routing Strategies

Routing in Unstructured P2P Networks: Depth-First Search (DFS)
Algorithm 1: FreeNet_Search(Node x, Key k, TTL t)
  result = Local_Search(k)
  if result = found then
    return result to the requester node
  else
    if t = 0 then
      return "not found" to the requester node
    else
      repeat
        pick a neighbor node y in the routing table of x that has the nearest key to k and has not been searched before
        result = FreeNet_Search(y, k, t - 1)
      until result = found or all neighbors have been searched
      return result to the requester node
    end if
  end if
Figure: FreeNet's routing strategy: instead of sending a query to all neighbors, each node selects the most promising neighbor that can answer the query and sends the query to only that node. If the node does not receive a reply within a certain period of time (or the answer cannot be found), the node selects the next most promising neighbor.
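The FreeNet-style depth-first search above can be sketched in Python as follows. This is a minimal illustration, not FreeNet's actual implementation: the `Node` class, the routing table shape (neighbor name mapped to one advertised key), the distance measure `abs(key - k)`, and the demo network are all assumptions made for the example.

```python
from typing import Dict, Optional, Set


class Node:
    """A peer with local keys and a routing table {neighbor name: advertised key}."""

    def __init__(self, name: str, keys: Set[int]):
        self.name = name
        self.keys = keys
        self.routing_table: Dict[str, int] = {}


def freenet_search(net: Dict[str, Node], x: str, k: int, ttl: int,
                   visited: Optional[Set[str]] = None) -> Optional[str]:
    """Return the name of a node holding key k, or None if it is not found."""
    visited = set() if visited is None else visited
    visited.add(x)
    node = net[x]
    if k in node.keys:      # Local_Search(k) succeeded
        return x
    if ttl == 0:            # hop budget exhausted on this branch
        return None
    # Try neighbors one at a time, most promising (nearest advertised key) first.
    ranked = sorted(node.routing_table,
                    key=lambda y: abs(node.routing_table[y] - k))
    for y in ranked:
        if y in visited:    # "has not been searched before"
            continue
        found = freenet_search(net, y, k, ttl - 1, visited)
        if found is not None:
            return found
    return None


def build_demo_network() -> Dict[str, Node]:
    """Hypothetical four-node network used only for illustration."""
    contents = {"A": {10}, "B": {40}, "C": {90}, "D": {55}}
    net = {name: Node(name, keys) for name, keys in contents.items()}
    net["A"].routing_table = {"B": 40, "C": 90}
    net["B"].routing_table = {"A": 10, "D": 55}
    net["C"].routing_table = {"A": 10}
    net["D"].routing_table = {"B": 40}
    return net
```

Searching for key 55 from node A with a TTL of 3 follows A, B, D; with a TTL of 1 the same search fails, because D is two hops away.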
Heuristic-Based Routing Strategies: Iterative Deepening
o the idea:
  o a query is initiated as a sequence of multiple traditional BFS searches with a gradually enlarged search radius
  o the search process terminates when either the maximum depth is reached or the results for the query satisfy the user's requirements
o algorithm details:
  o a system policy P must be provided to specify the sequence of depths at which the iterations happen
    o P = D1, D2, ..., Dn, where D1 < D2 < ... < Dn
  o under this policy, the source node first sends a query message to the network via a BFS search of depth D1
    o if the result obtained satisfies the user's requirements, the query is terminated
  o otherwise, the source node issues a resend query message (with the same query ID) with a BFS depth of D2
    o the nodes that are fewer than D1 hops away from the source node only forward the query to their neighbors
    o the more distant nodes process the query in the same way as in the first iteration
  o similarly for D3, D4, etc.
  o if the query is not answered by the depth Dn, the search process terminates

Heuristic-Based Routing Strategies: Directed BFS and Intelligent Search I.
o the idea:
  o in BFS, each node sends the query to all of its neighbors
  o in Directed BFS, each node queries only a subset of its neighbors
    o those neighbors then forward the query using the standard BFS
  o the key point is how to intelligently choose "good" neighbors that would potentially contribute more relevant results for the query
o details on choosing the neighbors:
  o each node maintains some statistics about its neighbors: the number of queries previously answered through a neighbor node, the number of results obtained, and the latency in receiving the results
  o based on these statistics, the node can choose the neighbors "intelligently" using several heuristics, e.g.:
    o choose the one that returned the largest number of results previously
    o choose the one whose previous messages incurred the lowest hop count
    o choose the one that forwarded the largest number of messages previously
    o choose the one that has the shortest message queue
    o etc.

Heuristic-Based Routing Strategies: Directed BFS and Intelligent Search II.
o Directed BFS
  o advantage: the number of query messages in the network is greatly reduced as compared to the standard BFS technique
  o disadvantage: the statistics stored about each neighbor are too simple
    o they do not contain information related to the content of queries
o => Intelligent Search
  o each peer ranks its neighbors based on their relevance to the query
  o the query is routed only to those neighbors that have high relevance
  o it thus provides a more exact ranking of peers than Directed BFS
  o it performs well in networks that exhibit a high degree of query locality
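The neighbor selection used by Directed BFS can be sketched as a ranking over per-neighbor statistics. This is only an illustrative sketch: the `NeighborStats` fields, the heuristic names, and the `fanout` parameter are assumptions, and only three of the heuristics listed above are modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NeighborStats:
    results_returned: int = 0    # number of results obtained through this neighbor
    avg_latency_ms: float = 0.0  # latency in receiving the results
    queue_length: int = 0        # current length of the neighbor's message queue


def pick_neighbors(stats: Dict[str, NeighborStats], fanout: int,
                   heuristic: str = "most_results") -> List[str]:
    """Return the `fanout` neighbors that rank best under the chosen heuristic."""
    if heuristic == "most_results":
        ranked = sorted(stats, key=lambda n: (-stats[n].results_returned, n))
    elif heuristic == "lowest_latency":
        ranked = sorted(stats, key=lambda n: (stats[n].avg_latency_ms, n))
    elif heuristic == "shortest_queue":
        ranked = sorted(stats, key=lambda n: (stats[n].queue_length, n))
    else:
        raise ValueError(f"unknown heuristic: {heuristic}")
    return ranked[:fanout]


# Hypothetical statistics for three neighbors.
demo_stats = {
    "p1": NeighborStats(results_returned=12, avg_latency_ms=80.0, queue_length=4),
    "p2": NeighborStats(results_returned=30, avg_latency_ms=200.0, queue_length=9),
    "p3": NeighborStats(results_returned=5, avg_latency_ms=20.0, queue_length=1),
}
```

The node forwards the query only to the returned subset; those neighbors then continue with the standard BFS. Ties are broken by neighbor name so that the choice is deterministic.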
Heuristic-Based Routing Strategies: Local Indices Search
o the idea:
  o each node creates and maintains indices for both its local data and the data on its neighbor nodes that are within a radius of k hops from it
    o if k = 0, this method is similar to the BFS search (local data index only)
  o the result returned at such a node is the same as the result that would be returned by processing the query at all the nodes within a radius of k hops from the node
o details:
  o the queries are processed based on a global policy P that specifies a list of depths in the search tree at which the query is processed
    o only the nodes located at a depth specified in P process the query
    o the other nodes simply forward the query to their neighbors (without processing it)
o advantage: the processing cost is reduced by limiting the query processing to fewer nodes
o disadvantages:
  o higher storage cost (more indices need to be stored at a node)
  o higher update cost for these indices
  o inconsistency/obsolescence of the indices (due to the dynamics of the network)

Heuristic-Based Routing Strategies: Random Walk I.
o the idea:
  o when a peer issues/receives a query, it randomly selects a neighbor to send or forward the query to
  o this process repeats until the search result is found
    o or until the TTL expires (if employed) => the result is not found
o the main disadvantage:
  o it suffers from long delays in query processing
o details:
  o => k-walker Random Walk Algorithm
    o the query initiator (the source node) sends k query messages to its randomly selected neighbors (instead of just a single one = the original 1-walker algorithm)
    o when a node receives a query message (a walker), it just follows the basic random walk and randomly selects a single neighbor to forward the query to
    o the number of messages (visited nodes) increases linearly as compared to the 1-walker algorithm

Heuristic-Based Routing Strategies: Random Walk II.
o details cont'd.:
  o => Random Breadth First Search (RBFS)
    o similar to the k-walker Random Walk
    o the query initiator first randomly selects a subset of its neighbors to send the query to
    o each of these neighbors then randomly selects a subset of its own neighbors to forward the query to, etc.
    o the number of messages (visited nodes) increases exponentially as compared to the 1-walker algorithm
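A k-walker random walk can be sketched in a few lines. The adjacency-list graph, the `holders` set (nodes that store the searched item), and the returned message counter are assumptions made for illustration; walkers here may pick the same neighbor, and every node is assumed to have at least one neighbor.

```python
import random
from typing import Dict, List, Optional, Set, Tuple

Graph = Dict[str, List[str]]


def k_walker_search(graph: Graph, holders: Set[str], source: str, k: int,
                    ttl: int, rng: random.Random) -> Tuple[Optional[str], int]:
    """Return (node holding the item or None, number of messages sent)."""
    if source in holders:
        return source, 0
    walkers = [source] * k   # all k walkers start at the query initiator
    messages = 0
    for _ in range(ttl):     # each walker takes at most `ttl` hops
        # Every walker moves to one randomly selected neighbor.
        walkers = [rng.choice(graph[w]) for w in walkers]
        messages += len(walkers)
        for node in walkers:
            if node in holders:
                return node, messages
    return None, messages


# Hypothetical directed ring, chosen so that the walk is deterministic.
demo_graph = {"a": ["b"], "b": ["c"], "c": ["a"]}
```

With k walkers and a TTL of t, at most k * t messages are sent, which is the linear growth compared to the 1-walker algorithm. RBFS differs in that every visited node forwards to a random subset of neighbors, so the number of walkers multiplies at each hop.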
Heuristic-Based Routing Strategies: Adaptive Probabilistic Search (APS)
o the idea:
  o a search method that combines the techniques of both the k-walker random search and probabilistic search
  o the main difference between APS and random walkers:
    o random walkers send the query to random neighbors, while APS sends the query to neighbor nodes based on some probabilities
    o => each peer keeps a probability for each neighbor with respect to each object (determined from past results)
o details:
  o two approaches to updating the probabilities:
    o Optimistic approach: the system proactively increases the probabilities for the selected (= queried) neighbors along the search path and decreases their probabilities only if the walker passing through them terminates with a failure
    o Pessimistic approach: the system proactively decreases the probabilities for the selected (= queried) neighbors along the search path and increases their probabilities when the walker passing through them terminates with a success
  o swapping-APS: each peer swaps between the optimistic and the pessimistic method based on the observed ratio of successful walkers for each object
  o weighted-APS: takes into account the location of objects

Heuristic-Based Routing Strategies: Interest-Based Shortcuts I.
o the idea:
  o each peer adds additional links on top of an existing search network to improve the search performance
  o these links (called interest-based shortcuts) connect two peers having a similar interest
o details:
  o when a peer issues a query, it first employs the interest-based shortcuts to forward and process the query
    o if the result is found, the search terminates
    o otherwise, the normal query processing algorithm is used
  o shortcut construction:
    o when a peer joins the system, it has no shortcuts
    o after each successfully processed query, the query initiator adds shortcuts to the peers providing the answers for that query
    o each peer stores only a limited number of shortcuts, those with the highest utility (due to space constraints)

Routing in Structured P2P Networks
o the unstructured P2P networks suffer from low search efficiency
o unlike in the unstructured P2P systems, the participant nodes in a structured P2P system are required to organize into a fixed topology
  o such as a ring (Chord), a multidimensional grid (CAN), a mesh (Pastry and Tapestry), or a multiple list (Skip Graph)
  o => when a node joins the system, it has to follow some strict procedures to set up its position
o it can be guaranteed that if a result of a query exists in the system, it will be found
  o moreover, in an efficient way: most systems can provide an answer for a query within O(log N) steps/messages (N = number of nodes)
o disadvantage: the need for a network topology incurs a high maintenance cost (changes in routing tables)
o based on the overlay network structure, structured P2P systems can be classified into the following categories:
  o Distributed Hash Table (DHT) based systems - e.g., Chord, CAN, Tapestry and Pastry, Viceroy and Crescendo, etc.
  o Skip List based systems - e.g., Skip Graph, SkipNet, etc.
  o Tree based systems - e.g., P-Grid, P-Tree, BATON, etc.

Distributed Hash Table (DHT) based P2P systems: Distributed Hash Table
o every node in the P2P network manages its part of the global hash table
o storage/retrieval of an item s means querying the node that manages the part to which hash(s) belongs

Distributed Hash Table (DHT) based P2P systems: Chord I.
o one of the most widely known routing mechanisms in structured P2P networks
o the idea:
  o uses a one-way consistent hash function to map each node and data item to an m-bit identifier in a one-dimensional identifier space
    o the hash function uses the node's IP address to generate an identifier for a node, and
    o the data item (or the key of the data item) to generate an identifier for the data
  o the identifier space must be chosen large enough (the probability of assigning the same identifier to different nodes should be negligible)
o details:
  o the identifier space is a circle of numbers from 0 to $2^m - 1$
  o the system assigns a key k to the first node n whose identifier is equal to or follows the identifier k in the circle space
    o i.e., the key k is assigned to the first node clockwise from k

Distributed Hash Table (DHT) based P2P systems: Chord II.
Figure: An identifier circle based on consistent hashing - keys K6 and K18 are assigned to the same node identifier N30 (obtained by hashing the IP address "202.120.224.102"). The key K56 (obtained by hashing the word "Sailing") is assigned to the node identifier N70; the key K100 is assigned to the node identifier N115; the nodes N42 and N120 store no data items.
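The key assignment rule (a key goes to the first node clockwise from it) can be checked against the identifier circle from the figure. The sketch below takes the node identifiers 30, 42, 70, 115, and 120 from the figure and assumes m = 7; the helper `chord_id`, which truncates a SHA-1 digest to m bits, is an illustrative assumption and is not expected to reproduce the identifiers shown in the figure.

```python
import hashlib
from bisect import bisect_left
from typing import List


def chord_id(name: str, m: int) -> int:
    """Map a string (e.g. an IP address or a data key) to an m-bit identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids: List[int], key_id: int, m: int) -> int:
    """Return the first node whose identifier is equal to or follows key_id."""
    ring = sorted(node_ids)
    key_id %= 2 ** m
    i = bisect_left(ring, key_id)   # index of the first node id >= key_id
    return ring[i % len(ring)]      # wrap around past the largest identifier


# Node identifiers taken from the figure; m = 7 is an assumption.
demo_nodes = [30, 42, 70, 115, 120]
```

With these nodes, keys 6 and 18 map to node 30, key 56 to node 70, and key 100 to node 115, matching the figure; a key such as 125 wraps around the circle and is assigned to node 30.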
Distributed Hash Table (DHT) based P2P systems: Chord III. - Simple lookup algorithm
Simple lookup algorithm:
o each node only needs to know its immediate successor node
o when a node receives a query request:
  o first, it checks its local storage to see if it holds the queried data item
    o if yes, the result is returned to the query sender
    o if no, it forwards the query to its immediate successor node
o the lookup terminates when:
  o the result is found, or
  o the identifier of a node's immediate successor exceeds the identifier of the queried data item => the result cannot be found
o the complexity is O(N) (N = the number of nodes in the system)

Distributed Hash Table (DHT) based P2P systems: Chord III. - Scalable lookup algorithm I.
Scalable lookup algorithm:
o instead of maintaining only a single immediate successor node, each node maintains a finger table consisting of m successor nodes
o when a node n receives a query request:
  o if the node does not hold the queried data, it searches its finger table for a node n' with the highest node identifier that satisfies the condition n.id < n'.id < k
    o if such a node exists, the node n asks n' to find the key k
    o otherwise, the node n asks its immediate successor to find k
o the lookup terminates when:
  o the result is found, or
  o the identifier of a node's immediate successor exceeds the identifier of the queried data item => the result cannot be found
o the complexity is O(log N) (N = the number of nodes in the system)

Distributed Hash Table (DHT) based P2P systems: Chord III. - Scalable lookup algorithm II.
Figure: An example of finger table entries (left) and an example of a routing path for key K117 starting at node N7 (right).
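The scalable lookup can be sketched as follows. The sketch assumes the usual Chord convention that the i-th finger of node n is the successor of (n + 2^i) mod 2^m, an identifier space with m = 7, and a globally known sorted node list in place of real remote calls; the function names and the demo ring are illustrative only.

```python
from bisect import bisect_left
from typing import List, Tuple

M = 7            # identifier bits (assumed)
SIZE = 2 ** M    # size of the identifier circle


def successor(ring: List[int], ident: int) -> int:
    """First node on the sorted ring whose id is equal to or follows ident."""
    return ring[bisect_left(ring, ident % SIZE) % len(ring)]


def between(x: int, a: int, b: int, inclusive_right: bool = False) -> bool:
    """True if x lies in the ring interval (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past zero


def finger_table(ring: List[int], n: int) -> List[int]:
    """Finger i of node n is the successor of n + 2^i (i = 0 .. M-1)."""
    return [successor(ring, n + 2 ** i) for i in range(M)]


def lookup(ring: List[int], start: int, key: int) -> Tuple[int, List[int]]:
    """Return (node responsible for key, list of nodes visited)."""
    n, path = start, [start]
    if key == n:
        return n, path
    while True:
        succ = successor(ring, n + 1)
        if between(key, n, succ, inclusive_right=True):
            path.append(succ)          # the immediate successor owns the key
            return succ, path
        nxt = succ                     # fallback: ask the immediate successor
        for f in reversed(finger_table(ring, n)):
            if between(f, n, key):     # highest finger with n < f < key
                nxt = f
                break
        n = nxt
        path.append(n)


demo_ring = [30, 42, 70, 115, 120]   # hypothetical sorted node identifiers
```

Each hop jumps to the finger closest to, but not past, the key, so the remaining distance on the circle roughly halves per step; this is where the O(log N) bound comes from.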
Distributed Hash Table (DHT) based P2P systems: Chord IV.
System construction:
o when a new node joins the system, it needs to:
  1) find its position in the Chord ring and obtain the data it should be responsible for (based on keys)
  2) initialize its finger table
  3) update the finger tables of other nodes to reflect its presence
o when an existing node leaves the system, it does not need to do anything

Distributed Hash Table (DHT) based P2P systems: Content Addressable Network (CAN) I.
o the idea:
  o a routing system built on a virtual d-dimensional Cartesian coordinate space
  o the system partitions the storage space into different zones, each of which is assigned to a node
    o such a node stores all data items belonging to its zone
  o the system uses a uniform hash function to map the data key value to a point p in the coordinate space (thus obtaining a d-tuple)
o details:
  o inserting a data item:
    1) the data key value is mapped into a point p in the coordinate space
    2) the node n, whose zone covers p, is found and contacted to store the new data item
  o processing a query is similar
    o if the result exists, it should be stored on the node covering the particular zone
  o each node needs to maintain information about its neighbor nodes
    o i.e., the nodes covering adjacent zones
  o the routing is based on a simple greedy forwarding algorithm
    o in every step, a node having coordinates closer to the destination zone is chosen

Distributed Hash Table (DHT) based P2P systems: Content Addressable Network (CAN) II.
Figure: A CAN system using a two-dimensional coordinate space with 5 nodes. The space is split into the zones (0.0-0.5, 0.5-1.0), (0.5-0.75, 0.5-1.0), (0.75-1.0, 0.5-1.0), (0.0-0.5, 0.0-0.5), and (0.5-1.0, 0.0-0.5); node E's neighbors are D and B.

Distributed Hash Table (DHT) based P2P systems: Content Addressable Network (CAN) III.
Figure: An example of a data item lookup in a CAN system: looking for the data item having the key (0.6, 0.8). Several paths might be used to reach the destination.

Distributed Hash Table (DHT) based P2P systems: Content Addressable Network (CAN) IV.
System construction:
o when a new node joins the system, it needs to:
  1) find an arbitrary node that is already connected to the network
  2) identify a zone that can be divided, and ask its owner/maintainer node to split the zone into two parts
     o the original node keeps maintaining one part, the new node starts to maintain the other one
  3) construct its own routing table and update the routing tables of its neighbors
o when an existing node leaves the system, it has to ask a neighbor to merge the zones into a single one

Distributed Hash Table (DHT) based P2P systems: Pastry I.
o the idea:
  o a routing system based on PRR trees
    o PRR = Plaxton, Rajaraman, and Richa (1997)
o details:
  o a node identifier is an m-bit number broken up into a sequence of digits having the base $2^b$
    o e.g., a 128-bit identifier is broken up into 32 4-bit digits (b = 4, base = $2^4$ => a hexadecimal sequence of digits)
    o b is a configuration parameter
  o a data item is stored on the node whose identifier shares the longest prefix with the data identifier
  o in every routing step, a neighbor node having a longer prefix in common with the destination node (longer by 1 digit, i.e., b bits) is chosen
    o the routing complexity is $O(\log_{2^b} N)$
  o each peer has a routing table to route messages
    o organized in a fixed number of levels (= $\lceil \log_{2^b} N \rceil$) and within each level a fixed number of entries (= $2^b - 1$)
    o row ID = the length of the prefix in common with the destination node
    o column ID = next possible step

Distributed Hash Table (DHT) based P2P systems: Pastry II.
[Figure: rows of a Pastry routing table, indexed by the hexadecimal digits 0-f.]

[...]
  2) the node A routes the message join(X) to the node Z, which is the closest one to the key X
  3) the node X receives a leaf set from the node Z and fills in its routing table (the table's i-th row is received from the i-th node on the path from A to Z)
  4) the node X informs the nodes that should insert it into their routing tables
o when an existing node leaves the network:
  o it has to pass the data it has managed to a neighbor
  o the routing tables are updated automatically soon afterwards
    o the node is replaced with a node from its leaf set (one of its neighbors)
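The prefix-based routing step described for Pastry (row = length of the shared prefix, column = next digit of the destination) can be sketched as below. Identifiers are modeled as equal-length hexadecimal strings (b = 4); the table layout as a dictionary keyed by (row, digit), the helper names, and the sample identifiers are assumptions for illustration. When the matching table entry is empty the sketch simply returns None, whereas a real Pastry node would fall back to other state such as its leaf set.

```python
from typing import Dict, List, Optional, Tuple

RoutingTable = Dict[Tuple[int, str], str]   # (row, next digit) -> node id


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_routing_table(own_id: str, known_ids: List[str]) -> RoutingTable:
    """Place each known node in the row given by its shared prefix length."""
    table: RoutingTable = {}
    for other in known_ids:
        if other == own_id:
            continue
        row = shared_prefix_len(own_id, other)
        table.setdefault((row, other[row]), other)   # keep the first candidate
    return table


def next_hop(own_id: str, table: RoutingTable, dest: str) -> Optional[str]:
    """Return a node sharing a prefix with dest that is one digit longer."""
    row = shared_prefix_len(own_id, dest)
    if row == len(dest):        # this node is the destination
        return None
    return table.get((row, dest[row]))


# Hypothetical 4-digit identifiers.
demo_table = build_routing_table("65a1", ["d13d", "6a2f", "65f0", "65ac", "1234"])
```

Because every hop extends the matched prefix by at least one digit, a message needs at most as many hops as there are digits in the identifier.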
Distributed Hash Table (DHT) based P2P systems: Tapestry
Tapestry:
o another peer-to-peer overlay routing infrastructure based on PRR trees, which is very similar to Pastry
o the main difference between Pastry and Tapestry:
  o in Pastry, each routing hop extends the matching prefix
  o in Tapestry, each routing hop extends the matching suffix
  o (other slight differences also exist)

Distributed Hash Table (DHT) based P2P systems: Comparison
                      CAN                    Chord          Pastry
Routing performance   $O(d \cdot N^{1/d})$   $O(\log N)$    $O(\log_B N)$
Routing state         $2d$                   $\log N$       $B \cdot \log_B N + B$
Peers join/leave      $2d$                   $(\log N)^2$   $\log_B N$
($B = 2^b$)
Figure: The comparison of the presented DHT-based routing mechanisms for structured P2P networks (the lookup performance view, the storage view, and the re-management during a node's join/leave view).

Skip List based P2P systems: Skip List structure I.
o a skip list is a data structure for storing a sorted list of items using a hierarchy of linked lists
  o the lists connect increasingly sparse subsequences of the items
o the lists are built in layers:
  o the bottom layer (level 0) is an ordinary ordered linked list
  o each higher layer acts as an "express lane" for the lists below, where an element in layer i appears in layer i + 1 with some fixed probability p
    o usually, p = 1/2 or p = 1/4
[Figure: a skip list with a head node, the elements 1-10, and NIL-terminated lists.]

Skip List based P2P systems: Skip List structure II.
A search for a target element:
o begins at the head element in the top list and proceeds horizontally until the current element is greater than or equal to the target
  o if the current element is equal to the target, the target has been found
  o if the current element is greater than the target, the procedure is repeated after returning to the previous element and dropping down vertically to the next lower list
o the expected cost of a search is $(\log_{1/p} n)/p$
  o since p is a constant => O(log n)

Skip List based P2P systems: Skip List structure III.
Figure: The searching process in a Skip List structure.

Skip List based P2P systems: Skip Graph I.
o the idea:
  o a routing system based on Skip Lists
    o pure Skip Lists are not suitable, since the top-level nodes may become overloaded
  o unlike a pure Skip List, which has only one list at each level, a Skip Graph has many lists at each level
    o each node participates in a list at each level
  o the system controls the lists a node belongs to by a random membership vector (created when the node joins the system)
  o the number of levels is O(log N)
o lookup details:
  o once a node issues a query:
    o the search process always starts at the highest level of that node
    o at each step, if there is a neighbor node at the same level that keeps a value closer to the search key, the node forwards the query to that neighbor
    o otherwise, the node continues the search process at a lower level
    o the destination node containing the result is found when the search process reaches the bottom level
  o the query processing complexity is O(log N)

Skip List based P2P systems: Skip Graph II.
The membership vector only defines which lists the particular element belongs to (the lists are sorted by a data key).

Skip List based P2P systems: Skip Graph III.
Restricting to the lists containing the starting element of the search, we get a skip list (the pure skip list searching method can then be used).
[Figure: the lists of a skip graph over the elements A, J, M, R, W that contain the starting element.]

Skip List based P2P systems: Skip Graph IV.
System construction:
o when a new node (having an identifier X) joins the network:
  o based on its membership vector m(X), X joins the lists of nodes whose membership vectors share a prefix with m(X), at different prefix lengths
  o in particular:
    o X first joins the list at level 0 (next to the nodes containing the keys closest to X's key)
    o for every level i >= 1, X links to the closest node Y whose membership vector has the same prefix of length i as that of the node X

Skip List based P2P systems: Skip Graph V.
Figure: Step 1: Starting at an arbitrary node, find the nearest (data) key at level 0.

Skip List based P2P systems: Skip Graph VI.
Figure: Step 2: At each level i, connect to the list with a matching prefix of the membership vector of length i.
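How membership vectors split the nodes into lists can be sketched by grouping nodes on vector prefixes. The node keys A, J, M, R, W appear in the figures, but the two-bit membership vectors below are hypothetical (the figure values are not reproduced here), and the function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Lists = Dict[int, Dict[str, List[str]]]   # level -> prefix -> sorted node keys


def skip_graph_lists(vectors: Dict[str, str], levels: int) -> Lists:
    """At level i, nodes sharing the first i vector bits form one sorted list."""
    out: Lists = {}
    for i in range(levels + 1):
        groups: Dict[str, List[str]] = defaultdict(list)
        for key in sorted(vectors):           # every list is sorted by data key
            groups[vectors[key][:i]].append(key)
        out[i] = dict(groups)
    return out


def neighbors(lists: Lists, vectors: Dict[str, str], key: str,
              level: int) -> Tuple[Optional[str], Optional[str]]:
    """Left and right neighbor of `key` in its list at the given level."""
    members = lists[level][vectors[key][:level]]
    i = members.index(key)
    left = members[i - 1] if i > 0 else None
    right = members[i + 1] if i + 1 < len(members) else None
    return left, right


# Hypothetical membership vectors for the node keys used in the figures.
demo_vectors = {"A": "10", "J": "00", "M": "01", "R": "11", "W": "10"}
demo_lists = skip_graph_lists(demo_vectors, levels=2)
```

Level 0 is a single list containing all nodes, and each further level splits every list according to the next vector bit. A joining node therefore only has to find its neighbors in one list per level, which corresponds to the two construction steps described above.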
Skip List based P2P systems: SkipNet I.
o the idea:
  o a routing system very similar to the Skip Graph
  o instead of Skip Lists, SkipNet organizes the nodes into rings
    o similarly to the Skip Graph, the rings are organized into levels as well
    o the nodes are sorted on each level based on a data key
  o on a particular level, every node has pointers to its neighbors stored in its routing table
    o the pointers on the level h point to the nodes that are roughly $2^h$ nodes to the left and right of the given node
    o all the nodes are connected by the root ring formed at level 0
  o the routing/lookup mechanism and the system construction are very similar to those of the Skip Graph

Skip List based P2P systems: SkipNet II.
Figure: The full SkipNet routing infrastructure for an 8 node system, including the ring labels (000, 001, 010, 011, 100, 101, 110, 111).

Skip List based P2P systems: SkipNet III.
Figure: The routing tables for nodes A and V.

Skip List based P2P systems: SkipNet IV.
A routing example: Routing from A to V
Figure: At first, the message is forwarded to a neighbor closer to the destination.

Skip List based P2P systems: SkipNet V.
A routing example: Routing from A to V
Figure: Node T's routing table.

Skip List based P2P systems: SkipNet VI.
A routing example: Routing from A to V
Figure: Since there is direct access to the node V at level 0, the lookup terminates.

Tree based systems: P-Grid I.
o the idea:
  o P-Grid is based on a virtual binary tree structure in which each peer maintains a leaf node of the tree
  o the system assigns each peer an identifier, which is the binary bit string representing the path from the root to the leaf node
    o each peer is then responsible for all data items whose prefix is equal to the peer identifier
  o for fault-tolerance purposes, multiple peers can be assigned the same identifier
  o for routing purposes, each peer further maintains a routing table

Tree based systems: P-Grid III.
o the routing/lookup mechanism:
  o when a peer n receives a query having the key k, it checks whether its identifier is a prefix of k
    o if yes, it searches its local storage to find the result
    o if no, the peer looks up its routing table to find a closer neighbor node to forward the query to
  o the maximum number of search steps is bounded by the height of the tree
    o => the lookup performance is $O(\log_2 N)$

Tree based systems: P-Tree I.
the idea:
o in P-Grid, the balance of the tree structure cannot be guaranteed
o P-Tree is based on a virtual balanced B+-Tree built on top of a Chord ring
o each peer maintains:
  • a Chord node, which is a leaf node of the tree structure, and
  • a semi-independent B+-Tree, which is the peer's view of a fully independent B+-Tree
o a fully independent B+-Tree at a peer is a B+-Tree in which the value stored at the peer is considered the smallest value in the Chord ring
o a semi-independent B+-Tree contains all nodes in the leftmost root-to-leaf path of the corresponding fully independent B+-Tree
o to ease maintenance, the ranges of B+-Tree nodes may overlap (see node C in the following figure)

Tree based systems: P-Tree II.
Figure: (a) Semi-independent B+-Trees maintained at P-Tree nodes. (b) The fully-independent B+-Tree at node A.

Tree based systems: BATON I.
the idea:
o in comparison with standard tree-based structures, BATON provides two main features:
  • data is stored at both leaf nodes and internal nodes
  • in addition to parent and child links, nodes in the BATON network also have adjacent links and neighbor links
o an adjacent link connects a node to a node maintaining an adjacent range of values (adjacent to the range the node maintains)
o a neighbor link connects a node with its neighbors at the same level of the tree structure, at a distance of 2^i, i > 0, from the node
o the purpose of these links is to avoid the bottleneck problem at the root of the tree structure in query processing

Tree based systems: BATON III.
lookup details:
o when a peer x receives a query:
  1. if the searched key falls into the range of values managed by x, it responds to the query
  2. otherwise, it forwards the query to the farthest neighbor that is nearer to, but does not overshoot, the searched key
  3. if such a neighbor does not exist, x forwards the query to either a child (if it exists) or an adjacent node of x in the search direction

Tree based systems: BATON IV.
Figure: A lookup example in BATON: the node H wants to search for a data item (having the key 74) stored in the node C. Node ranges legible in the figure: [0-5), [8-12), [17-23), [38-41), [50-54), [57-61), [64-67), [69-73), [79-83), [86-90), [95-100).

Routing in Hybrid P2P Networks
o hybrid P2P systems organize the peers into a hierarchical network
  • powerful peers (superpeers, supernodes) lie at a high level, and
  • common peers (also called client peers) lie at lower levels
o each common peer belongs to a supernode and does not connect to any common peer that does not belong to the same supernode
o the general routing scheme in hybrid P2P networks:
  1. a client peer sends a query to its supernode
  2. the supernode searches its directory to determine which client peer or supernode has the desired answers
  3. the query is sent to the supernode that may have the desired answers
     • it uses its directory of all its client peers to answer the query
     • the IP address of the client peer having the desired answers is returned to the query peer
     • the query peer exchanges resources with that peer
o examples: KaZaA, BestPeer, Edutella, etc.

Routing in Hybrid P2P Networks: Edutella
Figure: The Edutella network structure.
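The general hybrid routing scheme listed above (client peer to supernode, directory search, forwarding to another supernode, IP address returned) can be illustrated with a minimal Python sketch. The class `Supernode`, its fields, the IP addresses, and the resource names are hypothetical and chosen only for illustration; the sketch assumes every supernode knows the other supernodes directly and forwards a query at most once, which is a simplification rather than the behavior of any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Supernode:
    """Hypothetical supernode that indexes the resources of its client peers."""
    name: str
    # directory: resource key -> IP address of the client peer holding it
    directory: Dict[str, str] = field(default_factory=dict)
    # other supernodes this supernode can forward queries to
    neighbors: List["Supernode"] = field(default_factory=list)

    def register(self, client_ip: str, resources: List[str]) -> None:
        """A client peer publishes its resource list to its supernode."""
        for key in resources:
            self.directory[key] = client_ip

    def lookup(self, key: str, forwarded: bool = False) -> Optional[str]:
        """Return the IP of a client peer holding `key`, or None."""
        # Step 2: search the local directory first.
        if key in self.directory:
            return self.directory[key]
        # Step 3: forward once to the other supernodes; a forwarded query
        # is not relayed again, so it cannot loop in this sketch.
        if not forwarded:
            for other in self.neighbors:
                answer = other.lookup(key, forwarded=True)
                if answer is not None:
                    return answer
        return None


# Two supernodes, each serving one client peer (assumed example data).
s1, s2 = Supernode("S1"), Supernode("S2")
s1.neighbors.append(s2)
s2.neighbors.append(s1)
s1.register("10.0.0.1", ["song.mp3"])
s2.register("10.0.0.2", ["paper.pdf"])

# Step 1: a client of S1 sends its query to S1; the answer is an IP address,
# and the actual resource exchange then happens directly between the peers.
print(s1.lookup("paper.pdf"))
print(s1.lookup("missing.txt"))
```

The supernode returns only an address, not the resource itself, which mirrors the last two steps of the scheme: the query peer contacts the answering peer directly.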
In Edutella, a query is first routed to the superpeers in the HyperCuP network (= HyperCube P2P network), where the suffix-based routing scheme could be employed.

Routing in Hybrid P2P Networks: Ultrapeers
Figure: The modified Gnutella network with ultrapeers. Suppose that the resources requested by peer C12 are located at the peer C9: the peer C12 first sends its request to its ultrapeer U4, then U4 floods the query to U2 via U1; U1 searches its reflector index and finds that C9 has the desired answers, so it sends the IP address of C9 back to C12.

Routing in Hybrid P2P Networks: Structured Superpeers
Figure: The structured superpeers: the superpeers S0, S1, S2, and S3 control ranges (0, 4], (4, 8], (8, 12], and (12, 0], respectively. If the peer P1 requests key = 10, it first sends the lookup key to S0; S0 relays the key to S2 (since S2 controls the range where the key belongs), which replies to the query initiator with the IP of the relevant node storing the requested data.

P2P Routing Conclusion: Structured vs. Unstructured P2P Networks Comparison
 | structured P2P | unstructured P2P
routing | based on a routing table | flooding, random walk, ...
lookup possibilities | based on keys only | possibility to ask more complex queries
existing item is always found | yes | cannot be guaranteed
critical part | node join/disconnect | lookup/routing

P2P Routing Conclusion: Overview I.
System | Overlay network | Routing table | Routing method
Gnutella | Unstructured, random topology | Random neighbors | Breadth First Search with Time-to-Live
FreeNet | Unstructured, random topology | Random neighbors | Depth First Search with Time-to-Live
Chord | Structured, ring topology | Neighbors at distances 2^i in the ring | Repeatedly jump to the farthest node in the routing table whose id is still less than the search key
CAN | Structured, mesh topology | Neighbors at adjacent positions in the mesh | Repeatedly travel through the neighbor that is closer to the destination
Pastry & Tapestry | Structured, PRR tree topology | Neighbors sharing common prefix identifier at different levels | Repeatedly forward the message to the neighbor having the longest matching prefix identifier
Viceroy | Structured, butterfly topology | Five neighbors: one at the upper level, two at the lower level, and two at the same level | Three steps: going up, going down, and vicinity search
Crescendo | Structured, hierarchical ring topology | Chord-like neighbors at different ring levels | A combination of Chord-like routing and the routing between rings at different levels

P2P Routing Conclusion: Overview II.
System | Overlay network | Routing table | Routing method
Skip Graph | Structured, multiple linked lists topology | Neighbors sharing common prefix membership vector at different lengths | Travel from the highest to the lowest level of the list. At each level, jump to the neighbor closer to the destination if such a neighbor exists
SkipNet | Structured, hierarchical ring topology | Neighbors are predecessors and successors at different ring levels | Skip Graph-like routing, traveling from the highest to the lowest level of the ring
P-Grid | Structured, binary tree topology | A neighbor at the other side of the tree rooted at each internal node from the root to the leaf | Travel from the root to the leaf. At each level, jump to the neighbor closer to the destination
P-Tree | Structured, a combination of a B+-Tree and a Chord ring topology | Neighbors are nodes in the left-most root-to-leaf path of the B+-Tree | Travel from the root to the leaf. At each level, jump to the neighbor closer to the destination
BATON | Structured, balanced tree topology | Neighbors are parent, children and Chord-like neighbors at the same level | If not having full routing tables, go to parent. Otherwise, go to the neighbor or the child closer to the destination
Edutella & Ultrapeers | Hybrid, a combination of structured and unstructured topology | Neighbors exist only at superpeer level. At client side, each client peer connects to a superpeer | A client peer always routes its requests to its superpeer, while routing at the superpeer level depends on the topology employed at that level

Information Sources
Further information:
o Q. H. Vu et al. Peer-to-Peer Computing: Principles and Applications. Springer, 2010
o Milojicic et al. Peer-to-Peer Computing. HP Labs, 2002
o D. C. Verma. Legitimate Applications of P2P Networks. Wiley, 2004
o X. Shen, H. Yu, J. Buford, M. Akon. Handbook of Peer-to-Peer Networking. Springer, 2010
o J. Buford, H. Yu, E. K. Lua. P2P Networking and Applications. Morgan Kaufmann, 2009
Acknowledgement:
o Prepared with the use of Dr. Kevin Vella's lecture: Introduction to Peer-to-Peer Computing