P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues.

p2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues

p2p, Fall 06 2 Agenda για σήμερα 1. Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper) 4. Epidemic Algorithms for Updates (Demers et al paper)

p2p, Fall 06 3 Term Projects  Εργασίες τριών τύπων  Έχουν κάποιο «ερευνητικό» χαρακτήρα – χρειάζεται να σκεφτείτε  Δεν υπάρχει μία λύση (άρα την ίδια εργασία παραπάνω από μια ομάδες)  Θα ήθελα 3 άτομα ανά ομάδα  Αν έχετε κάποια άλλη ιδέα – γίνεται αλλά όχι «αυτόματα»  Θα φτιάξετε μια web σελίδα για το project – την οποία θα μου στείλετε  replicate content and not index (for durability)!!!

p2p, Fall 06 4 Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ I ================= Θα επιλέξετε ένα άρθρο από μια λίστα από άρθρα Τα άρθρα αφορούν προβλήματα διαχείρισης δεδομένων είτε σε κεντρικοποιημένα συστήματα είτε σε κατανεμημένα συστήματα χωρίς τις ιδιότητες των συστημάτων ομοτίμων. Στόχος της εργασίας είναι η σχεδίαση μια εκδοχής του προβλήματος κατάλληλης για ένα σύστημα ομοτίμων κόμβων. Η εργασία σας θα πρέπει να περιέχει μια μορφή αξιολόγησης της προσέγγισής σας. Αυτή μπορεί να είναι θεωρητική (πχ, εκτίμηση πολυπλοκότητας της λύσης, απόδειξη της ορθότητας ή άλλων ιδιοτήτων (πχ εξισορρόπιση φορτίου) της λύσης) ή/και να περιλαμβάνει μια μικρή υλοποίηση. Θα παραδώσετε ένα άρθρο που θα έχει την μορφή ερευνητικής εργασίας (θα δοθούν οδηγίες). Επίσης, θα παρουσιάσετε την εργασία σας στο μάθημα (θα δοθούν οδηγίες).

p2p, Fall 06 5 Term Projects Άρθρα για τις Εργασίες Τύπου Ι [1-3] Διαλέξτε οποιοδήποτε (ένα) από τα sections 3, 4 ή 5 από το: M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell C. Staelin and A. Yu. Mariposa: A Wide-Area Distributed Database System. VLDB J., 5(1), 1996, 48-63. [4] Εξετάστε πως το παρακάτω που συζητήσαμε στο μάθημα μπορεί να προσαρμοστεί για p2p: A. J. Demers, D. H. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. E. Sturgis, D. C. Swinehart, D. B. Terry: Epidemic Algorithms for Replicated Database Maintenance. PODC 1987: 1-12 [5] Θεωρείστε μια κατανεμημένη (p2p) εκδοχή ενός bitmap index. Για τα bitmap indexes μπορείτε να συμβουλευτείτε οποιοδήποτε βιβλίο βάσεων δεδομένων ή/και το παρακάτω P. E. O'Neil and D. Quass. Improved Query Performance with Variant Indexes. Proc. SIGMOD Conference, 1997, 38-49. [6] Εξετάστε πως το παρακάτω που αφορά sensor networks μπορεί να εφαρμοστεί σε p2p συστήματα D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks, DMSN Workshop, 2005.

p2p, Fall 06 6 Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ ΙΙ =================== Θα επιλέξετε ένα άρθρο που αφορά θέματα της περιοχής των συστημάτων ομότιμων κόμβων που δεν έχουμε καλύψει στο μάθημα, συγκεκριμένα: (i) security, (ii) trust/reputation, (iii) incentives, (iv) publish-subscribe συστήματα. Παρουσίαση του άρθρου στο μάθημα. (α) προτείνετε κάποια επέκταση του άρθρου, πχ εφαρμογή του σε άλλο τύπο overlay, βελτίωση κάποιου χαρακτηριστικού του κλπ. Σε αυτήν την περίπτωση, θα πρέπει να συμπεριλάβετε και κάποια μορφή αξιολόγησης της επέκτασης. Αυτή μπορεί να είναι θεωρητική (πχ, εκτίμηση πολυπλοκότητας της λύσης κλπ) ή/και να περιλαμβάνει μια μικρή υλοποίηση, είτε (β) να υλοποιήσετε ένα ικανοποιητικό κομμάτι του άρθρου. Θα παραδώσετε ένα άρθρο που θα έχει την μορφή ερευνητικής εργασίας (θα δοθούν οδηγίες). Επίσης, θα δώσετε μια δεύτερη παρουσίαση στο μάθημα αυτή τη φορά της εργασία σας (θα δοθούν οδηγίες).

p2p, Fall 06 7 Term Projects Άρθρα για τις Εργασίες Τύπου ΙI Security E. Sit and R. Morris: Security Considerations for Peer-to-Peer Distributed Hash Tables. IPTPS 2002: 261-269 D. S. Wallach: A Survey of Peer-to-Peer Security Issues. ISSS 2002: 42-57 Incentives M. Feldman, K. Lai, I. Stoica and J. Chuang: Robust incentive techniques for peer-to-peer networks. ACM Conference on Electronic Commerce 2004: 102-111 Trust/Reputation S. D. Kamvar, M. T. Schlosser, H. Garcia-Molina: The Eigentrust algorithm for reputation management in P2P networks. WWW 2003: 640-651 Publish/subscribe M. Bender, S. Michel, S. Parkitny, and G. Weikum A Comparative Study of Pub/Sub Methods in Structured P2P Networks. DBISP2P 2006, Seoul, South Korea, Springer, 2006

p2p, Fall 06 8 Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ ΙIΙ =================== Θα επιλέξετε ένα από τα συστήματα που αφορούν λογισμικό συστημάτων ομότιμων κόμβων. Θα πρέπει να εγκαταστήσετε το σχετικό λογισμικό και να κατασκευάσετε μια μικρή εφαρμογή. Θα παραδώσετε ένα άρθρο που θα περιλαμβάνει ένα σύντομο εγχειρίδιο για το σύστημα και μια περιγραφή της εφαρμογή σας. Επίσης, θα παρουσιάσετε την εργασία σας στο μάθημα (θα δοθούν οδηγίες). Η παρουσίαση θα πρέπει να περιλαμβάνει και ένα σύντομο demo.

p2p, Fall 06 9 Term Projects Τα Συστήματα για τις Εργασίες Τύπου IΙI [1] OpenDHT OpenDHT is a publicly accessible distributed hash table (DHT) service.OpenDHT [2] P2: Declarative Networking: P2 is a system which uses a high-level declarative language to express overlay networks in a highly compact and reusable formP2: Declarative Networking [3] PeerSim: PeerSim is a simulation environment for P2P protocols in java.PeerSim

p2p, Fall 06 10 Term Projects Προθεσμίες: Δεκ 7: Σχηματισμός ομάδων και επιλογή εργασίας Δεκ 14: 1-2 σελίδες "πρόταση εργασίας" (project proposal) (θα δοδούν οδηγίες) Δεκ 21: πιθανών να έχουμε μια μικρή παρουσίαση/συζήτηση των εργασιών την τελευταία εβδομάδα πριν τα Χριστούγεννα Ιαν 11: Παρουσιάσεις άρθρων Ομάδας ΙΙ Ιαν 18: " " Ιαν 25: Παράδοση Εργασίας (για το άρθρο, θα δοθούν οδηγίες) Θα υπάρχει ένα τελικό workshop που θα παρουσιαστούν οι εργασίες όλων των ομάδων.

p2p, Fall 06 11 Agenda για σήμερα 1.Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper) 4. Epidemic Algorithms for Updates (Demers et al paper)

p2p, Fall 06 12 Types of Replication Two types of replication  Metadata/Index: replicate index entries  Data/Document replication: replicate the actual data (e.g., music files) Metadata vs Data (+) “Lighter” storage and bandwidth wise (+) Sizes of replicated objects more uniform (-) Adds an extra hop for actually getting the data (-) More frequent updates (-) Less durability/availability

p2p, Fall 06 13 Types of Replication Caching vs Replication Cache: Store data retrieved from a previous request (client- initiated) Replication: More proactive, a copy of a data item may be stored at a node even if the node has not requested it

p2p, Fall 06 14 Reasons for Replication Reasons for replication  Performance load balancing locality: place copies close to the requestor geographic locality (more choices for the next step in search) reduce number of hops  Availability In case of failures Peer departures

p2p, Fall 06 15 Reasons for Replication Besides storage, cost associated with replication: Consistency Maintenance Make reads faster in the expense of slower writes

p2p, Fall 06 16 No proactive replication (Gnutella) –Hosts store and serve only what they requested –A copy can be found only by probing a host with a copy Proactive replication of “keys” (= meta data + pointer) for search efficiency (FastTrack, DHTs) Proactive replication of “copies” – for search and download efficiency, anonymity. (Freenet)

p2p, Fall 06 17 Issues Which items (data/metadata) to replicate Based on popularity In traditional distributed systems, also rate of read/write cost benefit: the ratio: read-savings/write-increase Where to replicate (allocation schema)

p2p, Fall 06 18 Issues How/When to update Both data items and metadata

p2p, Fall 06 19 “Database-Flavored” Replication Control Protocols Lets assume the existence of a data item x with copies x 1, x 2, …, x n x: logical data item x i ’s: physical data items A replication control protocol is responsible for mapping each read/write on a logical data item (R(x)/W(x)) to a set of read/writes on a (possibly) proper subset of the physical data item copies of x

p2p, Fall 06 20 One Copy Serializability Correctness A DBMS for a replicated database should behave like a DBMS managing a one-copy (i.e., non-replicated) database insofar as users can tell One-copy serializable (1SR) the schedule of transactions on a replicated database be equivalent to a serial execution of those transactions on a one- copy database One-copy schedule: replace operation of data copies with operations on data items

p2p, Fall 06 21 ROWA Read One/Write All (ROWA) A replication control protocol that maps each read to only one copy of the item and each write to a set of writes on all physical data item copies. Even if one of the copies is unavailable an update transaction cannot terminate

p2p, Fall 06 22 Write-All-Available Write-all-available A replication control protocol that maps each read to only one copy of the item and each write to a set of writes on all available physical data item copies.

p2p, Fall 06 23 Quorum-Based Voting Read quorum V r and a write quorum V w to read or write a data item If a given data item has a total of V votes, the quorums have to obey the following rules: 1.V r + V w > V 2.V w > V/2 Rule 1 ensures that a data item is not read or written by two transactions concurrently (R/W) Rule 2 ensures that two write operations from two transactions cannot occur concurrently on the same data item (W/W)

p2p, Fall 06 24 Distributing Writes Immediate writes Deffered writes Access only one copy of the data item, it delays the distribution of writes to other sites until the transaction has terminated and is ready to commit. It maintains an intention list of deferred updates After the transaction terminates, it send the appropriate portion of the intention list to each site that contains replicated copies Optimizations – aborts cost less – may delay commitment – delays the detection of conflicts Primary or master copy Updates at a single copy per item

p2p, Fall 06 25 Eager vs Lazy Replication Eager replication: keeps all replicas synchronized by updating all replicas in a single transaction Lazy replication: asynchronously propagate replica updates to other nodes after the replicating transaction commits In p2p, lazy replication (or soft state)

p2p, Fall 06 26 Update Propagation Stateless or State-full (the “item-owners” know which nodes holds copies of the item) Who initiates the update:  Push by the server item (copy) that changes  Pull by the client holding the copy

p2p, Fall 06 27 Update Propagation When  Periodic  Immediate  Lazy: when an inconsistency is detected  Threshold-based: Freshness (e.g., number of updates or actual time) Value  Expiration-Time: Items expire (become invalid) after that time (most often used in p2p)  Adaptive periodic: Reduce or increase period based on the updates seen between two successive updates Stateless or State-full (the “item-owners” know which nodes holds copies of the item)

p2p, Fall 06 28 Path- length Neighbor state Total path latency Per-hop latency volume Multiple routes replicas Dimensions (d) O(dn 1/d )O(d)  --  - Realities (r)  O(r)  -  MAXPEERS (p) O(1/p)O(p)  O(p)*  Hash functions (k) --  -Ο(k)-O(k) RTT- weighted routing --  --- Uniform partitioning heuristic Reduced variance Reduces variance -- Reduced variance -- Summary: Design parameters and performance (CAN) * Only on replicated data

p2p, Fall 06 29 Replication Each node maintain a successor list of its r nearest successors  Upon failure, use the next successor in the list  Modify stabilize to fix the list CHORD: Failures Other nodes may attempt to send requests through the failed node Use alternate nodes found in the routing table of preceding nodes or in the successor list

p2p, Fall 06 30 A lookup fails, if all r nodes in the successor list fail. All fail with probability 2 -r (independent failures) = 1/N Theorem: If we use a successor list with r = Ο(logN) in an initially stable network and then every node fails with probability 1/2, then  with high probability, find_successor returns the closest living successor  the expected time to execute find_successor in the failed network is O(logN) CHORD: Failures

p2p, Fall 06 31 Store replicas of a key at the k nodes succeeding the key Successor list helps to keep the number of replicas per item known Other approach: store a copy per region CHORD: Replication

p2p, Fall 06 32 BATON: Failures Upon node departure or failure, the parent can reconstruct the entries Assume node x fails, any detected failures of x are reported to its parent y y regenerates the routing tables of x – Theorem 2 Messages are routed  Sideways (redundancy similar to CHORD)  Up-down (can find its parent through its neighbors) There is “routing” redundancy

p2p, Fall 06 33 Replication - Beehive  Proactive – model-driven replication  Passive (demand-driven) replication such as caching objects along a lookup path Hint for BATON Beehive The length of the average query path reduced by one when an object is proactively replicated at all nodes logically preceding that node on all query paths BATON Range queries Many paths to data Any ideas?

p2p, Fall 06 35 Replication Theory: Replica Allocation Policies in Unstructured P2P Systems E. Cohen and S. Shenker, “Replication Strategies in Unstructured Peer-to-Peer Networks”. SIGCOMM 2002 Q. Lv et al, “Search and Replication in Unstructured Peer-to- Peer Networks”, ICS’02 – Replication Part

p2p, Fall 06 36 Question: how to use replication to improve search efficiency in unstructured networks? Replication: Allocation Scheme How many copies of each object so that the search overhead for the object is minimized, assuming that the total amount of storage for objects in the network is fixed

p2p, Fall 06 37 Replication Theory - Model Assume m objects and n nodes Each node capacity ρ, total capacity R = n ρ How to allocate R among the m objects? Determine r i number of copies (distinct nodes) that hold a copy of i Σ i=1, m r i = R (R total capacity) Also, p i = r i /R – Fraction of total capacity allocated to I Allocation represented by the vector (p 1, p 2, …. p m ) = (r 1 /R, r 2 /R, r m /R)

p2p, Fall 06 38 Replication Theory - Model Assume that object i is requested with relative rates q i, we normalize it by setting Σ i=1, m q i = 1 For convenience, assume 1 << r i  n and that q 1  q 2  …  q m Map the query distribution q to an allocation vector p

p2p, Fall 06 39 Replication Theory - Model Assume all nodes equal capacity ρ, ρ = R/n R  m (at least one copy per item) m > ρ (else, the problem is trivial, maintain copies of all items everywhere) Bounds for p i At least one copy, r i  1, Lower value l = 1/R At most n copies, r i  n, Upper value, u = n/R

p2p, Fall 06 40 Replication Theory Assume that searches go on until a copy is found We want to determine r i that minimizes the average search size (number of nodes probed) to locate an item i Need to compute average search size per item Searches consist of randomly probing sites until the desired object is found: search at each step draws a node uniformly at random and asks whether it has a copy

p2p, Fall 06 41 Search Example 2 probes 4 probes

p2p, Fall 06 42 Replication Theory The probability Pr(k) that the object I is found at the k’th probe is given Pr(k) = Pr(not found in the previous k-1 probes) Pr(found in one (the kth) probe) = (1 – r i /n ) k-1 * r i /n k (search size: step at which the item is found) is a random variable with geometric distribution and θ = r i /n => expectation n/r i

p2p, Fall 06 43 Replication Theory A i : Expectation (average search size) for object i is the inverse of the fraction of sites that have replicas of the object A i = n/r i The average search size A of all the objects (average number of nodes probed per object query) A = Σ i q i A i = n Σ i q i /r i Minimize: A = n Σ i q i /r i

p2p, Fall 06 44 Replication Theory If we have no limit on r i, replicate everything everywhere Then, the average search size A i = n/r i = 1 Search becomes trivial How to allocate these R replicas among the m objects: how many replicas per object Assume a limit on R and that the average number of replicas per site ρ = R/n is fixed

p2p, Fall 06 45 Replication Theory Minimize: Σ i q i /p i Subject to Σp i = 1 and l  p i  u Monotonicity Since q 1  q 2  …  q m, we must have p 1  p 2  …  p m More copies to more popular, but how many?

p2p, Fall 06 46 Uniform Replication Create the same number of replicas for each object r i = R/m Average search size for uniform replication A i = n/r i = m/ρ A uniform = Σ i q i m/ρ = m/ρ (m n/R) Which is independent of the query distribution

p2p, Fall 06 47 Proportional Replication Create a number of replicas for each object proportional to the query rate r i = R q i It makes sense to allocate more copies to objects that are frequently queried, this should reduce the search size for the more popular objects

p2p, Fall 06 48 Proportional Replication Number of replicas for each object: r i = R q i Average search size for uniform replication A i = n/r i = n/R q i A proportioanl = Σ i q i n/R q i = m/ρ = A uniform again independent of the query distribution Why? Objects whose query rate are greater than average (>1/m) do better with proportional, and the other do better with uniform The weighted average balances out to be the same

p2p, Fall 06 49 Uniform and Proportional Replication Summary: Uniform Allocation: p i = 1/m Simple, resources are divided equally Proportional Allocation: p i = q i “Fair”, resources per item proportional to demand Reflects current P2P practices Example: 3 items, q 1 =1/2, q 2 =1/3, q 3 =1/6 UniformProportional

p2p, Fall 06 50 Space of Possible Allocations q i+1 /q i ? p i+1 /p i As the query rate decreases, how much does the ratio of allocated replicas behave Reasonable: p i+1 /p i  1 =1 for uniform So what is the optimal way to allocate replicas so that A is minimized?

p2p, Fall 06 51 Space of Possible Allocations Definition: Allocation p 1, p 2, p 3,…, p m is “in-between” Uniform and Proportional if for 1< i <m, q i+1 /q i < p i+1 /p i < 1 (=1 for uniform, = for proportial, we want to favor popular but not too much) Theorem1: All (strictly) in-between strategies are (strictly) better than Uniform and Proportional Theorem2: p is worse than Uniform/Proportional if for all i, p i+1 /p i > 1 (more popular gets less) OR for all i, q i+1 /q i > p i+1 /p i (less popular gets less than “fair share”) Proportional and Uniform are the worst “reasonable” strategies

p2p, Fall 06 52 q 2 /q 1 p 2 /p 1 Space of allocations on 2 items Worse than prop/uni More popular item gets less. Worse than prop/uni More popular gets more than its proportional share Better than prop/uni Uniform Proportional SR

p2p, Fall 06 53 So, what is the best strategy?

p2p, Fall 06 54 Square-Root Replication Find r i that minimizes A, A = Σ i q i A i = n Σ i q i /r i This is done for r i = λ √q i where λ = R/ Σ i √ q i Then the average search size is A optimal = 1/ρ ( Σ i √q i ) 2

p2p, Fall 06 55 How much can we gain by using SR ? Zipf-like query rates A uniform /A SR

p2p, Fall 06 56 Other Metrics: Discussion Utilization rate, the rate of requests that a replica of an object i receives U i = R q i /r i  For uniform replication, all objects have the same average search size, but replicas have utilization rates proportional to their query rates  Proportional replication achieves perfect load balancing with all replicas having the same utilization rate, but average search sizes vary with more popular objects having smaller average search sizes than less popular ones

p2p, Fall 06 57 Replication: Summary

p2p, Fall 06 58 Pareto Distribution (for the queries)

p2p, Fall 06 59 Pareto Distribution (for the queries) Both model Power-law distributions Zipf: what is the size (popularity) of the r-th ranked -- y ~ r -b Pareto: how many have size > r (look at the frequency distribution) P[X > x] ~ x -k P[X = x] ~ x -(k+1) = x -a "The r-th hottest item has n queries" is equivalent to saying "r items have n or more queries". This is exactly the definition of the Pareto distribution, except the x and y axes are flipped. Whereas for Zipf, we have r (rank) and compute n, in Pareto we have n and compute r (rank) Reference: http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html

p2p, Fall 06 60 Replication (summary) Each object i is replicated on r i nodes and the total number of objects stored is R, that is Σ i=1, m r i = R (1) Uniform: All objects are replicated at the same number of nodes r i = R/m (2) Proportional: The replication of an object is proportional to the query probability of the object r i  q i (3) Square-root: The replication of an object i is proportional to the square root of its query probability q i r i  √q i

p2p, Fall 06 61 What is the search size of a query ?  Soluble queries: number of probes until answer is found.  Insoluble queries: maximum search size  Query is soluble if there are sufficiently many copies of the item.  Query is insoluble if item is rare or non existent. Assumption that there is at least one copy per object

p2p, Fall 06 62 SR is best for soluble queries Uniform minimizes cost of insoluble queries OPT is a hybrid of Uniform and SR Tuned to balance cost of soluble and insoluble queries: uniformly allocate a minimum number of copies per item, use SR for the rest What is the optimal strategy?

p2p, Fall 06 63 UniformSR 10^4 items, Zipf-like w=1.5 All Soluble 85% Soluble All Insoluble

p2p, Fall 06 64 We now know what we need. How do we get there?

p2p, Fall 06 65 Replication Algorithms Fully distributed where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search. Converge to/obtain SR allocation when query rates remain steady. Uniform and Proportional are “easy” – Uniform: When item is created, replicate its key in a fixed number of hosts. – Proportional: for each query, replicate the key in a fixed number of hosts (need to know or estimate the query rate) Desired properties of algorithm:

p2p, Fall 06 66 Replication - Implementation Two strategies are popular Owner Replication When a search is successful, the object is stored at the requestor node only (used in Gnutella) Path Replication When a search succeeds, the object is stored at all nodes along the path from the requestor node to the provider node (used in Freenet) Following the reverse path back to the requestor

p2p, Fall 06 67 Achieving Square-Root Replication How can we achieve square-root replication in practice?  Assume that each query keeps track of the search size  Each time a query is finished the object is copied to a number of sites proportional to the number of probes On average object i will be replicated on c n/r i times each time a query is issued (for some constant c) It can be shown that this gives square root

p2p, Fall 06 68 Replication - Conclusion Thus, for Square-root replication an object should be replicated at a number of nodes that is proportional to the number of probes that the search required

p2p, Fall 06 69 Replication - Implementation If a p2p system uses k-walkers, the number of nodes between the requestor and the provider node is 1/k of the total nodes visited (number of probes) Then, path replication should result in square-root replication Problem: Tends to replicate nodes that are topologically along the same path

p2p, Fall 06 70 Replication - Implementation Random Replication When a search succeeds, we count the number of nodes on the path between the requestor and the provider Say p Then, randomly pick p of the nodes that the k walkers visited to replicate the object Harder to implement

p2p, Fall 06 71 Achieving Square-Root Replication What about replica deletion? Steady state: creation time equal with the deletion time The lifetime of replicas must be independent of object identity or query rate FIFO or random deletions is ok LRU or LFU no

p2p, Fall 06 72 Replication: Evaluation Study the three replication strategies in the Random graph network topology Simulation Details Place the m distinct objects randomly into the network Query generator generates queries according to a Poisson process at 5 queries/sec Zipf-distribution of queries among the m objects (with a = 1.2) For each query, the initiator is chosen randomly Then a 32-walker random walk with state keeping and checking every 4 steps Each sites stores at most objAllow (40) objects Random Deletion Warm-up period of 10,000 secs Snapshots every 2,000 query chunks

p2p, Fall 06 73 Replication: Evaluation For each replication strategy  What kind of replication ratio distribution does the strategy generate?  What is the average number of messages per node in a system using the strategy  What is the distribution of number of hops in a system using the strategy

p2p, Fall 06 74 Evaluation: Replication Ratio Both path and random replication generates replication ratios quite close to square-root of query rates

p2p, Fall 06 75 Evaluation: Messages Path replication and random replication reduces the overall message traffic by a factor of 3 to 4

p2p, Fall 06 76 Evaluation: Hops Much of the traffic reduction comes from reducing the number of hops Path and random, better than owner For example, queries that finish with 4 hops, 71% owner, 86% path, 91% random

p2p, Fall 06 77 Summary Random Search/replication Model: probes to “random” hosts Proportional allocation – current practice Uniform allocation – best for insoluble queries Soluble queries: Proportional and Uniform allocations are two extremes with same average performance Square-Root allocation minimizes Average Search Size OPT (all queries) lies between SR and Uniform SR/OPT allocation can be realized by simple algorithms.

p2p, Fall 06 78 Discussion Cohen et al paper Path replication overshoots or undershoot the fixed point if queries arrive in large bursts or time between search and subsequent copy generator is large – more involved algorithms than path replication Extensions for variable size issues or nodes with heterogeneous capacities Many issues: Other types of graphs, adaptability, etc …

p2p, Fall 06 80 Replication & Unstructured P2P epidemic algorithms

p2p, Fall 06 81 Replication Policy  How many copies  Where (owner, path, random path) Update Policy  Synchronous vs Asynchronous  Master Copy

p2p, Fall 06 82 Methods for spreading updates: Push: originate from the site where the update appeared To reach the sites that hold copies Pull: the sites holding copies contact the master site Expiration times Epidemics for spreading updates

p2p, Fall 06 83 Update at a single site Randomized algorithms for distributing updates and driving replicas towards consistency Ensure that the effect of every update is eventually reflected to all replicas: Sites become fully consistent only when all updating activity has stopped and the system has become quiescent Analogous to epidemics A. Demers et al, Epidemic Algorithms for Replicated Database Maintenance, SOSP 87

p2p, Fall 06 84 Methods for spreading updates: Direct mail : each new update is immediately mailed from its originating site to all other sites (+) Timely & reasonably efficient (-) Not all sites know all other sites (stateless) (-) Mails may be lost Anti-entropy : every site regularly chooses another site at random and by exchanging content resolves any differences between them (+) Extremely reliable but requires exchanging content and resolving updates (-) Propagates updates much more slowly than direct mail

p2p, Fall 06 85 Methods for spreading updates: Rumor mongering :  Sites are initially “ignorant”; when a site receives a new update it becomes a “hot rumor”  While a site holds a hot rumor, it periodically chooses another site at random and ensures that the other site has seen the update  When a site has tried to share a hot rumor with too many sites that have already seen it, the site stops treating the rumor as hot and retains the update without propagating it further Rumor cycles can be more frequent that anti-entropy cycles, because they require fewer resources at each site, but there is a chance that an update will not reach all sites

p2p, Fall 06 86 Anti-entropy and rumor spreading are examples of epidemic algorithms Three types of sites:  Infective: A site that holds an update that is willing to share is hold  Susceptible: A site that has not yet received an update  Removed: A site that has received an update but is no longer willing to share Anti-entropy: simple epidemic where all sites are always either infective or susceptible

p2p, Fall 06 87 A set S of n sites, each storing a copy of a database The database copy at site s  S is a time varying partial function s.ValueOf: K  {u:V x t :T} set of keys set of values set of timestamps (totally ordered by < V contains the element NIL s.ValueOf[k] = {NIL, t}: item with k has been deleted from the database Assume, just one item s.ValueOf  {u:V x t:T} thus, an ordered pair consisting of a value and a timestamp The first component may be NIL indicating that the item was deleted by the time indicated by the second component

p2p, Fall 06 88 The goal of the update distribution process is to drive the system towards  s, s’  S: s.ValueOf = s’.ValueOf Operation invoked to update the database Update[u:V] s.ValueOf {r, Now{})

p2p, Fall 06 89 Direct Mail At the site s where an update occurs: For each s’  S PostMail[to:s’, msg(“Update”, s.ValueOf) Each site s’ receiving the update message: (“Update”, (u, t)) If s’.ValueOf.t < t s’.ValueOf  (u, t)  The complete set S must be known to s (stateful server)  PostMail messages are queued so that the server is not delayed (asynchronous), but may fail when queues overflow or their destination are inaccessible for a long time  n (number of sites) messages per update  traffic proportional to n and the average distance between sites s originator of the update s’ receiver of the update

p2p, Fall 06 90 Anti-Entropy At each site s periodically execute: For some s’  S ResolveDifference[s, s’] Three ways to execute ResolveDifference : Push (sender (server) - driven) If s.Valueof.t > s’.Valueof.t s’.ValueOf  s.ValueOf Pull (receiver (client) – driven) If s.Valueof.t < s’.Valueof.t s.ValueOf  s’.ValueOf Push-Pull s.Valueof.t > s’.Valueof.t  s’.ValueOf  s.ValueOf s.Valueof.t < s’.Valueof.t  s.ValueOf  s’.ValueOf s  s’ s pushes its value to s’ s pulls s’ and gets s’ value

p2p, Fall 06 91 Anti-Entropy Assume that  Site s’ is chosen uniformly at random from the set S  Each site executes the anti-entropy algorithm once per period It can be proved that  An update will eventually infect the entire population  Starting from a single affected site, this can be achieved in time proportional to the log of the population size

p2p, Fall 06 92 Anti-Entropy Let p i be the probability of a site remaining susceptible (has not received the update) after the i cycle of anti-entropy For pull, A site remains susceptible after the i+1 cycle, if (a) it was susceptible after the i cycle and (b) it contacted a susceptible site in the i+1 cycle p i+1 = (p i ) 2 For push, A site remains susceptible after the i+1 cycle, if (a) it was susceptible after the i cycle and (b) no infectious site choose to contact in the i+1 cycle p i+1 = p i (1 – 1/n) n(1-p i ) 1 – 1/n (site is not contacted by a node) n(1-p i ) number of infectious nodes at cycle i Pull is preferable than push

p2p, Fall 06 93 Anti-Entropy More next week

P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues.

Παρόμοιες παρουσιάσεις

Παρουσίαση με θέμα: "P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues."— Μεταγράφημα παρουσίασης:

Παρόμοιες παρουσιάσεις

Σχετικά με το έργο

Σχόλια

Είσοδος

Σύνδεση μέσω των κοινωνικών δικτύων:

P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues.

Παρόμοιες παρουσιάσεις

Παρουσίαση με θέμα: "P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues."— Μεταγράφημα παρουσίασης:

Παρόμοιες παρουσιάσεις

Σχετικά με το έργο

Σχόλια