
1 A Statistical Physics approach for Modeling P2P Systems
Giovanna Carofiglio (1), R. Gaeta (2), M. Garetto (1), P. Giaccone (1), E. Leonardi (1), M. Sereno (2)
(1) Politecnico di Torino, (2) Università di Torino, Italy
MAMA Workshop, joint with ACM SIGMETRICS 2005, Banff, June 6-10

2 MAMA Workshop, Sigmetrics '05 Outline
- Motivation
- Basic Model
- Extended Model
- Content Search
- Download effects

3 P2P System Architecture
[Figure labels: peers; clients, server]
- A possible definition: decentralized, self-organizing distributed systems in which all or most communication is symmetric.

4 Peer-to-Peer traffic
- P2P is the single largest generator of traffic
- P2P traffic significantly outweighs web traffic
- P2P traffic is continuing to grow

5 P2P Applications
- Communication
  - Voice over IP: Skype
  - Instant Messaging
- Distributed Computation
  - UnitedDevices, Distributed Science
- File Sharing
  - BitTorrent, KaZaA, Gnutella, eDonkey, Napster, etc.
- DHTs
  - Chord, CAN, Pastry, Tapestry
- Wireless Ad hoc Networking

6 Motivation
- Most Internet traffic is generated by p2p applications.
- Performance studies of p2p systems may be useful to drive the design of future applications.
- Analytical models help analyze large and complex p2p networks.

7 Modeling techniques
- Traditional Markov models: a detailed microscopic description is provided, but with a huge state space. It is computationally expensive to analyze large systems such as p2p systems (with millions of users and shared contents).
- Fluid models: network dynamics are described at a higher level of abstraction, neglecting stochastic information. Scalability: the model is based on a set of differential equations that is invariant with respect to the size of the network (number of users, link capacity).

8 Model description
- We model a generic p2p system without focusing on a particular implementation.
- Starting from a fluid approach as in [1] and [2], our model extends it to a second-order diffusion approximation in which the stochasticity of the network dynamics plays a relevant role.
- The model provides a description of user and content dynamics both in the transient and in steady state.
[1] F. Clevenot, P. Nain, "A Simple Model for the Analysis of SQUIRREL", Infocom 2004, Hong Kong, Mar
[2] D. Qiu, R. Srikant, "Modeling and Performance Analysis of BitTorrent like Peer-to-Peer Networks", Sigcomm 2004, U.S.A.

9 Model structure
[Diagram blocks: Users dynamics, Contents dynamics, Search phase, Download phase]

10 Outline
- Motivation
- Basic Model
- Extended Model
- Content Search
- Download effects

11 Users dynamics (1)
- The number of users in the p2p network changes dynamically according to:
  - Enter-leave dynamics: λ_u = new users' arrival rate; 1/μ_u = average subscription time
  - Active-sleeping mode: 1/μ_as = average active time; 1/μ_sa = average sleeping time
- Users in sleeping mode do not interact at all with the other users of the community.

12 Users dynamics (2)
The evolution of the number of users in active or sleeping mode, U_a and U_s respectively, can be described by two fluid differential equations.
[Equations not preserved in the transcript. Labeled terms: sleeping users who become active; new users; active users who become sleeping (labeled twice); active users who leave the system.]
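The two fluid equations can be illustrated numerically. The sketch below is only a plausible reading of the labeled terms, assuming that new users join in active mode and that only active users leave: dU_a/dt = λ_u + μ_sa U_s − (μ_as + μ_u) U_a and dU_s/dt = μ_as U_a − μ_sa U_s. The function name `simulate_users`, the forward-Euler scheme and the step size are illustrative choices, not taken from the slides; the parameter values are those listed for the content disappearance case.

```python
def simulate_users(lam_u, mu_u, mu_as, mu_sa, ua0, us0, t_end, dt=1.0):
    """Forward-Euler integration of an ASSUMED two-state fluid model.

    ua: active users, us: sleeping users. The equation form is a
    reconstruction from the slide's term labels, not the authors' formula.
    """
    ua, us = float(ua0), float(us0)
    steps = int(t_end / dt)
    for _ in range(steps):
        # Assumption: arrivals enter as active; only active users leave.
        d_ua = lam_u + mu_sa * us - (mu_as + mu_u) * ua
        d_us = mu_as * ua - mu_sa * us
        ua += dt * d_ua
        us += dt * d_us
    return ua, us


# Parameters from the content disappearance case (rates in 1/s).
ua, us = simulate_users(lam_u=0.1, mu_u=1 / 4000, mu_as=1 / 400,
                        mu_sa=1 / 400, ua0=10, us0=10, t_end=200_000)
print(ua, us)
```

Under these assumed equations the fixed point is U_a = λ_u/μ_u = 400 and U_s = (μ_as/μ_sa) U_a = 400, which the integration approaches for large t.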

13 Content Dynamics
The evolution of the number of available copies of a content is driven by two phenomena:
- the generation of new copies (downloads or off-on transitions)
- the cancellation of existing copies
Parameters: θ = average request rate; 1/μ_h, 1/μ'_h = average content holding time for active/sleeping users.
Note: p_s = p_s(μ'_h) is the probability that sleeping users have the considered content when they become active.

14 Brownian Motion
- Content dynamics are modelled through a second-order diffusion approximation.
- Each content is a particle with instantaneous position x(t) moving according to a Brownian motion (Langevin equation).
- The evolution of the pdf f(x,t) over time follows the Fokker-Planck equation.
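The slide's own equations are not preserved in the transcript. For orientation, the textbook one-dimensional forms are sketched below, written with the drift m(x,t) and variance σ²(x,t) that the deck uses as diffusion parameters. The unit white-noise term ξ(t) and the Itô convention are assumptions introduced here; the authors' notation may differ.

```latex
\begin{align}
\frac{dx}{dt} &= m(x,t) + \sigma(x,t)\,\xi(t)
  && \text{(Langevin equation)} \\
\frac{\partial f(x,t)}{\partial t} &=
  -\frac{\partial}{\partial x}\bigl[m(x,t)\,f(x,t)\bigr]
  + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[\sigma^{2}(x,t)\,f(x,t)\bigr]
  && \text{(Fokker-Planck equation)}
\end{align}
```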

15 Content diffusion equation
- The pdf F(x,t) of the number of copies follows the Fokker-Planck equation, with a term for the introduction of new contents in the system and with boundary conditions: [equations not preserved in the transcript]
- A content can disappear when there are no more copies available. The rate at which a content disappears is: [expression not preserved in the transcript]

16 Diffusion Parameters
- h_h = coefficient of variation of the holding time
- h_r = coefficient of variation of the inter-request time
- m(x,t) expresses the average speed at which the content-particle moves along the x axis.
- The variance σ²(x,t) expresses the burstiness of the processes.

17 Case: Content disappearance (1)
- In a single-content scenario we study the probability that the content disappears as a function of the users' dynamics.
Network parameters:
- λ_u = users' arrival rate = 0.1 users/s
- 1/μ_u = avg subscription time = 4000 s
- 1/μ_as = avg active period = 400 s
- 1/μ_sa = avg sleeping period = 400 s
- θ = average request rate
- 1/μ_h, 1/μ'_h = avg content holding time for active/sleeping users = 100 s
Initial condition:
- Active users = 10
- Sleeping users = 10
- Available copies = 1

18 Case: Content disappearance (2)
Which plot do we show? The model compared with Michele's simulator? The model only?

19 Outline
- Motivation
- Basic Model
- Extended Model
- Content Search
- Download effects

20 Dual distribution
- Relations between users' and contents' dynamics:
  - The number of active and sleeping users at time t
  - The number of copies available at time t

21 Dual equations
- G_a(x,t) and G_s(x,t) are the pdfs of the number of active and sleeping users having x contents: [equations not preserved in the transcript]
- Labeled terms: new users; active users who become sleeping or leave the system; sleeping users who become active

22 Diffusion parameters
- As for the content diffusion equation, m(x,t) expresses the average speed at which the copy-particle moves along the x axis, while σ²(x,t) expresses the variance of the associated process.
- r_a = rate of generation of new copies
- d_a, d_s = rate of cancellation of existing copies (active/sleeping users)

23 Multi-contents case (1)
- In a multi-content scenario, still assuming ideal search and download, we study the steady-state distribution of the contents among users.
Network parameters:
- λ_u = users' arrival rate = 0 users/s
- 1/μ_u = avg subscription time = infinite
- 1/μ_as = avg active period = 6 h
- 1/μ_sa = avg sleeping period = 18 h
- θ = average request rate = 2 contents/h
- λ_c = contents' introduction rate = 1/600 contents/s
- 1/μ_h, 1/μ'_h = avg content holding time for active/sleeping users = 10 h, 8 h
Initial condition:
- Active users = 2500
- Sleeping users = 7500
- Available copies = 1

24 Multi-contents case (2)
Which plots do we show? The model compared with Michele's simulator? The model only?

25 Outline
- Motivation
- Basic Model
- Extended Model
- Content Search
- Download effects

26 The contents' transfer rate
- In a non-ideal p2p system the transfer rate of the contents changes dynamically according to:
  - p_hit(x,t), the probability of a successful search (related to content diffusion and the search algorithm)
  - p_down(x,t), the probability of a successful download (related to network congestion, user impatience, on-off dynamics)
- The effective retrieval rate becomes: [expression not preserved in the transcript]
- Computing both the search and the download probabilities requires F(x,t) as a function of time.

27 Search Phase
- Search algorithm: flooding in an unstructured p2p network. For each content request, a query message is forwarded to all the neighbors up to distance max_ttl.
- Graph model: the P2P network topology is modeled as a finite random graph. We consider a Generalized Random Graph (GRG) to allow an arbitrary vertex degree distribution.
[Figure legend: active peer; application-level connection]
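TTL-limited flooding can be sketched as a breadth-first search that stops expanding at distance max_ttl. The code below is a minimal illustration under simplifying assumptions: the overlay is an undirected graph given as an adjacency dictionary, every peer forwards the query once, and message duplication and timing are ignored. The names `flood_search`, `adjacency` and `holders` are hypothetical.

```python
from collections import deque


def flood_search(adjacency, holders, source, max_ttl):
    """Return True if a peer within max_ttl hops of source holds the content.

    adjacency: dict mapping each peer to an iterable of its neighbors.
    holders:   set of peers that store a copy of the content.
    """
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node != source and node in holders:
            return True  # query hit
        if dist == max_ttl:
            continue  # TTL exhausted: do not forward further
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return False


# Path overlay 0 - 1 - 2 - 3 with the only copy stored at peer 3.
overlay = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(flood_search(overlay, {3}, source=0, max_ttl=2))
print(flood_search(overlay, {3}, source=0, max_ttl=3))
```

In the example the copy sits three hops away, so the search fails with max_ttl = 2 and succeeds with max_ttl = 3.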

28 GRG Model
- Given the probability distribution {p_k} that a vertex has k edges departing from it, we can define the generating function: [equation not preserved in the transcript]
- It can be shown that the generating function of the number of first neighbors with a copy of the content is: [equation not preserved in the transcript], where α = x/U_a, x = number of copies, U_a = number of active users.
- The composition of these generating functions gives the generating function of the number of neighbors at distance h.
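The slide's formulas are not preserved. A hedged sketch of the standard construction follows, assuming that each neighbor independently holds a copy with probability α = x/U_a; the symbols G_0 and G_0^{(c)} are introduced here for illustration and may not match the authors' notation.

```latex
\begin{align}
G_0(z) &= \sum_{k \ge 0} p_k\, z^{k}
  && \text{(degree generating function)} \\
G_0^{(c)}(z) &= \sum_{k \ge 0} p_k\,\bigl(1 - \alpha + \alpha z\bigr)^{k}
  = G_0\bigl(1 - \alpha + \alpha z\bigr)
  && \text{(first neighbors holding a copy)}
\end{align}
```

The second line is the usual thinning argument: each of the k neighbors contributes a factor (1 − α + αz), so the probability that no first neighbor holds a copy is G_0(1 − α).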

29 GRG Topology
- To compute the pdf of the GRG node degree we adopt an M/M/∞ queue, assuming that an external observer joins the network: the number of customers in the queue corresponds to the number of connections established by the observer.
- Now we can define the generating function for the number of neighbors at distance up to max_ttl that have a copy of the content: [expression not preserved in the transcript]
- Hence the hit probability follows: [expression not preserved in the transcript]
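Since the stationary number of customers in an M/M/∞ queue is Poisson distributed, one way to illustrate the hit probability is with a Poisson degree distribution. The sketch below is not the authors' formula: it assumes a locally tree-like graph, a Poisson degree with mean `mean_degree`, and copies placed independently with probability alpha = x/U_a. The function names are hypothetical.

```python
import math


def poisson_pgf(z, mean_degree):
    """Generating function of a Poisson degree distribution."""
    return math.exp(mean_degree * (z - 1.0))


def hit_probability(alpha, mean_degree, max_ttl):
    """Probability that at least one peer within max_ttl hops holds a copy.

    Assumes a tree-like neighborhood; for a Poisson degree the excess-degree
    distribution equals the degree distribution, so one pgf is used at
    every level of the recursion.
    """
    # q: probability that a branch explored to the remaining depth has no copy.
    q = 1.0
    for _ in range(max_ttl):
        # A neighbor contributes no hit if it has no copy (1 - alpha)
        # and none of its own deeper branches has one (q).
        q = poisson_pgf((1.0 - alpha) * q, mean_degree)
    return 1.0 - q


for ttl in (1, 2, 3):
    print(ttl, hit_probability(alpha=0.01, mean_degree=4.0, max_ttl=ttl))
```

For max_ttl = 1 the recursion reduces to 1 − exp(−mean_degree · alpha), the probability that at least one first neighbor holds a copy; larger TTL values can only increase the result.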

30 Outline
- Motivation
- Basic Model
- Extended Model
- Content Search
- Download effects

31 Download Phase
Assumptions:
- The transport network is ideal.
- Infinite bandwidth on the client side.
- The peer from which the desired content is downloaded is chosen randomly among those storing that content.
The download dynamics at each peer are modelled by an M/G/1-PS queue.
Problem: the download request rate arriving at peers is not known a priori! It depends on:
- The distribution of contents at peers
- The policy used by the system to distribute the load among peers

32 Probability of successful download (1)
Single Content Case
- Let θ be the popularity of a content present in x copies in a network with U_a active peers.
- Download request rate: [expression not preserved in the transcript]
- Assuming that the requests form a Poisson process, the queue becomes an M/G/1-PS with average delay: [expression not preserved in the transcript]
- Given a download rate y = θ s p_hit, the probability of successful download is: [expression not preserved in the transcript]
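The average delay expression is not preserved in the transcript. As a reference point, the classical mean sojourn time of an M/G/1 processor-sharing queue is E[T] = E[S] / (1 − ρ) with ρ = arrival rate × E[S], and it depends on the service-time distribution only through its mean. The sketch below computes it; treating the per-peer download request rate as `arrival_rate` and the mean transfer time as `mean_service_time` is an assumption made for illustration.

```python
def mg1_ps_mean_sojourn(arrival_rate, mean_service_time):
    """Mean sojourn time of an M/G/1-PS queue: E[S] / (1 - rho)."""
    rho = arrival_rate * mean_service_time  # server utilization
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    return mean_service_time / (1.0 - rho)


# A peer receiving 0.5 requests per time unit, each needing 1 time unit alone.
print(mg1_ps_mean_sojourn(arrival_rate=0.5, mean_service_time=1.0))
```

How this delay maps to the probability of a successful download (for example through user impatience) is specified on the original slide and is not reproduced here.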

33 Probability of successful download (2)
Multiple Content Case
- From F(x) we derive the probability that a peer has k contents, present in x copies (F(x) is the pdf of the number of copies available for the content): [expression not preserved in the transcript]
- The overall download request rate seen by a peer is: [expression not preserved in the transcript]
- The overall probability of successful download is: [expression not preserved in the transcript]

34 Probability of successful download (3)
- Since all Z(x) are independent, we can approximate the distribution of Y around its average with a normal distribution.
- The probability of successful download becomes: [expression not preserved in the transcript]
Notes:
- m_y and σ_y are the first two moments of Y.
- The integral is restricted to the interval [not preserved in the transcript] for numerical reasons.

35 Conclusions
- We defined a stochastic fluid model of a p2p system able to describe user and content dynamics in both the transient and the stationary regime.
- A supporting model makes it possible to consider the effects of search and download on system performance.
Work in progress:
- Analytical solution of the equations in steady state
- Model extension to classes of different users
- Model extension to classes of different contents
- Comparison between model and simulations in realistic scenarios

36 Thank you!

