Brice Nédelec, Pascal Molli & Achour Mostefaoui

1 Sequence CRDT: A Scalable Sequence Encoding for Massive Collaborative Editing
Brice Nédelec, Pascal Molli & Achour Mostefaoui
GDD – LINA – University of Nantes
Workshop on Highly-Scalable Distributed Systems
Wednesday 14 January 2015, Paris, France.

2 Distributed Collaborative Editors
Distributed collaborative editors allow people to work together while distributed in space, time and organizations.
Examples: Google Doc, Etherpad, Google Wave…
190M users on GDrive (including GDoc).

3 Google Doc is great, but...
Single point of failure: if the provider is down -> no collaboration.
Privacy, economic intelligence: what if Google searches for ANR on 15 October ;) ?
Mass editing: Google limits the number of simultaneous users (50); beyond 50 -> just readers.

4 Is it possible to build a fully decentralized editor that supports 1M simultaneous users?
Why?
Because it is hard ;) “We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard.” Kennedy, 1962
Because it can also be useful: mass collaboration -> MOOCs, webinars, events. Google Wave has already been used like that…

5 Distributed Collaborative Editors Principles (OT or CRDT)
Based on optimistic replication algorithms.
Operations are generated locally: no lock, no communication with other sites.
Operations are then broadcast to the other sites; every operation is eventually delivered.
Operations are re-executed when received.
The system is correct if it ensures causality, convergence and “intention preservation” (OT definition), i.e. it preserves partial orders in the sequence.

6 Principles of Sequence CRDT
Encode the order of the sequence in the IDs of the elements (remember ;) ):
10 LET B=A
15 FOR I=1 TO 27
20 LET A=A*A
21 NEXT I
Arghh, I forgot LET B=B^2 before NEXT I. No way to use 20,5 ??
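The line-number analogy can be made concrete with a minimal sketch. It assumes identifiers are modelled as tuples of integers compared lexicographically; the tuple representation and the `render` helper are illustrative choices, not something taken from the slides.

```python
# Illustrative only: identifiers are tuples of integers, ordered lexicographically,
# so (20,) < (20, 5) < (21,), just as a "line 20,5" would sit between 20 and 21.
program = {
    (10,): "LET B=A",
    (15,): "FOR I=1 TO 27",
    (20,): "LET A=A*A",
    (21,): "NEXT I",
}

# The forgotten statement gets a longer identifier; nothing else is renumbered.
program[(20, 5)] = "LET B=B^2"


def render(elements):
    """Return the statements sorted by identifier."""
    return [elements[identifier] for identifier in sorted(elements)]


for statement in render(program):
    print(statement)
```

The price of never renumbering is that identifiers can grow in length, which is the cost the rest of the talk tries to bound.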

7 Insert alpha between p and q
Create an id for alpha.
Create a disambiguator for alpha (so that path + disambiguator is unique).
The space and time complexity of a Sequence CRDT is mainly decided here !!
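A minimal sketch of why the disambiguator matters, assuming it is a (site, counter) pair appended to the path. The `Identifier` type, the site names and the counters are hypothetical. Two sites concurrently pick the same path between p and q, and the disambiguator still gives every replica the same total order.

```python
from typing import NamedTuple, Tuple


class Identifier(NamedTuple):
    path: Tuple[int, ...]  # position chosen between p and q
    site: str              # hypothetical unique site name
    counter: int           # hypothetical per-site operation counter


def apply_all(operations):
    """Replay insert operations on an empty replica and return its text."""
    replica = {}
    for identifier, char in operations:
        replica[identifier] = char
    # NamedTuples compare field by field: path first, then site, then counter.
    return "".join(replica[i] for i in sorted(replica))


p = (Identifier((2,), "A", 1), "p")
q = (Identifier((4,), "A", 2), "q")

# Sites A and B concurrently insert between p and q and pick the same path (3,).
from_a = (Identifier((3,), "A", 3), "a")
from_b = (Identifier((3,), "B", 1), "b")

# The two replicas receive the concurrent operations in different orders.
replica_1 = apply_all([p, q, from_a, from_b])
replica_2 = apply_all([p, q, from_b, from_a])
print(replica_1, replica_2)
```

Both replicas end up with the same text whatever the delivery order, because ties on the path are broken identically everywhere.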

8 Scientific problem
Write an ID allocation strategy for sequence elements that is independent of the insertion order.
There are many ways to type “QWERTY”: how to compute the smallest IDs for each character, whatever the insertion order?

9 Problem: Order of Insertions
Typed: Q;W;E;R;T;Y
Typed: Y;T;R;E;W;Q
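The effect of insertion order can be sketched with a deliberately naive allocator. It always takes the first free slot right after p, in a fixed base of 32. This is an assumed strawman for illustration, not the strategy of any particular Sequence CRDT, and `alloc_naive`, `BASE`, `BEGIN` and `END` are hypothetical names.

```python
BASE = 32                       # assumed fixed arity at every level
BEGIN, END = (0,), (BASE - 1,)  # sentinel identifiers bounding the document


def alloc_naive(p, q):
    """Return the path right after p that is still strictly before q."""
    path = []
    bounded_by_q = True  # True while the new path shares its prefix with q
    depth = 0
    while True:
        low = p[depth] if depth < len(p) else 0
        high = q[depth] if bounded_by_q and depth < len(q) else BASE
        if high - low > 1:          # a free slot exists at this level
            path.append(low + 1)
            return tuple(path)
        path.append(low)            # no room here: go one level deeper
        if high != low:
            bounded_by_q = False    # we are now strictly below q
        depth += 1


def type_at_end(text):
    """Each character is inserted after the previous one (Q;W;E;R;T;Y)."""
    doc, previous = {}, BEGIN
    for char in text:
        previous = alloc_naive(previous, END)
        doc[previous] = char
    return doc


def type_at_front(text):
    """Each character is inserted at the very beginning (Y;T;R;E;W;Q)."""
    doc, following = {}, END
    for char in text:
        following = alloc_naive(BEGIN, following)
        doc[following] = char
    return doc


forward = type_at_end("QWERTY")
backward = type_at_front("YTREWQ")
for name, doc in (("end", forward), ("front", backward)):
    text = "".join(doc[i] for i in sorted(doc))
    print(name, text, sorted(len(i) for i in doc))
```

Both runs produce the same text, but with this strawman the front-insertion run needs one more level per character while the end-insertion run stays at a single level.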

17 Combine Exponential tree & random allocation
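A minimal single-replica sketch of this combination, under stated assumptions: the arity doubles at each level of the tree (the "exponential tree"), and each level is randomly assigned one of two strategies, allocating near p or near q. The constants `BASE_BITS` and `BOUNDARY`, the seeded generator and all names are illustrative assumptions. Disambiguators and deletions are left out.

```python
import random

BASE_BITS = 4   # assumption: level k (0-indexed) has 2 ** (BASE_BITS + k) slots
BOUNDARY = 10   # assumption: maximum distance from p (or q) for a new identifier


def base(level):
    """Number of slots at a given level; it doubles at each level."""
    return 1 << (BASE_BITS + level)


def prefix(path, depth):
    """First `depth` digits of path, zero-padded, as a mixed-radix integer."""
    value = 0
    for level in range(depth):
        digit = path[level] if level < len(path) else 0
        value = value * base(level) + digit
    return value


def to_path(value, depth):
    """Inverse of prefix: turn the integer back into `depth` digits."""
    digits = []
    for level in reversed(range(depth)):
        value, digit = divmod(value, base(level))
        digits.append(digit)
    return tuple(reversed(digits))


class Allocator:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.strategy = {}  # depth -> True: stay close to p, False: close to q

    def alloc(self, p, q):
        """Return a path strictly between p and q (p < q in tuple order)."""
        depth, interval = 0, 0
        while interval < 1:  # find the first depth with at least one free slot
            depth += 1
            interval = prefix(q, depth) - prefix(p, depth) - 1
        step = min(BOUNDARY, interval)
        if depth not in self.strategy:  # random strategy, fixed per depth
            self.strategy[depth] = self.rng.random() < 0.5
        offset = self.rng.randint(1, step)
        if self.strategy[depth]:
            value = prefix(p, depth) + offset
        else:
            value = prefix(q, depth) - offset
        return to_path(value, depth)


BEGIN, END = (0,), (base(0) - 1,)

# Monotonic editing: always append just before END.
allocator = Allocator(seed=1)
ids = [BEGIN, END]
for _ in range(1000):
    ids.insert(len(ids) - 1, allocator.alloc(ids[-2], ids[-1]))
print("longest path after 1000 appends:", max(len(i) for i in ids))
```

Because the strategy is drawn at random per level, a level whose strategy does not suit the editing direction is exhausted quickly, but the next level is twice as large and may draw the other strategy.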

18 LSEQ Complexities
O((log n)^2) -> avoids having to rebalance IDs…

19 Experiments
We built the CRATE Editor [1]:
LSEQ for ID allocation
Gossip for broadcast
Anti-entropy for missed deliveries
Interval version vectors for causal reception [2]
[1] https://github.com/Chat-Wane/CRATE.git
[2] M. Mukund, G. Shenoy R., S. Suresh, Optimized or-sets without ordering constraints, in: M. Chatterjee, J.-n. Cao, K. Kothapalli, S. Rajsbaum (Eds.), Distributed Computing and Networking, Vol of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2014, pp. 227-241. doi: / _15.
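The causal reception step can be illustrated with the classical delivery test on plain version vectors. This is a simplified stand-in, not the interval version vectors of [2] and not CRATE's actual code. The function names and the dictionary representation are assumptions, and gossip and anti-entropy are not modelled.

```python
def is_causally_ready(local, sender, message_vector):
    """True if the message can be delivered at a replica whose vector is `local`.

    The message must be the sender's next operation, and it must not depend
    on operations from other sites that this replica has not delivered yet.
    """
    if message_vector.get(sender, 0) != local.get(sender, 0) + 1:
        return False
    return all(
        count <= local.get(site, 0)
        for site, count in message_vector.items()
        if site != sender
    )


def deliver(local, sender, message_vector):
    """Return the replica's vector after delivering the sender's operation."""
    updated = dict(local)
    updated[sender] = message_vector[sender]
    return updated


local = {"A": 1}
from_b = {"A": 2, "B": 1}   # B's first operation, issued after seeing A's second
from_a = {"A": 2}           # A's second operation

print(is_causally_ready(local, "B", from_b))  # must wait for A's operation
local = deliver(local, "A", from_a)
print(is_causally_ready(local, "B", from_b))
```

A message that is not ready is simply buffered until the operations it depends on have been delivered.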

20 1st Setup
Objective: Validate the space complexity analysis of LSEQ. When the editing behaviour is monotonic, LSEQ has a polylogarithmic upper bound on space complexity with respect to the number of insert operations. When the editing behaviour is random, LSEQ has a logarithmic space complexity.
Setup: A single machine with 2 peers. Peers globally produce 166 char/s to create a doc of chars. Monotonic behaviour.
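A rough back-of-the-envelope sketch of where the two bounds in the objective could come from. It assumes that level k of the tree has 2^{b+k} slots, that the strategy at each level suits the editing direction, and it ignores disambiguators. The symbols b, c and d are introduced here for illustration; this is not the authors' proof.

```latex
% Bits needed for a path that reaches depth d:
\mathrm{bits}(d) = \sum_{k=1}^{d} (b + k) = bd + \frac{d(d+1)}{2} = O(d^2)

% Monotonic editing: level k absorbs a number of insertions proportional to 2^{b+k}.
n \approx c \sum_{k=1}^{d} 2^{b+k}
  \;\Rightarrow\; d = O(\log n)
  \;\Rightarrow\; \mathrm{bits} = O\big((\log n)^2\big)

% Random editing: the tree fills evenly, so about all slots down to depth d are used.
n \approx \prod_{k=1}^{d} 2^{b+k} = 2^{\,bd + d(d+1)/2}
  \;\Rightarrow\; d = O\big(\sqrt{\log n}\big)
  \;\Rightarrow\; \mathrm{bits} = O(\log n)
```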

21 Evaluation

22 2nd Setup
Objective: Show that CRATE scales in terms of the number of peers. In other words, the size of the network does not impact the space complexity upper bound of messages.
Setup: On Grid'5000, the number of peers grows from 2 to 450, with 166 char/s uniformly distributed among the peers.

24 3rd Setup
Objective: Show that concurrency does not negatively impact the size of identifiers. Hence, scenarios without concurrency show the upper bound on the size of identifiers.
Setup: A single machine emulates 10 peers using the CRATE application. 10000 characters are inserted at 3 ins/s, uniformly distributed among the peers. 5 runs with approximately the following latencies: 0.02 ms, 100 ms, 500 ms, 1 s and 10 s.

27 Conclusions
LSEQ allows computing IDs for sequence CRDTs with an upper bound of O((log n)^2).
The number of peers and concurrency do not negatively impact the performance of CRATE.
One million users is reachable…
Nédelec, B., Molli, P., Mostefaoui, A., & Desmontils, E. (2013, September). LSEQ: an adaptive structure for sequences in distributed collaborative editing. In Proceedings of the 2013 ACM symposium on Document engineering (pp ). ACM.
Nédelec, B., Molli, P., Mostefaoui, A., & Desmontils, E. (2013). Concurrency Effects Over Variable-size Identifiers in Distributed Collaborative Editing. In Proceedings of the International workshop on Document Changes: Modeling, Detection, Storage and Visualization, Florence, Italy, September 10, 2013 (Vol. 1008, pp. 0-7).

28 Perspectives
Deploy a 1M editor on a network of browsers:
1M users
Editing 1M characters…
And measure performance.
In progress, nearly ready…

