Sequence CRDT: A Scalable Sequence Encoding for Massive Collaborative Editing Brice Nédelec, Pascal Molli & Achour Mostefaoui GDD – LINA – University of.

1 Sequence CRDT: A Scalable Sequence Encoding for Massive Collaborative Editing. Brice Nédelec, Pascal Molli & Achour Mostefaoui, GDD – LINA – University of Nantes. Workshop on Highly-Scalable Distributed Systems, Wednesday 14 January 2015, Paris, France.

2 Distributed Collaborative Editors Distributed collaborative editors let people work together while distributed across space, time and organizations. Examples: Google Docs, Etherpad, Google Wave… 190M users on Google Drive (which includes Google Docs).

3 Google Docs is great, but... Single point of failure: if the provider is down -> no collaboration. Privacy, economic intelligence: what if Google searches for ANR on 15 October ;) ? Mass editing: Google limits the number of simultaneous users… (50); beyond 50 -> just readers.

4 Is it possible to build a fully decentralized editor that supports 1M simultaneous users? Why? Because it is hard ;) – “We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard.” Kennedy, 1962. Because it can also be useful: mass collaboration -> MOOCs, webinars, events; Google Wave has already been used like that…

5 Distributed Collaborative Editors Principles (OT or CRDT) Based on optimistic replication algorithms – Operations are generated locally: no lock, no communication with other sites – Broadcast to other sites: every operation is eventually delivered – Re-executed when received. The system is correct if it ensures causality, convergence and “intention preservation” (OT definition), i.e. it preserves partial orders in the sequence.

6 Principles of Sequence CRDT Encode the order of the sequence in the IDs of its elements (remember BASIC line numbers ;) ): 10 LET B=A 15 FOR I=1 TO … 20 LET A=A*A 21 NEXT I Arghh, I forgot LET B=B^2 before NEXT I: no way to use 20.5??
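The line-number analogy can be made concrete with a small sketch. It assumes rational numbers as identifiers (Python's `fractions.Fraction`), which is an illustration of a dense ID space rather than what a sequence CRDT actually stores; the helper name `between` is hypothetical, and the loop bound on line 15 is left elided as in the slide.

```python
from fractions import Fraction

def between(p, q):
    """Return an identifier strictly between p and q (here simply the midpoint)."""
    assert p < q
    return (p + q) / 2

# The program from the slide, keyed by its line numbers.
program = {
    Fraction(10): "LET B=A",
    Fraction(15): "FOR I=1 TO ...",   # loop bound elided in the source
    Fraction(20): "LET A=A*A",
    Fraction(21): "NEXT I",
}

# With integer line numbers there is no room between 20 and 21.
# With a dense identifier space there always is.
program[between(Fraction(20), Fraction(21))] = "LET B=B^2"
listing = [program[line] for line in sorted(program)]

# The price: inserting again and again at the same spot makes identifiers grow.
ident = Fraction(21)
for _ in range(10):
    ident = between(Fraction(20), ident)
print(listing)
print(ident)  # the denominator doubles with every insertion
```

Order is recovered by sorting on the identifiers alone, so no position index has to be shared between sites. The growing denominator hints at why the way identifiers are allocated matters so much for space.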

7 Insert alpha between p and q – Create an ID (path) for alpha – Create a disambiguator for alpha (so that path + disambiguator is unique). The space and time complexity of a sequence CRDT are mainly decided here!!
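A minimal sketch of the "path + disambiguator" idea, under assumptions of our own: the disambiguator is taken to be a (site, counter) pair, identifiers are compared lexicographically, and the names `Id` and `Replica` are hypothetical. Paths are hand-picked here; how to allocate them is a separate question.

```python
from typing import NamedTuple, Tuple

class Id(NamedTuple):
    path: Tuple[int, ...]  # position in the identifier tree
    site: str              # disambiguator, part 1: who created the element
    counter: int           # disambiguator, part 2: that site's local counter

class Replica:
    def __init__(self):
        self.elements = {}  # Id -> character

    def apply_insert(self, ident, char):
        # Idempotent: receiving the same operation twice changes nothing.
        self.elements[ident] = char

    def text(self):
        # The sequence order is entirely encoded in the identifiers.
        return "".join(self.elements[i] for i in sorted(self.elements))

# 'a' and 'c' already exist; two sites concurrently insert between them
# and happen to pick the same path (2,). The disambiguator keeps the IDs unique.
ops = [
    (Id((1,), "A", 1), "a"),
    (Id((3,), "A", 2), "c"),
    (Id((2,), "A", 3), "x"),
    (Id((2,), "B", 1), "y"),
]

r1, r2 = Replica(), Replica()
for ident, char in ops:
    r1.apply_insert(ident, char)
for ident, char in reversed(ops):  # same operations, different delivery order
    r2.apply_insert(ident, char)
print(r1.text(), r2.text())
```

Both replicas end up with the same text whatever the delivery order, because the final order depends only on the set of identifiers received.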

8 Scientific problem Design an ID allocation strategy for sequence elements that is independent of the insertion order. There are many ways to type “QWERTY”: how can we compute the smallest IDs for each character, whatever the insertion order?

9 Problem: Order of Insertions Typed: Q;W;E;R;T;Y Typed: Y;T;R;E;W;Q
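The effect of insertion order can be reproduced with a toy allocator. This is a sketch under stated assumptions, not the allocator of any particular CRDT: every tree level has the same arity `BASE` (a hypothetical value), and `alloc_left` always picks the free slot closest to the left neighbour.

```python
BASE = 16  # assumed arity of every tree level (hypothetical value)

def to_number(path, depth):
    """Read the first `depth` digits of `path` (zero-padded) as a base-BASE number."""
    n = 0
    for i in range(depth):
        n = n * BASE + (path[i] if i < len(path) else 0)
    return n

def to_path(n, depth):
    """Inverse of to_number: split n into exactly `depth` base-BASE digits."""
    digits = []
    for _ in range(depth):
        n, d = divmod(n, BASE)
        digits.append(d)
    return tuple(reversed(digits))

def alloc_left(p, q):
    """Allocate a path strictly between p and q, right next to p."""
    depth = 1
    while to_number(q, depth) - to_number(p, depth) < 2:
        depth += 1  # no free slot at this depth: go one level deeper
    return to_path(to_number(p, depth) + 1, depth)

BEGIN, END = (0,), (BASE - 1,)  # sentinels delimiting the document

def type_at_end(text):
    """Each character is appended after the previous one (Q;W;E;R;T;Y)."""
    ids, left = {}, BEGIN
    for ch in text:
        left = alloc_left(left, END)
        ids[ch] = left
    return ids

def type_at_front(text):
    """Each character is inserted before the previous one (Y;T;R;E;W;Q)."""
    ids, right = {}, END
    for ch in reversed(text):
        right = alloc_left(BEGIN, right)
        ids[ch] = right
    return ids

forward = type_at_end("QWERTY")
backward = type_at_front("QWERTY")
for name, ids in (("Q;W;E;R;T;Y", forward), ("Y;T;R;E;W;Q", backward)):
    print(name, sum(len(path) for path in ids.values()), "digits in total")
```

Both runs produce the same document, but with this left-biased choice the forward run uses one digit per character while the backward run adds one level per character. A right-biased allocator would show the mirror-image behaviour, which is why a strategy fixed in advance cannot be good for every insertion order.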

17 Combine Exponential tree & random allocation
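A minimal single-replica sketch of that combination, with assumptions labelled: the arity doubles at each level (`2 ** (BASE_BITS + depth - 1)`), each depth is assigned a left-biased or right-biased strategy at random the first time it is used, and the step is drawn at random within `BOUNDARY`. The constants and the names `Lseq`, `alloc` and `insert_many` are hypothetical. In a real deployment all peers would have to agree on the per-depth strategy, which this sketch ignores.

```python
import random

BASE_BITS = 4   # assumed: the first level has 2**4 = 16 slots
BOUNDARY = 10   # assumed maximum distance from the chosen neighbour

def base(depth):
    """Arity of the tree at a given depth (1-indexed); it doubles at each level."""
    return 2 ** (BASE_BITS + depth - 1)

def to_number(path, depth):
    """Read the first `depth` digits of `path` (zero-padded) in the mixed radix."""
    n = 0
    for i in range(depth):
        n = n * base(i + 1) + (path[i] if i < len(path) else 0)
    return n

def to_path(n, depth):
    """Inverse of to_number: split n into exactly `depth` mixed-radix digits."""
    digits = []
    for level in range(depth, 0, -1):
        n, d = divmod(n, base(level))
        digits.append(d)
    return tuple(reversed(digits))

class Lseq:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.left_biased = {}  # depth -> True (allocate near p) or False (near q)

    def alloc(self, p, q):
        """Allocate a path strictly between p and q."""
        depth = 1
        while to_number(q, depth) - to_number(p, depth) < 2:
            depth += 1  # no free slot here: descend into a larger level
        lo, hi = to_number(p, depth), to_number(q, depth)
        if depth not in self.left_biased:
            self.left_biased[depth] = self.rng.random() < 0.5
        step = self.rng.randint(1, min(BOUNDARY, hi - lo - 1))
        n = lo + step if self.left_biased[depth] else hi - step
        return to_path(n, depth)

def insert_many(count, at_end, seed=0):
    """Insert `count` elements, always at the end or always at the front."""
    lseq = Lseq(seed)
    left, right = (0,), (base(1) - 1,)  # sentinels delimiting the document
    ids = []
    for _ in range(count):
        new = lseq.alloc(left, right)
        ids.append(new)
        if at_end:
            left = new
        else:
            right = new
    return ids

for at_end in (True, False):
    ids = insert_many(1000, at_end)
    print("at end" if at_end else "at front", "max depth:", max(map(len, ids)))
```

A level whose random strategy matches the editing direction absorbs a number of insertions proportional to its arity, and a mismatched level is only a small detour because the next level is twice as large. Neither editing direction has to be privileged in advance.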

18 LSEQ Complexities O((log n)^2) -> avoids the need to rebalance IDs…
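A back-of-the-envelope argument for this bound, under assumptions that are not stated on the slide: the arity at depth k is 2^{b+k} (it doubles at each level of the exponential tree, so one path component at depth k costs b+k bits), and under monotonic editing each level absorbs a number of insertions proportional to its arity (constant c) before allocation has to descend one level.

```latex
\begin{align*}
\text{bits for an ID of depth } d
  &= \sum_{k=0}^{d-1} (b + k) = d\,b + \frac{d(d-1)}{2} = O(d^2), \\
\text{insertions absorbed by } d \text{ levels}
  &\approx \sum_{k=0}^{d-1} c\,2^{b+k} = c\,2^{b}\,(2^{d} - 1)
  \;\Longrightarrow\; d = O(\log n), \\
\text{hence the size of an ID}
  &= O\big((\log n)^2\big).
\end{align*}
```

With a fixed arity the second line would instead give a depth linear in n, which is the situation where identifiers eventually need to be rebalanced.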

19 Experiments We built the CRATE editor [1] – LSEQ for ID allocation – Gossip for broadcast – Anti-entropy for missed deliveries – Interval version vectors for causal reception [2]. [1] https://github.com/Chat-Wane/CRATE.git [2] M. Mukund, G. Shenoy R., S. Suresh, Optimized OR-sets without ordering constraints, in: M. Chatterjee, J.-n. Cao, K. Kothapalli, S. Rajsbaum (Eds.), Distributed Computing and Networking, Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2014, pp. 227–241.
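To give an idea of what an interval version vector might look like, here is a minimal sketch. It is an assumption-laden illustration, not CRATE's implementation: for each site it stores the counters already received as a sorted list of disjoint intervals, so that operations delivered out of order or more than once by gossip can be recognised. The class and method names are hypothetical.

```python
from collections import defaultdict

class IntervalVersionVector:
    def __init__(self):
        # site -> sorted list of [lo, hi] intervals, disjoint and non-adjacent
        self.intervals = defaultdict(list)

    def seen(self, site, counter):
        """True if the operation (site, counter) was already received."""
        return any(lo <= counter <= hi for lo, hi in self.intervals[site])

    def add(self, site, counter):
        """Record (site, counter); return False if it was a duplicate."""
        if self.seen(site, counter):
            return False
        new = [counter, counter]
        kept = []
        for lo, hi in self.intervals[site]:
            if hi + 1 < new[0] or new[1] + 1 < lo:
                kept.append([lo, hi])  # neither overlapping nor adjacent
            else:
                new = [min(lo, new[0]), max(hi, new[1])]  # merge with neighbour
        kept.append(new)
        kept.sort()
        self.intervals[site] = kept
        return True

vv = IntervalVersionVector()
for counter in (1, 2, 5):         # operation 5 arrives before 3 and 4
    vv.add("A", counter)
print(vv.intervals["A"])          # a gap remains between 2 and 5
print(vv.add("A", 5))             # duplicate delivery is detected
vv.add("A", 3)
vv.add("A", 4)
print(vv.intervals["A"])          # the gap is closed, the intervals collapse
```

When nothing is missing, each site costs a single interval, like a plain version vector entry. Gaps appear only temporarily, and they are the kind of information an anti-entropy round can use to ask for missed operations.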

20 1st Setup Objective: Validate the space complexity analysis of LSEQ. – When the editing behaviour is monotonic, LSEQ has a polylogarithmic upper bound on space complexity with respect to the number of insert operations. – When the editing behaviour is random, LSEQ has a logarithmic space complexity. Setup: – A single machine with 2 peers – Peers globally produce 166 char/s to create a doc of chars – Monotonic behaviour

21 Evaluation

22 2nd Setup Objective: Show that CRATE scales in terms of the number of peers. – In other words, the size of the network does not impact the upper bound on the space complexity of messages. Setup: – On Grid'5000, the number of peers grows from 2 to 450 – 166 char/s uniformly distributed among peers

24 3rd Setup Objective: Show that concurrency does not negatively impact the size of identifiers. Hence, scenarios without concurrency show the upper bound on the size of identifiers. Setup: – A single machine emulates 10 peers using the CRATE application – char at 3 ins/s uniformly distributed among the peers – 5 runs with approximately the following latencies: 0.02ms, 100ms, 500ms, 1s, and 10s.

27 Conclusions LSEQ computes IDs for sequence CRDTs with an upper bound of O((log n)^2) on their size. The number of peers and concurrency do not negatively impact the performance of CRATE. One million users is reachable… Nédelec, B., Molli, P., Mostefaoui, A., & Desmontils, E. (2013, September). LSEQ: an adaptive structure for sequences in distributed collaborative editing. In Proceedings of the 2013 ACM symposium on Document engineering (pp ). ACM. Nédelec, B., Molli, P., Mostefaoui, A., & Desmontils, E. (2013). Concurrency Effects Over Variable-size Identifiers in Distributed Collaborative Editing. In Proceedings of the International workshop on Document Changes: Modeling, Detection, Storage and Visualization, Florence, Italy, September 10, 2013 (Vol. 1008, pp. 0-7).

28 Perspectives Deploy a 1M-user editor on a network of browsers – 1M users – Editing 1M characters… and measure its performance. In progress, nearly ready…

