Data Structures in Ethereum

Slides:



Advertisements
Similar presentations
1 abstract containers hierarchical (1 to many) graph (many to many) first ith last sequence/linear (1 to 1) set.
Advertisements

Binary Trees, Binary Search Trees CMPS 2133 Spring 2008.
Data Compressor---Huffman Encoding and Decoding. Huffman Encoding Compression Typically, in files and messages, Each character requires 1 byte or 8 bits.
Data Structures Michael J. Watts
Design a Data Structure Suppose you wanted to build a web search engine, a la Alta Vista (so you can search for “banana slugs” or “zyzzyvas”) index say.
Tirgul 8 Universal Hashing Remarks on Programming Exercise 1 Solution to question 2 in theoretical homework 2.
1 CS 430: Information Discovery Lecture 4 Data Structures for Information Retrieval.
1 abstract containers hierarchical (1 to many) graph (many to many) first ith last sequence/linear (1 to 1) set.
Data Structures & Algorithms Radix Search Richard Newman based on slides by S. Sahni and book by R. Sedgewick.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
Different Tree Data Structures for Different Problems
Comp 249 Programming Methodology Chapter 15 Linked Data Structure - Part B Dr. Aiman Hanna Department of Computer Science & Software Engineering Concordia.
Compiled by: Dr. Mohammad Alhawarat BST, Priority Queue, Heaps - Heapsort CHAPTER 07.
Brought to you by Max (ICQ: TEL: ) February 5, 2005 Advanced Data Structures Introduction.
Binary Trees, Binary Search Trees RIZWAN REHMAN CENTRE FOR COMPUTER STUDIES DIBRUGARH UNIVERSITY.
12.1 Chapter 12: Indexing and Hashing Spring 2009 Sections , , Problems , 12.7, 12.8, 12.13, 12.15,
Outline Binary Trees Binary Search Tree Treaps. Binary Trees The empty set (null) is a binary tree A single node is a binary tree A node has a left child.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 9.
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
Sets of Digital Data CSCI 2720 Fall 2005 Kraemer.
Binary Search Trees (BSTs) 18 February Binary Search Tree (BST) An important special kind of binary tree is the BST Each node stores some information.
Week 10 - Friday.  What did we talk about last time?  Graph representations  Adjacency matrix  Adjacency lists  Depth first search.
Advanced Data Structure By Kayman 21 Jan Outline Review of some data structures Array Linked List Sorted Array New stuff 3 of the most important.
Week 15 – Wednesday.  What did we talk about last time?  Review up to Exam 1.
"Teachers open the door, but you must enter by yourself. "
COMP261 Lecture 23 B Trees.
Trees Chapter 15.
COMP 53 – Week Fourteen Trees.
Different Tree Data Structures for Different Problems
Data Structures Michael J. Watts
Tries 07/28/16 11:04 Text Compression
CPS216: Data-intensive Computing Systems
Week 7 - Friday CS221.
Top 50 Data Structures Interview Questions
Record Storage, File Organization, and Indexes
COMP 53 – Week Eleven Hashtables.
IP Routers – internal view
Mark Redekopp David Kempe
Extra: B+ Trees CS1: Java Programming Colorado State University
Lecture 18. Basics and types of Trees
Hashing Exercises.
Compsci 201 Priority Queues & Autocomplete
abstract containers sequence/linear (1 to 1) hierarchical (1 to many)
Radix search trie (RST) R-way trie (RT) De la Briandias trie (DLB)
Digital Search Trees & Binary Tries
Binary Trees, Binary Search Trees
Binary Tree and General Tree
Chapter 22 : Binary Trees, AVL Trees, and Priority Queues
Map interface Empty() - return true if the map is empty; else return false Size() - return the number of elements in the map Find(key) - if there is an.
Chapter 8 – Binary Search Tree
Trees Lecture 12 CS2110 – Spring 2018.
CS222P: Principles of Data Management Notes #6 Index Overview and ISAM Tree Index Instructor: Chen Li.
Part-D1 Priority Queues
Ch 6: Heapsort Ming-Te Chi
Data Structures and Algorithms
Bin Sort, Radix Sort, Sparse Arrays, and Stack-based Depth-First Search CSE 373, Copyright S. Tanimoto, 2002 Bin Sort, Radix.
Data Structures and Algorithms for Information Processing
Digital Search Trees & Binary Tries
Searching CLRS, Sections 9.1 – 9.3.
Binary Search Trees.
Database Design and Programming
Binary Trees, Binary Search Trees
Bin Sort, Radix Sort, Sparse Arrays, and Stack-based Depth-First Search CSE 373, Copyright S. Tanimoto, 2001 Bin Sort, Radix.
CS222/CS122C: Principles of Data Management Notes #6 Index Overview and ISAM Tree Index Instructor: Chen Li.
Heapsort Sorting in place
AVL-Trees (Part 1).
Bitcoin Data Structures
Tree A tree is a data structure in which each node is comprised of some data as well as node pointers to child nodes
Binary Trees, Binary Search Trees
CS222/CS122C: Principles of Data Management UCI, Fall 2018 Notes #05 Index Overview and ISAM Tree Index Instructor: Chen Li.
Presentation transcript:

Data Structures in Ethereum CS1951 L Spring 2019 14 February 2019 Maurice Herlihy Brown University

Review: Ethereum Storage 2256-1 A very, very big array Initially all 0s 32 bytes

Simple Variable declaration T v; value location v v’s slot

Fixed-Size Arrays declaration T[10] v; value location V[1] (v’s slot)+i*(sizeof T)

Dynamic Arrays declaration T[10] v; value location v[i] hash(v’s slot)+i*(sizeof T) v.length v’s slot

Mappings declaration Mapping(T1 => T2) v; value location v[i] hash(i.(v’s slot)) https://programtheblockchain.com/posts/2018/03/09/understanding-ethereum-smart-contract-storage/

To Learn More … https://programtheblockchain.com/posts/2018/03/09/understanding-ethereum-smart-contract-storage/

OK, We Lied to You We don’t really have an array with 2256 elements Instead we keep data structures in a … Key-value store: address:  value So we don’t have to store all those zeros

Key-Value Pairs do:0 dog:1 dax:2 dodo:3 cat:4 cats:5

Radix Trie a z b y do:0 dog:1 dax:2 dodo:3 cat:4 cats:5 One possible child for each letter Each path spells a key word Value at end of path Search time = key length

Radix Trie d c a o a x g d t o s do:0 dog:1 dax:2 dodo:3 cat:4 cats:5 x g d t 2 1 4 o s 3 5

Radix Trie Search for “dog” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “dog” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “dog” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “dog” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “dog” d c a o a x g d t o s dax:2 dodo:3 cat:4 cats:5 a o a x g d t 2 1 4 o s Search successful, bound to 1 3 5

Radix Trie Search for “ca” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “ca” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “ca” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search for “ca” d c a o a x g d t o s Search failed, do:0 dog:1 dax:2 dodo:3 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5 Search failed, Node contains no value

Radix Trie Search for “zoo” d c a o a x g d t o s do:0 dog:1 dax:2 cat:4 cats:5 a o a x g d t 2 1 4 o s 3 5

Radix Trie Search failed, No path Search for “zoo” d c a o a x g g d t do:0 dog:1 dax:2 dodo:3 cat:4 cats:5 a o a x g g d t 2 2 1 1 4 4 o s 3 3 5 5

Radix Trie d c One more optimization a o a x g d t 2 1 4 o s 3 5

Radix Trie d cat s a o x x g g do Combine degenerate edges One more optimization 4 s a o 5 x x g g do 2 1 3

Actual Ethereum Patricia Trie hash key value 1 2 3 4 10 11 12 13 14 15 data

BTW: Patricia Tries PATRICIA = "Practical Algorithm To Retrieve Information Coded In Alphanumeric" “Trie” is pronounced either try or tree

BTW: Recursive Length Prefix Binary data structure RLP Byte string

BTW: Recursive Length Prefix [ "cat", "dog" ] RLP 0xc8, 0x83,'c','a','t',0x83,'d','o','g'

Alternative Data Structures Patricia Merkle trees not only choice Find accounts Find storage contents by address

Efficient Operations 1 2 8 7 10 5 O(log n) searches, updates Even when adversary chooses keys!

Small Proofs of Inclusion h03 h01 h23 h0 h1 d1 Constant factors matter too

Small Proofs of Non-Inclusion Proof that radix trie does not contain key “ca” d c h1 a h2

Structural Integrity same content different shape 1 8 2 10 7 5 1 2 7 5 8 10 Data structure shape depends on contents only

Structural Integrity 1 8 2 10 7 5 1 Same hashes for Replication 2 7 5 8 10 Data structure shape depends on contents only

Structural Integrity 1 8 2 10 7 5 1 Same hashes for Replication 2 Concurrency 7 5 8 10 Data structure shape depends on contents only

Structural Integrity 1 8 2 10 7 5 1 Same hashes for Replication 2 Concurrency 7 Debugging 5 8 10 Data structure shape depends on contents only

Sharding 1 2 8 7 5 10

Sharding 1 2 8 7 5 10

Sharding 1 2 8 7 5 10 Parallelism: per-shard threads

Sharding 1 2 8 7 5 10 Parallelism: per-shard threads Load-balancing

Sharding 1 2 8 7 10 Parallelism: per-shard threads 5 Load-balancing Must minimize communication

Fork-Join Parallelism Integer fib(Integer n) { if (n > 2) { left = new Future( fib(n-1) ); right = fib(n-2); return left.get() + right; } else { return 1; }}}

Fork-Join Parallelism Integer fib(Integer n) { if (n > 2) { left = new Future( fib(n-1) ); right = fib(n-2); return left.get() + right; } else { return 1; }}} If convenient, create a thread to evaluate expression

Fork-Join Parallelism Integer fib(Integer n) { if (n > 2) { left = new Future( fib(n-1) ); right = fib(n-2); return left.get() + right; } else { return 1; }}} Wait for thread to join, or evaluate sequentially

Fork-Join Parallelism Efficient “Work-Stealing” schedulers Parallelism adapts to resources No locks, only fork + join

What Makes a Good Smart Contract Data Structure? Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Radix or Patricia Trie Constant in n Log in key length Efficient lookup against adversary Structural integrity Constant in n Log in key length Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Radix or Patricia Trie Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Radix or Patricia Trie ¯\_(ツ)_/¯ Asymptotically good Efficient lookup against adversary Structural integrity ¯\_(ツ)_/¯ Short proof of inclusion Short proof of non-inclusion Asymptotically good Constants suboptimal Sharding Parallelism

Radix or Patricia Trie ¯\_(ツ)_/¯ ¯\_(ツ)_/¯ Same as proof of inclusion Efficient lookup against adversary Structural integrity ¯\_(ツ)_/¯ Short proof of inclusion Short proof of non-inclusion ¯\_(ツ)_/¯ Sharding Same as proof of inclusion Parallelism

Each subtree on own node Radix or Patricia Trie Efficient lookup against adversary Structural integrity ¯\_(ツ)_/¯ Short proof of inclusion Short proof of non-inclusion ¯\_(ツ)_/¯ Sharding Parallelism Each subtree on own node

Seems pretty sequential Radix or Patricia Trie Efficient lookup against adversary Structural integrity ¯\_(ツ)_/¯ Short proof of inclusion Short proof of non-inclusion ¯\_(ツ)_/¯ Sharding Parallelism Seems pretty sequential

Review: Merkle Tree h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) h0 =H(d0) d0 h1 =H(d1) d1 h2 =H(d2) d2 h3 =H(d3) d3

Review: Proof of d1 Inclusion h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) h0 =H(d0) h1 =H(d1) h2 =H(d2) h3 =H(d3) d0 d1 d2 d3

Proof of d5 Non-Inclusion? h03 =H(h01,h23) Can’t prove a value not included h01 =H(h0,h1) h23 =H(h2,h3) Except by displaying entire tree h0 =H(d0) d0 h1 =H(d1) d1 h2 =H(d2) d2 h3 =H(d3) d3 55

Sorted Merkle Tree One leaf for each possible value Leaf is value or 

Sorted Merkle Tree for 00..11 h03 =H(h01,h23) h01 =H(h0,h1)  

Sorted Merkle Tree h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) 00 00, 11 included   11

Sorted Merkle Tree h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) 01, 10 not included h0 =H(00) h1 =H() h2 =H() h3 =H(11) 00   11

Proof of Inclusion for 00 h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3)   11

Proof of Non-Inclusion for 01 h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) h0 =H(00) h1 =H() h2 =H() h3 =H(11) 00   11

Sparse Merkle Tree h03 =H(h01,h23) h01 =H(h0,h1) h23 =H(h2,h3) 00 h3 =H(11) 11 H()

Sparse Merkle Trees Most leaves are  H(), H(H()), ... can be precomputed Can truncate all all- subtrees Navigation vs re-hashing trade-offs

Sparse Merkle Tree Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Sparse Merkle Tree Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Sparse Merkle Tree Usual Merkle proof Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Usual Merkle proof Parallelism

Sparse Merkle Tree Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Sparse Merkle Tree Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Sparse Merkle Tree Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Binary Search Tree x keys < x keys > x Challenge: balance

Heap 1 2 2 4 3 27 9 Each key < children keys

Treap 1,K 2,J 2,L 4,B 3,H 27,M 9,A Both search tree & heap!

Treap Use key hash as priority? H(K),K H(J),J H(L),L H(B),B H(H),H H(M),M H(A),A Use key hash as priority?

Adding an Item Step1: add as if for search tree H(K),K H(J),J H(L),L H(B),B H(H),H H(M),M H(N),N H(A),A Step1: add as if for search tree

Rotations Step 2: rotate tree to restore heap y x right x c a y C A left a b b c A B B C Step 2: rotate tree to restore heap

Treap Linear depth if adverary controls priorities ¯\_(ツ)_/¯ Efficient lookup against adversary ¯\_(ツ)_/¯ Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Log-depth but only with high probability Parallelism

Treap ¯\_(ツ)_/¯ Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Treap ¯\_(ツ)_/¯ Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Treap ¯\_(ツ)_/¯ Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Treap ¯\_(ツ)_/¯ Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

Treap ¯\_(ツ)_/¯ Efficient lookup against adversary Structural integrity Short proof of inclusion Short proof of non-inclusion Sharding Parallelism

variables, arrays, mappings Ideas we covered in this lecture EVM storage model variables, arrays, mappings radix tries patrica tries sparse Merkle tree treaps