Maximally Recoverable Local Reconstruction Codes


1 Maximally Recoverable Local Reconstruction Codes
Sivakanth Gopi, Venkatesan Guruswami, Sergey Yekhanin

2 The Cloud
How big is "The Cloud"?
≈0.5 zettabytes of data (1 ZB = 10^9 TB) sit in gigantic data centers, ready to be accessed
≈$100 to rent 1 TB for 1 year
The costs run to ≈$50 billion per year!
Distributed storage
Data is partitioned and stored on individual servers with small storage capacity (a few TB)
Servers don't respond (erasures)
A server may be rebooting, behind a network bottleneck, or completely crashed; we will refer to all of these as 'erasures'

3 Classical erasure coding
(k, k+h)-Reed-Solomon code
Pros:
Can recover from any h erasures
1 + h/k storage cost; k is large to get good efficiency
Field size is linear in the number of servers
Cons:
Need to read from k servers even in the case of a single erasure
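The recovery guarantee above can be sketched in code. Below is a toy Reed-Solomon implementation over a small prime field (the field size, parameters, and function names are illustrative, not from the talk): data symbols sit at evaluation points 0..k−1 of a degree < k polynomial, parities are its evaluations at k..k+h−1, and any k surviving coordinates determine the polynomial.

```python
# Toy (k, k+h)-Reed-Solomon code over a prime field F_P.
# Any k surviving coordinates determine the degree < k polynomial,
# so any h erasures can be recovered.

P = 257  # illustrative prime; must exceed the code length

def lagrange_eval(pts, x0):
    # Evaluate the unique degree < len(pts) polynomial through pts at x0.
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def rs_encode(data, h):
    # Systematic encoding: data at points 0..k-1, parities at k..k+h-1.
    k = len(data)
    pts = list(enumerate(data))
    return data + [lagrange_eval(pts, x) for x in range(k, k + h)]

def rs_recover(codeword, erased, k):
    # Interpolate through any k surviving coordinates.
    survivors = [(x, y) for x, y in enumerate(codeword) if x not in erased][:k]
    return {x: lagrange_eval(survivors, x) for x in erased}

data = [7, 1, 8, 2, 8]           # k = 5
cw = rs_encode(data, h=3)        # n = 8, tolerates any 3 erasures
lost = {1, 4, 6}
fixed = rs_recover(cw, lost, k=5)
assert all(fixed[x] == cw[x] for x in lost)
```

Note the con from the slide is visible here: even a single erasure makes `rs_recover` read k = 5 other coordinates.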

4 Locality To recover an erased server, only need to read from ‘𝑟’ other servers [Gopalan, Huang, Simitci, Yekhanin’11] Example: Has locality ‘2’ to recover any erased coordinate Locally Decodable Codes [Katz, Trevisan’00] Have locality even when a constant fraction of coordinates erased But they need a lot of redundancy and not practical for small length codes! Emphasize that locality is the bottleneck when user requests a small amount of data from an nonresponsive server but not the total bandwith

5 Separate normal and worst case
Efficiency: low storage overhead
Reliability (in the worst case): should be able to recover from as many erasure patterns as possible; this recovery can be slow
Fast recovery (in the normal case): locality, i.e., recover crashed data by reading only a few other servers when a single server crashes
Maximally Recoverable Local Reconstruction Codes (MR LRCs) are designed for exactly this!
Already deployed in Microsoft distributed storage systems, outperforming traditional Reed-Solomon based systems

6 Maximally Recoverable LRCs
a, h are constants; r = n^ε
An (n, r, h, a, q)-MR LRC is a linear code of length n over F_q with:
'h' global parity checks
'a' local parity checks per local group of size 'r'
If there are only 'a' crashes in a local group, the group can be recovered locally
Corrects every erasure pattern that is possible to correct!

7 Correctable erasure patterns
What are the correctable erasure patterns?
'a' erasures per local group + 'h' additional erasures anywhere
"Beyond minimum distance": the minimum distance is 'a + h'
Codes achieving the minimum distance exist with q = O(n) [Tamo, Barg'14]

8 (n=12, r=6, h=2, a=2, q)-MR LRC

9 (n=12, r=6, h=2, a=2, q)-MR LRC
Can correct up to a=2 crashes in a local group by reading any r−a=4 other servers in the same group

10 (n=12, r=6, h=2, a=2, q)-MR LRC
Can correct a=2 erasures per local group + h=2 more anywhere

11 (n=12, r=6, h=2, a=2, q)-MR LRC
Can correct a=2 erasures per local group + h=2 more anywhere

12 Correctable erasure patterns
What are the correctable erasure patterns? ‘𝑎’ erasures per local group + ‘ℎ’ additional erasures anywhere Why are these the only correctable patterns and why do MR LRCs exist?

13 Parity check view of codes
A (k, n)-code can be described by the ℓ = n − k linearly independent parity check equations that its codewords satisfy
An ℓ-erasure pattern is correctable iff the corresponding ℓ columns of the parity check matrix have full rank, i.e., the corresponding ℓ×ℓ minor is non-zero
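This rank criterion can be checked mechanically. A sketch over a prime field, with an illustrative random parity check matrix rather than a construction from the talk:

```python
# An erasure pattern is correctable iff the erased columns of the
# parity check matrix are linearly independent over F_P.
import random

P = 101  # illustrative prime field

def rank_mod_p(rows):
    # Gaussian elimination over F_P.
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)
        rows[rank] = [v * inv % P for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % P for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def correctable(H, erased):
    # Correctable iff the erased columns have full column rank.
    cols = [[row[j] for j in erased] for row in H]
    return rank_mod_p(cols) == len(erased)

random.seed(1)
H = [[random.randrange(1, P) for _ in range(6)] for _ in range(3)]  # 3 checks
assert correctable(H, [0])               # one erasure: a nonzero column
assert not correctable(H, [0, 1, 2, 3])  # 4 erasures exceed the 3 checks
```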

14 MR LRC parity check matrix
Some minors are identically zero in the ∗ entries (trivial minors): the corresponding patterns are uncorrectable
The remaining minors (non-trivial minors) can be made non-zero: correctable

15 Non-trivial minors What are the non-trivial minors?
'a' columns in each local group + 'h' more columns
Is there an assignment of F_q-values to the ∗'s such that all non-trivial minors are non-zero?

16 Does random work? All non-trivial minors should be non-zero
Assign the ∗'s randomly from F_q
By Schwartz-Zippel + union bound, all the required minors are non-zero with high probability if q ≫ n^(an/r + h)
Large fields make encoding/decoding extremely slow!
Ideally we want q = O(n), like Reed-Solomon codes, or at least polynomial in n
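The random-assignment argument can be tried empirically. The sketch below takes the simplest all-∗ setting (no structural zeros, so every maximal minor is non-trivial), which sidesteps identifying trivial minors; all parameters are illustrative:

```python
# Fill a matrix uniformly at random over a large F_P and check that
# all maximal minors are non-zero, as Schwartz-Zippel predicts.
import random
from itertools import combinations

P = 2147483647  # large prime: the argument needs q >> (#minors) * degree

def det_mod_p(m):
    # Determinant over F_P by Gaussian elimination.
    m = [row[:] for row in m]
    n, det = len(m), 1
    for c in range(n):
        piv = next((i for i in range(c, n) if m[i][c]), None)
        if piv is None:
            return 0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            det = -det % P
        det = det * m[c][c] % P
        inv = pow(m[c][c], P - 2, P)
        for i in range(c + 1, n):
            f = m[i][c] * inv % P
            m[i] = [(a - f * b) % P for a, b in zip(m[i], m[c])]
    return det

random.seed(0)
h, n = 3, 8
M = [[random.randrange(P) for _ in range(n)] for _ in range(h)]
# Each h x h minor vanishes with probability <= h/P (Schwartz-Zippel);
# union bound over C(8,3) = 56 minors => all non-zero w.h.p.
bad = [cols for cols in combinations(range(n), h)
       if det_mod_p([[row[c] for c in cols] for row in M]) == 0]
assert bad == []
```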

17 A general class of questions
Given a matrix with 0's and ∗'s, what is the smallest q such that there is an assignment of F_q-values to the ∗'s which makes all the non-trivial minors non-zero?
GM-MDS: if all the k×k minors are non-trivial, one can take q = O(n)
Proved by Lovett'18 and independently by Yildiz and Hassibi'18
If the n coordinates are arranged in a matrix with row checks, column checks and one global parity check, then q ≥ exp(√n) is needed [Kane, Lovett, Rao'17]

18 Field size
What is the smallest field size q such that (n, r, h, a, q)-MR LRCs exist?
There are explicit constructions achieving q = n^(O(ah)) [GHJY'14, GYBS'17]
Can we get q = O(n)?

19 Our results
Lower Bound: an (n, r, h, a, q)-MR LRC with g = n/r local groups needs q = Ω_{a,h}(n · r^α), where α = min(a, h − 2⌈h/g⌉)/⌈h/g⌉
When g ≥ h, the lower bound is q = Ω_{a,h}(n · r^(min(a, h−2)))
First super-linear lower bound when r is growing
When a ≈ h and r = n^(Ω(1)), the lower bound is n^(Ω(h)) and the best upper bounds are n^(O(h²))

20 Some upper bounds
Practical deployments of MR LRCs typically have h = 2, 3
When h = 2, the lower bound is Ω(n)
When h = 3, the lower bound is Ω(n·r)
Lower bound: q = Ω_{a,h}(n · r^(min(a, h−2)))
Thm: there exist (n, r, h=2, a, q)-MR LRCs with q = O(n) for every a, r
Thm: there exist (n, r, h=3, a, q)-MR LRCs with q = O(n³) for every a, r

21 Proof sketch of Lower Bound

22 Lower bound
Lower Bound: an (n, r, h, a, q)-MR LRC needs q = Ω_{a,h}(n · r^(min(a, h−2)))
Special case: an (n, r, h=3, a=1, q)-MR LRC needs q = Ω(n · r)

23 Parity check matrix One column per local group + any 3 more columns are linearly independent

24 Two claims
Define V_i = {v_i^1, v_i^2, …, v_i^r} ⊂ F_q³
Define H_i ⊂ F_q³ as the difference set H_i = V_i − V_i = {v_i^j − v_i^k : j, k ∈ [r], j < k}
Claim 1: no two points in H_i are multiples of each other
So we can think of H_i as a subset of PF_q², with |H_i| = (r choose 2)
Claim 2: if a ∈ H_i, b ∈ H_j, c ∈ H_k where i, j, k are distinct, then a, b, c are linearly independent
If we think of them as points in PF_q², then a, b, c are non-collinear
This also implies that the H_i's are mutually disjoint
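Claim 2's non-collinearity condition is just a 3×3 determinant check: three points of PF_q², given by representatives in F_q³, are collinear iff their coordinate determinant vanishes mod q. A small illustration over a toy field (points chosen purely for illustration):

```python
# Collinear in PF_q^2  <=>  linearly dependent in F_q^3
# <=>  3x3 determinant of representatives is 0 mod q.

Q = 13  # toy prime field

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0])) % Q

def collinear(a, b, c):
    return det3(a, b, c) == 0

assert collinear((1, 0, 0), (0, 1, 0), (1, 1, 0))      # all on the line z = 0
assert not collinear((1, 0, 0), (0, 1, 0), (0, 0, 1))  # a projective frame
```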

25 Proof of Claim 2
Want to show that a ∈ H_i, b ∈ H_j, c ∈ H_k are linearly independent
Say a = v_i^2 − v_i^1, b = v_j^4 − v_j^3, c = v_k^6 − v_k^5
One column per local group + any 3 more columns are linearly independent
Claim 1 can be proved similarly, by making 4 erasures in a single group

26 Claims 1 and 2
We have disjoint subsets H_1, H_2, …, H_{n/r} of the projective plane PF_q², each of size ≈ (r choose 2), such that any three points in different subsets are non-collinear
Want to show that q = Ω(n·r)
Suppose q ≪ n·r; we will show that a random line L intersects 3 sets among H_1, …, H_{n/r} w.h.p.

27 Final step*
Suppose q ≪ n·r; we will show that a random line L intersects 3 sets among H_1, …, H_{n/r} w.h.p.
Let Z_i = |L ∩ H_i|
E[Z_i] ≈ |H_i|/q
E[Z_i²] ≈ |H_i|/q + (|H_i|/q)²
Pr[Z_i > 0] ≥ E[Z_i]²/E[Z_i²] ≈ |H_i|/q
So L will intersect (|H_1| + … + |H_{n/r}|)/q ≈ (n/r)·(r choose 2)/q ≫ 1 sets among H_1, …, H_{n/r} in expectation
*Simplified based on a suggestion of Madhu Sudan
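The estimate E[Z_i] ≈ |H_i|/q can be sanity-checked numerically. The sketch below uses random disjoint point sets in place of actual difference sets, and represents a line of PF_q² as the kernel of a random nonzero linear functional; all parameters are illustrative:

```python
# Monte Carlo check: a uniformly random line of PF_q^2 meets a fixed
# set H_i of points with probability roughly |H_i|/q.
import random

Q = 101                  # toy field size
GROUPS, SIZE = 10, 30    # n/r sets, each playing the role of an H_i

def proj_points(q):
    # One representative per point of PF_q^2 (q^2 + q + 1 points).
    return ([(1, y, z) for y in range(q) for z in range(q)]
            + [(0, 1, z) for z in range(q)] + [(0, 0, 1)])

random.seed(0)
points = proj_points(Q)
random.shuffle(points)
# Disjoint stand-ins for the H_i (random sets, not difference sets):
H = [points[i * SIZE:(i + 1) * SIZE] for i in range(GROUPS)]

trials, hits = 2000, 0
for _ in range(trials):
    line = [random.randrange(Q) for _ in range(3)]   # random nonzero
    while not any(line):                             # functional = line
        line = [random.randrange(Q) for _ in range(3)]
    hits += sum(1 for Hi in H if any(
        sum(a * b for a, b in zip(line, p)) % Q == 0 for p in Hi))

avg = hits / trials
# Heuristic prediction: about GROUPS * SIZE / Q = 10 * 30 / 101 ≈ 3
# sets hit per line (slightly less, by inclusion-exclusion).
assert 1.5 < avg < 4.5
```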

28 Upper Bounds

29 A determinantal identity

30 Open Questions, a lot of them!
For fixed constants r, h, a, can we get (n, r, h, a, q)-MR LRCs with q = O_{r,h,a}(n)?
The lower bound we show is q ≥ Ω_{h,a}(n · r^(min(h−2, a)))
The upper and lower bounds are still pretty far apart when a, h are constants with a ≥ h − 2:
Upper bound: q = n^(O(ah))
Lower bound: q = Ω(n · r^(h−2))
Thank you!

