Lower Bound for Partial Sums [Pătrașcu, Demaine 2004]
The partial sums problem
Here's a small problem. Maintain an array A[n] under:
    update(i, Δ): A[i] += Δ
    sum(i): return A[0] + … + A[i]
Textbook solution: "augmented" binary search trees, running time O(lg n) per operation.
[Figure: binary tree of partial sums ("+" nodes) over the leaves A[0], …, A[7], illustrating update(2, Δ) and sum(6).]
What kind of "lower bound"?
Lower bounds you can trust.™
Model of computation ≈ real computers:
    memory words of w > lg n bits (pointers = words)
    random access to memory
    any operation on CPU registers (arithmetic, bitwise, …)
Just prove a lower bound on the # of memory accesses, which is the bottleneck.
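As a rough illustration of the textbook O(lg n) solution, here is a minimal sketch that stores subtree sums in a complete binary tree laid out in an array. It stands in for the "augmented" binary search tree of the slide; the class name PartialSums and the array layout are assumptions made for this example, and the prefix-sum query is named sum, as in sum(6).

```python
class PartialSums:
    """Array A[0..n-1] with point updates and prefix sums, O(lg n) each."""

    def __init__(self, n):
        # Round the number of leaves up to a power of two.
        self.size = 1
        while self.size < n:
            self.size *= 2
        # tree[1] is the root; leaf i lives at tree[size + i].
        # Every node stores the sum of the leaves below it.
        self.tree = [0] * (2 * self.size)

    def update(self, i, delta):
        """A[i] += delta: adjust every sum on the leaf-to-root path."""
        node = self.size + i
        while node >= 1:
            self.tree[node] += delta
            node //= 2

    def sum(self, i):
        """Return A[0] + ... + A[i]."""
        node = self.size + i
        total = self.tree[node]
        while node > 1:
            if node % 2 == 1:
                # node is a right child, so its left sibling's whole
                # subtree lies before index i and must be included.
                total += self.tree[node - 1]
            node //= 2
        return total


ps = PartialSums(8)
ps.update(2, 5)
ps.update(6, 3)
print(ps.sum(6))  # A[0] + ... + A[6]
```

Both operations touch one node per level of the tree, which is where the O(lg n) bound comes from.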
Begin Proof
Maintain an array A[n] under:
    update(i, Δ): A[i] += Δ
    sum(i): return A[0] + … + A[i]
The hard instance:
    π = random permutation
    for t = 1 to n:
        query: sum(π(t))
        Δt = rand()
        update(π(t), Δt)
[Figure: the updates Δ1, …, Δ16 plotted by time t against array index π(t).]
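The hard instance can be generated mechanically. The following is a minimal sketch; the function name hard_instance, the tuple encoding of operations, and the word_bits parameter (how many random bits each Δt gets) are assumptions made for illustration, not part of the slides.

```python
import random


def hard_instance(n, word_bits=32, seed=None):
    """Return (pi, ops): a random permutation and the 2n operations it induces."""
    rng = random.Random(seed)
    pi = list(range(n))
    rng.shuffle(pi)                      # pi = random permutation
    ops = []
    for t in range(n):                   # t = 1..n on the slide, 0-based here
        ops.append(("sum", pi[t]))       # query: sum(pi(t))
        delta = rng.getrandbits(word_bits)   # delta_t = rand()
        ops.append(("update", pi[t], delta))
    return pi, ops


pi, ops = hard_instance(4, seed=1)
for op in ops:
    print(op)
```

Each index is queried once and then updated once, so the sequence contains n queries interleaved with n updates.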
How can Mac help PC run t = 9,…,12?
Communication = # memory locations
    * read during t = 9,…,12
    * written during t = 5,…,8
[Figure: the updates Δ1, Δ2, … along the time axis, with the periods t = 5,…,8 and t = 9,…,12 highlighted.]
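To make the quantity being counted concrete, here is a small sketch that measures it on a memory trace. The trace format (time, kind, address) and the function name communication are hypothetical, and the sketch takes one reasonable reading of the definition: a location counts if a read in the later period sees a value whose most recent write happened in the earlier period.

```python
def communication(trace, write_period, read_period):
    """Count distinct addresses read during read_period whose most recent
    write happened during write_period.

    trace: list of (time, kind, address), kind in {"r", "w"}, sorted by time.
    write_period, read_period: ranges of integer time steps.
    """
    last_write = {}   # address -> time of its most recent write so far
    counted = set()
    for time, kind, addr in trace:
        if kind == "w":
            last_write[addr] = time
        elif time in read_period and last_write.get(addr) in write_period:
            counted.add(addr)
    return len(counted)


trace = [
    (3, "w", 102),   # written before the earlier period: does not count
    (5, "w", 100),
    (6, "w", 101),
    (9, "r", 100),   # sees the value written at t = 5: counts
    (10, "w", 101),  # overwritten inside the later period
    (11, "r", 101),  # sees the t = 10 value: does not count
    (12, "r", 102),
]
print(communication(trace, range(5, 9), range(9, 13)))
```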
How much information needs to be transferred?
[Figure: query answers along the time axis, e.g. Δ1, Δ1+Δ5+Δ3, Δ1+Δ5+Δ3+Δ7+Δ2, and Δ1+Δ5+Δ3+Δ7+Δ2+Δ8+Δ4.]
At least Δ5, Δ5+Δ7, Δ5+Δ7+Δ8, i.e. at least 3 words (random values are incompressible).
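The general principle
Lower bound = # down arrows
How many down arrows? (in expectation)
    (2k-1) ∙ Pr[ ] ∙ Pr[ ] = (2k-1) ∙ ½ ∙ ½ = Ω(k)
That is, for two adjacent periods of k operations each, every one of the 2k-1 candidate positions is a down arrow with probability about ½ ∙ ½.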
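The Ω(k) count can be checked empirically. The sketch below assumes one interpretation of a "down arrow": after sorting the 2k indices touched by two adjacent periods, it is a neighbouring pair whose left index belongs to the first period and whose right index belongs to the second. The function names and the uniform-shuffle model are assumptions for illustration. Under this model the exact expectation is (2k-1) ∙ (k/2k) ∙ (k/(2k-1)) = k/2, which agrees with the slide's estimate up to the constant.

```python
import random


def down_arrows(first_period, second_period):
    """Count neighbouring pairs, in index order, that go from an index of
    the first period to an index of the second period."""
    tagged = sorted([(i, 0) for i in first_period] +
                    [(i, 1) for i in second_period])
    return sum(1 for (_, a), (_, b) in zip(tagged, tagged[1:])
               if a == 0 and b == 1)


def average_down_arrows(k, trials=2000, seed=0):
    """Average the count over random assignments of 2k indices to two periods."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        idx = list(range(2 * k))
        rng.shuffle(idx)
        total += down_arrows(idx[:k], idx[k:])
    return total / trials


print(down_arrows([1, 3, 5], [2, 4, 6]))  # fully interleaved periods
print(average_down_arrows(8))             # should land near k/2 = 4
```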
Recap
Communication = # memory locations
    * read during the pink period
    * written during the yellow period
Communication between periods of k items = Ω(k)
=> # memory locations read during the pink period and written during the yellow period = Ω(k)
Putting it all together
[Figure: balanced binary tree over the time axis. The root is labeled Ω(n/2), its two children Ω(n/4) each, and its four grandchildren Ω(n/8) each.]
Every load instruction is counted once, at lowest_common_ancestor(write time, read time).
Total: Ω(n lg n)
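The total can be written out level by level. This is a sketch of the summation, assuming n is a power of two and that a node at depth j of the tree has two children covering n/2^{j+1} consecutive operations each (so the root has j = 0). The depth index j is notation introduced here.

```latex
\sum_{j=0}^{\lg n - 1}
  \underbrace{2^{j}}_{\text{nodes at depth } j}
  \cdot\,
  \underbrace{\Omega\!\left(\frac{n}{2^{j+1}}\right)}_{\text{communication per node}}
\;=\;
\sum_{j=0}^{\lg n - 1} \Omega\!\left(\frac{n}{2}\right)
\;=\;
\Omega(n \lg n)
```

Because each load is charged to exactly one node, the per-node bounds add up without double counting. Spread over the n queries and n updates, this gives Ω(lg n) memory accesses per operation on average.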
Q.E.D.