Published by Erick Nadler. Modified over 2 years ago.

1
So Much Data, So Little Time. Bernard Chazelle, Princeton University

2
So Many Slides, So Little Time (before lunch). Bernard Chazelle, Princeton University

3
computation math experimentation algorithms

4
Computers have two problems

5
1. They don’t have steering wheels

7
2. End of Moore’s Law: party’s over!

8
computation algorithms experimentation

9
32 x 17 = 224 + 320 = 544. This is not me

10
FFT RSA

13
noisy, low entropy, uncertain, unevenly priced, big

14
noisy, low entropy, uncertain, unevenly priced, big

15
Biomedical imaging Sloan Digital Sky Survey 4 petabytes (~1MG) 10 petabytes/yr 150 petabytes/yr

16
Collected works of Micha Sharir My A(9,9)-th paper

17
Sublinear Algorithms: massive input, output. Sample tiny fraction

18
Shortest Paths [C-Liu-Magen ’03] New York Delphi

19
Ray Shooting Volume Intersection Point location

20
Approximate MST [C-Rubinfeld-Trevisan ’01]

21
Reduces to counting connected components
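The reduction can be illustrated with a small sketch. It assumes a connected graph with integer edge weights in 1..w and uses the identity MST weight = n - w + c_1 + ... + c_(w-1), where c_i is the number of connected components of the subgraph keeping only edges of weight at most i. The function names are hypothetical, and the component counts here are computed exactly with union-find, whereas the slides concern estimating them by sampling.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight), integer weights in 1..w


def count_components(n: int, pairs: List[Tuple[int, int]]) -> int:
    """Count connected components of an n-vertex graph with union-find."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in pairs:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def mst_weight_by_counting(n: int, edges: List[Edge], w: int) -> int:
    """MST weight of a connected graph as n - w + sum of c_i for i in 1..w-1."""
    total = n - w
    for i in range(1, w):
        light = [(u, v) for u, v, wt in edges if wt <= i]
        total += count_components(n, light)
    return total


def mst_weight_kruskal(n: int, edges: List[Edge]) -> int:
    """Reference MST weight via Kruskal, for comparison only."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    weight = 0
    for u, v, wt in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            weight += wt
    return weight


example_edges = [(0, 1, 1), (0, 2, 1), (1, 2, 2), (2, 3, 3), (0, 3, 3)]
example_weight = mst_weight_by_counting(4, example_edges, 3)
```

On this example both routines agree, so replacing each exact count with an estimate turns the identity into an approximation of the MST weight.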

22
E[estimator] = no. connected components; var[estimator] << (no. connected components)^2; whp, the estimator is a good estimator of # connected components
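One way to build such an estimator, sketched under assumptions: sample vertices uniformly, measure the size n_u of each sampled vertex's component with a breadth-first search, and average n / n_u. Summing 1/n_u over all vertices gives exactly the number of components, so the sample average is unbiased when the search is not truncated. The `cap` parameter (which truncates the search to bound the work per sample) and all names below are illustrative choices, not taken from the slides.

```python
import random
from collections import deque
from typing import List


def truncated_component_size(adj: List[List[int]], start: int, cap: int) -> int:
    """BFS from start; returns min(component size, cap)."""
    seen = {start}
    queue = deque([start])
    while queue and len(seen) < cap:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
                if len(seen) >= cap:
                    break
    return len(seen)


def estimate_components(adj: List[List[int]], samples: int, cap: int,
                        rng: random.Random) -> float:
    """Estimate the number of components from `samples` random vertices."""
    n = len(adj)
    total = 0.0
    for _ in range(samples):
        u = rng.randrange(n)  # uniform random vertex
        total += 1.0 / truncated_component_size(adj, u, cap)
    return n * total / samples


# Three components of size two: every sample contributes 1/2.
pairs_graph = [[1], [0], [3], [2], [5], [4]]
estimate = estimate_components(pairs_graph, samples=50, cap=10,
                               rng=random.Random(0))
```

With a small `cap`, large components are undercounted in size, which adds a bias of at most n / cap; that is the price of keeping each sample cheap.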

23
worst case input space average case (uniform)

24
worst case

25
average case = actuarial view

26
“OK, if you elect NOT to have the surgery, the insurance company offers 6 days and 7 nights in Barbados.”

27
arbitrary, unknown random source Self-Improving Algorithms

28
Yes! This could be YOU, too!

29
E[Tk] → optimal expected time for random source: time T1, time T2, time T3, time T4

30
Clustering [ Ailon-C-Liu-Comandur ’05 ] K-median over Hamming cube

31
minimize sum of distances
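To make the objective concrete, here is a minimal sketch of the k-median cost over the Hamming cube: each point is charged its Hamming distance to the nearest center, and the charges are summed. It also shows the k = 1 special case, where a coordinate-wise majority vote is optimal. The function names are hypothetical, and nothing here reflects the self-improving algorithm itself.

```python
from typing import List, Sequence, Tuple

Point = Sequence[int]  # a 0/1 vector


def hamming(a: Point, b: Point) -> int:
    """Number of coordinates where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def kmedian_cost(points: List[Point], centers: List[Point]) -> int:
    """Sum over points of the distance to the nearest center."""
    return sum(min(hamming(p, c) for c in centers) for p in points)


def one_median(points: List[Point]) -> Tuple[int, ...]:
    """Optimal single center: majority bit in each coordinate (ties go to 0)."""
    n = len(points)
    d = len(points[0])
    return tuple(1 if 2 * sum(p[j] for p in points) > n else 0
                 for j in range(d))


cube_points = [(0, 0, 0), (0, 0, 1), (1, 1, 1), (1, 1, 0)]
two_center_cost = kmedian_cost(cube_points, [(0, 0, 0), (1, 1, 1)])
majority_center = one_median([(0, 0, 1), (0, 1, 1), (1, 0, 1)])
```

The majority rule works for k = 1 because the cost separates across coordinates; for larger k the choice of clusters couples the coordinates, which is what makes the problem hard.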

33
[ Kumar-Sabharwal-Sen ’04 ] COST ≤ (1 + ε) OPT

34
How to achieve linear limiting time? Input space {0,1}^(dn). Identify core. Tail: prob < O(dn)/KSS; use KSS

35
Store sample of precomputed KSS Nearest neighbor Incremental algorithm

36
Main difficulty: How to spot the tail?

38
encode

39
decode

41
Data inaccessible before noise What makes you think it’s wrong?

42
Data inaccessible before noise: must satisfy some property (e.g., convex, bipartite) but does not quite

43
f(x) = ? x f(x) data f = access function

44
f(x) = ? x f(x) f = access function

45
f(x) = ? x f(x) But life being what it is…

46
f(x) = ? x f(x)

47
Humans Define distance from any object to data class
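As one concrete instance of such a distance, and only as an assumed example: take the data class to be non-decreasing sequences of real numbers, and measure distance as the minimum number of entries that must be changed to land in the class. That number equals the sequence length minus the length of a longest non-decreasing subsequence, which the sketch below computes. The function name is hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def distance_to_monotone(values: Sequence[float]) -> int:
    """Minimum number of entries to change to make values non-decreasing."""
    # tails[k] is the smallest possible last element of a non-decreasing
    # subsequence of length k + 1 seen so far.
    tails: List[float] = []
    for v in values:
        i = bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(values) - len(tails)


one_outlier = distance_to_monotone([1, 2, 9, 3, 4])
```

Entries on a longest non-decreasing subsequence can be kept as they are, and every other entry can be overwritten with a value between its kept neighbours, so the count is both achievable and necessary.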

48
f(x) = ? x g(x) x1, x2, … f(x1), f(x2), … filter g is access function for:

49
Online Data Reconstruction

50
Monotone function: [n]^d → R. Filter requires polylog(n) lookups [ Ailon-C-Liu-Comandur ’04 ]

51
Convex polygon Filter requires : lookups [C-Comandur ’06 ]

52
Convex terrain lookups Filter requires :

53
Iterated planar separator theorem

55
Iterated (weak) planar separator theorem in sublinear time!

56
Using epsilon-nets in spaces of unbounded VC dimension reconstruct

57
bipartite graph k-connectivity expander

58
denoising low-dim attractor sets

59
Priced computation & accuracy: spectrometry/cloning/gene chip, PCR/hybridization/chromatography, gel electrophoresis/blotting. Linear programming

60
Pricing data. Factoring is easy. Here’s why… Gaussian mixture sample: 00100101001001101010101 ….

61
Collaborators: Nir Ailon, Seshadri Comandur, Ding Liu, Avner Magen, Ronitt Rubinfeld, Luca Trevisan

