Algorithms for data streams Foundations of Data Science 2014 Indian Institute of Science Navin Goyal.


1 Algorithms for data streams Foundations of Data Science 2014 Indian Institute of Science Navin Goyal

2 Introduction
Data streams: very large input data arriving sequentially, too large to fit in memory.
Examples:
– networks (traffic passing through a router)
– databases (transaction logs)
– scientific data (satellites, sensors, LHC, …)
– financial data
What can we compute about the data in such situations?
Today’s lecture: start with an illustrative example problem, then cover some generalities about the streaming model and streaming problems.

3 Example: Counting

4

5 Counting

6

7

8 Performance of Morris counter
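The slide bodies that define the counter did not survive in this transcript. As a reminder of what is being analysed, here is a minimal sketch of the textbook Morris counter, which may differ in detail from the version on the slides. It stores only an exponent x, raises it with probability 2^(-x) on each event, and reports 2^x - 1, which is an unbiased estimate of the number of events. The names `MorrisCounter` and `average_estimate` are illustrative, not taken from the lecture.

```python
import random


class MorrisCounter:
    """Textbook Morris counter (illustrative sketch, not the slides' code)."""

    def __init__(self, rng=None):
        self.x = 0  # the only state kept: roughly log2 of the count
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Raise x with probability 2**(-x); the first event always raises it.
        if self.rng.random() < 2.0 ** (-self.x):
            self.x += 1

    def estimate(self):
        # E[2**x] = n + 1 after n events, so 2**x - 1 is unbiased for n.
        return 2 ** self.x - 1


def average_estimate(n, copies, seed=0):
    """Average the estimates of `copies` independent counters fed n events each."""
    rng = random.Random(seed)
    total = 0
    for _ in range(copies):
        counter = MorrisCounter(rng)
        for _ in range(n):
            counter.increment()
        total += counter.estimate()
    return total / copies
```

A single counter has high variance, which is why the following slides turn to boosting; averaging independent copies, as in `average_estimate`, is one such step.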

9 Boosting the success probability I

10

11 Performance of Morris counter

12 Boosting the success probability II

13 Boosting success probability II

14 Test your understanding: Why don’t we just use the median all the time for boosting the probability of success instead of the mean?

15 Recap

16 Questions to ponder

17 Streaming data: models and problems

18 Models for streaming data

19

20 Restrictions on the algorithm

21 Some streaming problems: frequency moments
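The slide body is missing here, so as context: the standard definition of the k-th frequency moment of a stream is F_k = sum over distinct items i of f_i^k, where f_i is the number of times item i occurs. The sketch below computes it exactly, and is deliberately not a streaming algorithm: it uses memory proportional to the number of distinct items, which is the cost streaming algorithms try to avoid. The function name `frequency_moment` is illustrative.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Exact F_k = sum_i f_i**k (non-streaming baseline, illustrative only)."""
    counts = Counter(stream)  # f_i for every distinct item i
    return sum(f ** k for f in counts.values())


stream = ["a", "b", "a", "c", "a", "b"]
f0 = frequency_moment(stream, 0)  # number of distinct items
f1 = frequency_moment(stream, 1)  # length of the stream
f2 = frequency_moment(stream, 2)  # sum of squared frequencies
```

For this example stream the frequencies are 3, 2 and 1, so F_0 = 3, F_1 = 6 and F_2 = 14.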

22 A general template for many streaming algorithms
– Come up with a basic random estimator for the quantity of interest (usually the non-trivial part).
– Give an efficient algorithm to compute the estimator (this may need hashing or some other way of reducing randomness requirements).
– Improve the probability of success by some trick, such as the median-of-means estimator.
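The last step of the template can be sketched as follows. This is a minimal illustration that assumes the independent outputs of the basic estimator have already been collected in a list; the function name `median_of_means` and the choice of equal-size groups are assumptions for the example, not details from the lecture.

```python
import statistics


def median_of_means(samples, num_groups):
    """Average the samples within each group, then take the median of the averages.

    Averaging reduces the variance of each group; the median then makes
    it unlikely that a few bad groups spoil the final answer.
    """
    group_size = len(samples) // num_groups
    if num_groups <= 0 or group_size == 0:
        raise ValueError("need at least one sample per group")
    means = []
    for g in range(num_groups):
        group = samples[g * group_size:(g + 1) * group_size]
        means.append(sum(group) / group_size)
    return statistics.median(means)


# One wild outlier ruins the plain mean but not the median of means.
samples = [1, 1, 1, 1, 1000, 1]
plain_mean = sum(samples) / len(samples)
robust = median_of_means(samples, 3)
```

With three groups of two, the group means are 1, 1 and 500.5, so `robust` is 1.0 while `plain_mean` is 167.5. Samples left over when the list length is not a multiple of `num_groups` are ignored in this sketch.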

23 Plan for next few lectures

24

