Some Interesting Problems. Rakesh Agrawal, IBM Almaden Research Center.


1 Some Interesting Problems. Rakesh Agrawal, IBM Almaden Research Center

2 Foundations. What is data mining? A collection of techniques? A set of composable operations (à la relational algebra)? Hints: inductive databases (Mannila); relational calculus + statistical quantifiers (Imielinski).

3 Privacy Implications. Can we build accurate data models while preserving the privacy of individual records? Hints: randomization (Agrawal & Srikant), i.e. replace x by x+y, where y is drawn from a known distribution; anonymization (crypto literature).
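The randomization hint can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise and recovers only the mean and variance of the original values; it is a simplified stand-in, not the distribution-reconstruction procedure of Agrawal & Srikant. The function names, the `ages` data, and the noise level are all hypothetical.

```python
import random
import statistics


def randomize(values, noise_std, rng):
    """Replace each x by x + y, where y ~ N(0, noise_std^2) is publicly known noise."""
    return [x + rng.gauss(0.0, noise_std) for x in values]


def estimate_mean_and_variance(perturbed, noise_std):
    """Estimate the mean and variance of the original values from perturbed ones.

    The noise has zero mean, so E[x + y] = E[x]. Because x and y are
    independent, Var[x + y] = Var[x] + noise_std^2, so the known noise
    variance is subtracted (clamped at zero).
    """
    mean = statistics.fmean(perturbed)
    variance = max(statistics.pvariance(perturbed) - noise_std ** 2, 0.0)
    return mean, variance


rng = random.Random(0)
# Hypothetical sensitive attribute: ages spread uniformly over [20, 60].
ages = [rng.uniform(20, 60) for _ in range(20000)]
perturbed = randomize(ages, noise_std=15.0, rng=rng)
est_mean, est_var = estimate_mean_and_variance(perturbed, noise_std=15.0)

print("true mean/variance:     ", statistics.fmean(ages), statistics.pvariance(ages))
print("estimated mean/variance:", est_mean, est_var)
```

Individual perturbed values reveal little about any one record, while the aggregate estimates stay close to the true ones as the number of records grows.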

4 Web Mining: Beyond Click Streams. Mining knowledge bases from the web: completeness, accuracy, malicious spam. Hints: Brin’s book experiment, etc.

5 Web Mining: Beyond hrefs. What other social behaviors exist on the web, and how can we make use of them? Hints: the viral marketing paper in this conference, etc.

6 Actionable Patterns. Principled use of domain knowledge for discarding uninteresting patterns and for performance. Hints: papers in the recent KDD conferences.

7 Simultaneous mining over multiple data types. Not just relational tables, time series, or textual documents, but patterns across all of them.

8 Some more problems. Online, incremental algorithms over data streams; when to retire the past data; long sequential patterns; discovering richer patterns (trees and DAGs); automatic, data-dependent selection of algorithm parameters.
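As one illustration of the first two items, the sketch below keeps item counts over a stream incrementally and retires old data with a fixed-size sliding window. The fixed window is an assumed policy chosen for simplicity; the slide leaves the question of when to retire past data open. The class name and sample stream are hypothetical.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Maintain item counts over the most recent `window` stream elements."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.items = deque()     # elements currently inside the window, oldest first
        self.counts = Counter()  # count of each element inside the window

    def add(self, item):
        """Process one stream element in O(1), retiring the oldest if needed."""
        self.items.append(item)
        self.counts[item] += 1
        if len(self.items) > self.window:
            old = self.items.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def count(self, item):
        """Return how often `item` occurs among the retained elements."""
        return self.counts.get(item, 0)


counter = SlidingWindowCounter(window=3)
for item in ["a", "b", "a", "c", "c"]:
    counter.add(item)

# The window now holds the last three elements: "a", "c", "c".
print(counter.count("a"), counter.count("b"), counter.count("c"))  # prints: 1 0 2
```

Other retirement policies, such as time-based windows or exponential decay, fit the same interface by changing only what `add` discards.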

9 What not to work on? The field is too young! Let every flower bloom!!! Too early to say we don’t need new algorithms. Impressive results of the PVSM algorithm. Emphasize evaluation and benchmarks. Interesting research issues.

10 Applications most likely to benefit from data mining: web applications (I think); bioinformatics (I hope!).

11 Inhibitors: insufficient skill base (education); usability.

12 "The true delight is in the finding out, rather than in the knowing." Isaac Asimov

