Jim Gray Microsoft Research

1 Data Management for Frontiers at the Interface Between Computing and Biology
Jim Gray, Microsoft Research

2 Cosmic Questions
Where are we today? Where in 5 years?
What are the key questions?
What am I doing next?
What are the barriers?
What hinders collaboration?
What changes needed in education?

3 How much information is there?
Soon everything can be recorded and indexed
Most bytes will never be seen by humans.
Human attention is the precious resource.
Automatic: capture, store, organize, analyze, summarize
Manual: visualize/iterate
[Figure: scale of recorded information. Axis: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta. Labels: A Book, A Photo, Movie, All LoC books (words), All Books MultiMedia, Everything! Recorded]
Small prefixes (negative powers of ten): 3 milli, 6 micro, 9 nano, 12 pico, 15 femto, 18 atto, 21 zepto, 24 yocto

4 Plumbing
Everything can be online
Storage is nearing 1 K$/TeraByte
Networking is 1$ / delivered GB
Software is cheap or free
Systems are becoming self-managing
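The two prices on this slide can be put side by side with a little arithmetic. This is a minimal sketch that uses only the slide's round figures and assumes decimal units (1 TB = 1000 GB); the function names are illustrative, not from the talk.

```python
# Round figures taken from the slide; decimal units are an assumption.
STORAGE_USD_PER_TB = 1000.0  # "nearing 1 K$/TeraByte"
NETWORK_USD_PER_GB = 1.0     # "1$ / delivered GB"
GB_PER_TB = 1000


def storage_cost_usd(terabytes: float) -> float:
    """Cost of buying storage for the given volume."""
    return terabytes * STORAGE_USD_PER_TB


def delivery_cost_usd(terabytes: float) -> float:
    """Cost of delivering the given volume over the network."""
    return terabytes * GB_PER_TB * NETWORK_USD_PER_GB


if __name__ == "__main__":
    tb = 1.0
    print(f"store {tb} TB:   ${storage_cost_usd(tb):,.0f}")
    print(f"deliver {tb} TB: ${delivery_cost_usd(tb):,.0f}")
```

At these prices, delivering a terabyte costs about as much as the disk needed to store it.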

5 Data Management Systems
Can ingest/store/search/analyze TeraBytes
Numbers
Text
Some progress on “objects”
But semantics have to come from the domain
Good science and engineering, but…
Flopped in marketplace.

6 Basic Problems
Data Acquisition: I do not have much to say here
Data Ingest: This is a huge problem
Data Organization & Access: This is what databases are good at for text & numbers. For “semantic” data it requires domain-specific tools.
Data Publication/Discovery/Interchange: Requires good standards. We have syntactic standards; semantic standards are needed.

7 My #1 Problem
Data Interchange (includes publication and discovery)
What does the data mean?
The answer is: 42.
Units? Precision? Accuracy? How was the number derived?
How can you tell me what it means (without us talking on the phone or you visiting my laboratory)?
Need standard terminology and standard formats.
Hard to do for “new” stuff.
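The "42" problem can be made concrete by comparing a bare number with a record that carries its own units, precision, and derivation. The sketch below is illustrative only: the `Measurement` class, its field names, and the sample values (including the unit) are hypothetical and do not come from the talk or from any standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Measurement:
    """A hypothetical self-describing value; field names are assumptions."""
    value: float
    unit: str          # what the number is measured in
    uncertainty: float  # accuracy, in the same unit as value
    digits: int         # precision: significant digits reported
    derivation: str     # how the number was derived

    def to_json(self) -> str:
        # A syntactic standard (JSON) carries the fields; agreeing on what
        # the fields mean is the semantic part that still has to be done.
        return json.dumps(asdict(self), sort_keys=True)


bare = 42  # "The answer is: 42." Units? Precision? Derivation? Unknown.

described = Measurement(
    value=42.0,
    unit="mmol/L",                     # made-up example unit
    uncertainty=0.5,
    digits=2,
    derivation="mean of 3 assay runs",  # made-up example provenance
)

if __name__ == "__main__":
    print(described.to_json())
```

A receiver can rebuild the record with `Measurement(**json.loads(text))` without a phone call, but only if both sides already agree on what `unit` and `derivation` are allowed to contain.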

8 Great Hope & Promise
XML is the answer
Reality: XML is one layer up from Unicode.
Can describe structured information
But not process, not meaning, not…
Answer #2: Objects
SOAP, Web Services,…
Probably a better answer
But… still needs tools to make it workable.
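The point that XML describes structure but not meaning can be seen with Python's standard `xml.etree.ElementTree` parser. This is a small sketch; the element names, attribute names, and values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A well-formed document: the parser recovers the structure...
doc = "<result><answer>42</answer></result>"
answer = ET.fromstring(doc).find("answer")
print(answer.tag, repr(answer.text))  # ...but the payload is just a string.

# Adding attributes adds more structure, not meaning: the parser has no
# idea what "unit" or "uncertainty" signify. That has to come from a
# vocabulary the sender and receiver agree on outside the XML itself.
annotated = '<result><answer unit="mmol/L" uncertainty="0.5">42</answer></result>'
ann = ET.fromstring(annotated).find("answer")
print(ann.attrib)
```

Both documents parse equally well; nothing in the XML layer says whether the second one is more correct, complete, or comparable with someone else's data.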

9 Discussion

10 Gifford’s List
Data Interchange
Scale: what’s big
Quality: how do you keep it up
DBs need more semantics.

