Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood
Jeff Jonas, IBM Distinguished Engineer, Chief Scientist, IBM
© 2011 IBM Corporation


1 Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood
Jeff Jonas, IBM Distinguished Engineer, Chief Scientist, IBM Entity Analytics
JeffJonas@us.ibm.com
January 18th, 2011

2 The data will find the data … and the relevance will find you.

3 My Background
- Early 80’s: Founded Systems Research & Development (SRD), a custom software consultancy
- 1989 – 2003: Built numerous systems for Las Vegas casinos, including a technology known as Non-Obvious Relationship Awareness (NORA)
- 2005: IBM acquires SRD; now chief scientist of IBM Entity Analytics
- Personally designed and deployed roughly 100 systems, a number of which contained multi-billions of transactions describing 100’s of millions of entities
- Today: My focus is in the area of ‘sensemaking on streams’ with special attention towards privacy and civil liberties protections

4 Sensemaking on Streams
1) Evaluate new information against previous information … as it arrives.
2) Determine if what is being observed is relevant.
3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening.
4) Do this with sufficient accuracy and scale to really matter.

5 Trend: Organizations Are Getting Dumber
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]
- Your transactional data (inc. logs)
- Available reference data
- Plus, shared third-party data
- And an avalanche of open source

6 Simply Overwhelming
“Every two days now we create as much information as we did from the dawn of civilization up until 2003.” ~ Eric Schmidt, CEO Google

7 Trend: Organizations Are Getting Dumber. WHY?
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]

8 Algorithms at Dead End. You Can’t Squeeze Knowledge Out of a Pixel.

9 scrila34@msn.com: No Context

10 Context, definition: Better understanding something by taking into account the things around it.

11 Information in Context … and Accumulating
[Diagram labels around scrila34@msn.com: Top 200 Customer; Job Applicant; Identity Thief; Criminal Investigation]

12 From Pixels to Pictures to Insight
[Diagram labels, in transcript order: Observations; Contextualization; Information in Context; Relevance; Consumer (an analyst, a system, the sensor itself, etc.)]

13 The Puzzle Metaphor
- Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors
- What it represents is unknown (there is no picture on hand)
- Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
- Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted
- Some pieces may even be professionally fabricated lies
- Point being: Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with

14 How Context Accumulates
- With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected
- Must favor the false negative
- New observations sometimes reverse earlier assertions
- Some observations produce novel discovery
- As the working space expands, computational effort increases
- Given sufficient observations, there can come a tipping point, at which time: 1) confidence begins to improve; and 2) computational effort begins to decrease!
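The three assertions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any IBM product: the `Entity` class, the `assert_observation` function, and the `connect_at` / `near_at` thresholds are all hypothetical names and values. Requiring two shared features before connecting is one simple way to "favor the false negative". The sketch does not re-evaluate earlier assertions when new evidence arrives, which a real system would need to do.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Accumulated context for one thing: the set of features seen so far."""
    features: set = field(default_factory=set)


def assert_observation(entities, observation, connect_at=2, near_at=1):
    """Place one observation into the working space.

    Returns (assertion, entity_index, neighbor_index).
    connect_at and near_at are hypothetical thresholds on shared features.
    """
    obs = set(observation)
    best_idx, best_overlap = None, 0
    for idx, entity in enumerate(entities):
        overlap = len(entity.features & obs)
        if overlap > best_overlap:
            best_idx, best_overlap = idx, overlap

    if best_overlap >= connect_at:
        # Strong evidence: merge the observation into the existing entity.
        entities[best_idx].features |= obs
        return "connected", best_idx, None

    # Weak or no evidence: keep the observation as its own entity.
    entities.append(Entity(obs))
    new_idx = len(entities) - 1
    if best_overlap >= near_at:
        return "near", new_idx, best_idx
    return "unassociated", new_idx, None


working_space = []
first = assert_observation(
    working_space, {("name", "bob jones"), ("id", "123455")})
second = assert_observation(
    working_space, {("name", "bob jones"), ("email", "bjones@hotmail")})
third = assert_observation(
    working_space,
    {("name", "bob jones"), ("id", "123455"), ("email", "bjones@hotmail")})
print(first, second, third)
```

The second observation shares only a name with the first, so it is placed near it rather than merged. The third shares two features and is connected.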

15 One Form of Context Is “Expert Counting”
- Is it 5 people each with 1 account … or is it 1 person with 5 accounts?
- Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?
- If one cannot count … one cannot estimate vector or velocity (direction and speed).
- Without vector and velocity … prediction is nearly impossible.

16 Counting: Degrees of Difficulty
Exactly Same            Bob Jones 123455    Bob Jones 123455
Fuzzy                   Bob Jones 123455    Robert T Jonnes 000123455
Incompatible Features   Bob Jones 123455    bjones@hotmail
Deceit                  Bob Jones 123455    Ken Wells 550119

17 “Key Features” Enable Expert Counting
People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.
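To make "expert counting" concrete, here is a small Python sketch that counts distinct entities by linking records that share a key feature. It is an illustration under stated assumptions, not the deck's actual matching logic: `LINKING_KINDS`, `normalize`, and `count_entities` are hypothetical names, only `id` and `email` are trusted for linking, and leading zeros in an ID are assumed to be padding. Records are grouped with a simple union-find.

```python
# Hypothetical choice: only these feature kinds are trusted enough to link records.
LINKING_KINDS = {"id", "email"}


def normalize(kind, value):
    """Reduce a feature to a comparable form."""
    value = value.strip().lower()
    if kind == "id":
        # Assumption: leading zeros are padding, so 000123455 equals 123455.
        value = value.lstrip("0")
    return kind, value


def count_entities(records):
    """Count distinct entities among records that may share key features."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}
    for i, record in enumerate(records):
        for kind, value in record.items():
            if kind not in LINKING_KINDS:
                continue
            key = normalize(kind, value)
            if key in first_seen:
                # Same key feature seen before: merge the two groups.
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i
    return len({find(i) for i in range(len(records))})


records = [
    {"name": "Bob Jones", "id": "123455"},
    {"name": "Bob Jones", "id": "123455"},
    {"name": "Robert T Jonnes", "id": "000123455"},
    {"email": "bjones@hotmail"},
    {"name": "Ken Wells", "id": "550119"},
]
print(count_entities(records))
```

With these rules the five records collapse to three entities. The email-only record stays separate because it has no feature in common with the others, which is the "incompatible features" case. The "deceit" case is not detected by feature matching at all.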

18 Consider Lying Identical Twins
[Diagram: two identical passports (#123, Sue, 3/3/84, Uberstan, Exp 2011); labels: Fingerprint; DNA; Most Trusted Authority: “Same person – trust me.”]

19
- The same thing cannot be in two places … at the same time.
- Two different things cannot occupy the same space … at the same time.

20 Space & Time Enables Absolute Disambiguation
People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.
When            When                When
Where           Where               Where
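The rule that the same thing cannot be in two places at the same time can be turned into a simple feasibility check. The Python sketch below is a hypothetical illustration: the function names, the 900 km/h speed ceiling, and the 0.1 km tolerance are assumptions chosen for the example, and the city coordinates are approximate. Two sightings that would require impossible travel cannot belong to the same entity.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def could_be_same_entity(a, b, max_speed_kmh=900.0, tolerance_km=0.1):
    """a and b are (lat, lon, unix_seconds) sightings.

    Returns False when travel between them is physically impossible
    under the assumed speed ceiling.
    """
    distance = haversine_km(a[0], a[1], b[0], b[1])
    hours = abs(b[2] - a[2]) / 3600.0
    return distance <= max_speed_kmh * hours + tolerance_km


# Approximate coordinates, used only for illustration.
salem = (44.9429, -123.0351)
seattle = (47.6062, -122.3321)

same_instant = could_be_same_entity((*salem, 0), (*seattle, 0))
one_hour_apart = could_be_same_entity((*salem, 0), (*seattle, 3600))
print(same_instant, one_hour_apart)
```

The check only rules identities out. Passing it does not prove two sightings are the same entity, which is why when and where work best alongside the other key features.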

21 “Life Arcs” Are Also Telling
Bill Smith, 4/13/67, Salem, Oregon      Bill Smith, 4/13/67, Seattle, Washington
Address History                         Address History
Tampa, FL      2008-2008                San Diego, CA  2005-2009
Biloxi, MS     2005-2008                San Fran, CA   2005-2005
NY, NY         1996-2005                Phoenix, AZ    1990-2005
Tampa, FL      1984-1996                San Jose, CA   1982-1990
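One way to compare two life arcs is to count the years in which both records claim a residence but never in the same place. The Python sketch below uses the address histories from this slide. `conflicting_years` is a hypothetical helper, and comparing whole calendar years by exact place string is a simplifying assumption.

```python
def conflicting_years(arc_a, arc_b):
    """Return the years where both arcs have a residence but share no place.

    Each arc is a list of (place, start_year, end_year) with inclusive years.
    """
    def places_by_year(arc):
        years = {}
        for place, start, end in arc:
            for year in range(start, end + 1):
                years.setdefault(year, set()).add(place)
        return years

    a, b = places_by_year(arc_a), places_by_year(arc_b)
    # A year conflicts when both arcs cover it and no place is common to both.
    return sorted(y for y in a.keys() & b.keys() if not (a[y] & b[y]))


arc_one = [
    ("Tampa, FL", 2008, 2008),
    ("Biloxi, MS", 2005, 2008),
    ("NY, NY", 1996, 2005),
    ("Tampa, FL", 1984, 1996),
]
arc_two = [
    ("San Diego, CA", 2005, 2009),
    ("San Fran, CA", 2005, 2005),
    ("Phoenix, AZ", 1990, 2005),
    ("San Jose, CA", 1982, 1990),
]

conflicts = conflicting_years(arc_one, arc_two)
print(len(conflicts), conflicts[0], conflicts[-1])
```

Every year the two histories have in common conflicts, which suggests that the two Bill Smiths with the same date of birth are different people.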

22 Space-Time-Travel

23 Space-Time-Travel
- Cell phones are generating a staggering amount of geo-locational data: 600B transactions per day being created in the US alone
- This data is being “de-identified” and shared with third parties, in volume and in real-time
- Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours) and who you spend your time with
- Re-identification (figuring out who is who) is somewhat trivial
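A few lines of Python show why movement data reveals so much even after names are removed. This sketch splits location pings into working hours and all other hours, then reports the most frequent place in each bucket. The function name, the 9-to-17 working window, and the `cell_*` place identifiers are assumptions made up for the example. It is meant to show how little analysis is needed, not to describe any real pipeline.

```python
from collections import Counter


def dominant_places(pings, work_hours=range(9, 17)):
    """pings is an iterable of (hour_of_day, place_id) pairs.

    Returns the most frequent place during working hours and during
    all other hours, or None for a bucket with no pings.
    """
    working, off_hours = Counter(), Counter()
    for hour, place in pings:
        bucket = working if hour in work_hours else off_hours
        bucket[place] += 1

    def top(counter):
        return counter.most_common(1)[0][0] if counter else None

    return {"working_hours": top(working), "off_hours": top(off_hours)}


# Made-up pings for a single de-identified device.
pings = [
    (10, "cell_A"), (11, "cell_A"), (14, "cell_B"),
    (20, "cell_C"), (23, "cell_C"), (2, "cell_C"), (19, "cell_A"),
]
profile = dominant_places(pings)
print(profile)
```

A pair of likely home and work locations is often distinctive for one person, which is the intuition behind the slide's claim that re-identification is "somewhat trivial".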

24 Analytic Superfood for Prediction
- Route suggestions pushed to drivers, just-in-time, to avert significant traffic events
- Search results optimized using personalized life arc forecasts
- A nation able to work right through an extreme global pandemic

25 And Other Predictions …
- Prediction with 87% certainty where you will be next Thursday at 5:35pm
- Names of the top 10 people you co-locate with, not at home and not at work
- The Uberstan intelligence service preempts the next mass protest in real-time
- A political opponent is crushed and resigns two days after announcing their candidacy

26 Consequences
- Space-time-travel data is the ultimate biometric
- It will enable enormous opportunity
- It will unravel one’s secrets
- It will challenge existing notions of privacy
- And it is here now, with more to come

27 Surveillance society is irresistible. And you are doing it.
GPS-enhanced search, free email, Facebook, etc.

28 Responsible innovation
Privacy by design
Better data protection
Data anonymization, active audit logs, etc.

29 Closing Thoughts

30 Wish This On The Adversary
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]

31 Context Accumulation: The Way Forward
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context Accumulation]

32 Geospatial-Enabled Intelligence … Today
[Diagram labels, in transcript order: Geospatial Analytics; Geospatial Visualization; Current Focus]

33 Geospatial-Enabled Intelligence … Tomorrow
[Diagram labels, in transcript order: Geospatial Visualization; Geospatial Analytics; Future Focus]

34 Big Data. New Physics.
- More Data: Better prediction
  - Fewer false positives
  - Fewer false negatives
- More Data: Bad data good
- More Data: Less compute effort

35 Related Blog Posts
- Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel
- Puzzling: How Observations Are Accumulated Into Context
- Big Data. New Physics.
- Smart Sensemaking Systems, First and Foremost, Must be Expert Counting Systems
- Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food!
- Big Data Flows vs. Wicked Leaks
- Data Finds Data
- “Macro Trends: The Privacy and Civil Liberties Consequences … and Comments on Responsible Innovation” – My DHS DPIAC Testimony, September 2008

36 Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood
Jeff Jonas, IBM Distinguished Engineer, Chief Scientist, IBM Entity Analytics
JeffJonas@us.ibm.com
January 18th, 2011

