Published by Simon Morris. Modified over 9 years ago.
1
© 2011 IBM Corporation
Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
JeffJonas@us.ibm.com
January 18th, 2011
2
The data will find the data … and the relevance will find you.
3
My Background
Early 1980s: Founded Systems Research & Development (SRD), a custom software consultancy
1989 – 2003: Built numerous systems for Las Vegas casinos, including a technology known as Non-Obvious Relationship Awareness (NORA)
2005: IBM acquires SRD; now chief scientist of IBM Entity Analytics
Personally designed and deployed roughly 100 systems, a number of which contained multiple billions of transactions describing hundreds of millions of entities
Today: My focus is ‘sensemaking on streams’, with special attention to privacy and civil liberties protections
4
Sensemaking on Streams
1) Evaluate new information against previous information … as it arrives.
2) Determine if what is being observed is relevant.
3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening.
4) Do this with sufficient accuracy and scale to really matter.
5
Trend: Organizations Are Getting Dumber
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]
Your transactional data (inc. logs)
Available reference data
Plus, shared third party data
And an avalanche of open source
6
Simply Overwhelming
“Every two days now we create as much information as we did from the dawn of civilization up until 2003.”
~ Eric Schmidt, CEO Google
7
Trend: Organizations Are Getting Dumber
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]
WHY?
8
Algorithms at Dead End. You Can’t Squeeze Knowledge Out of a Pixel.
9
scrila34@msn.com
No Context
10
Context, definition
Better understanding something by taking into account the things around it.
11
Information in Context … and Accumulating
Top 200 Customer
Job Applicant
Identity Thief
Criminal Investigation
scrila34@msn.com
12
From Pixels to Pictures to Insight
Observations
Contextualization
Information in Context
Relevance
Consumer (An analyst, a system, the sensor itself, etc.)
13
The Puzzle Metaphor
Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors
What it represents is unknown (there is no picture on hand)
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted
Some pieces may even be professionally fabricated lies
Point being: Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with
14
How Context Accumulates
With each new observation … one of three assertions is made: 1) unassociated; 2) placed near like neighbors; or 3) connected
Must favor the false negative
New observations sometimes reverse earlier assertions
Some observations produce novel discovery
As the working space expands, computational effort increases
Given sufficient observations, there can come a tipping point, at which time: 1) confidence begins to improve; and 2) computational effort begins to decrease!
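As a rough illustration of the three assertions on this slide, here is a minimal Python sketch. The `Entity` class, the feature-overlap score, and the thresholds `connect_at` and `near_at` are all assumptions made for this example; the slides do not say how a real system scores a match. Requiring more shared features to connect than to sit nearby is one simple way to favor the false negative. Reversing earlier assertions is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical container: the set of features seen so far for one entity.
    features: set = field(default_factory=set)


def assert_observation(entities, obs_features, connect_at=2, near_at=1):
    """Place one observation and return (assertion, index of related entity)."""
    best_idx, best_overlap = None, 0
    for idx, entity in enumerate(entities):
        overlap = len(entity.features & obs_features)
        if overlap > best_overlap:
            best_idx, best_overlap = idx, overlap

    if best_overlap >= connect_at:
        # Strong evidence: merge the observation into the existing entity.
        entities[best_idx].features |= obs_features
        return "connected", best_idx

    # Weak or no evidence: keep the observation as its own entity.
    entities.append(Entity(set(obs_features)))
    if best_overlap >= near_at:
        return "near", best_idx  # index of the like neighbor
    return "unassociated", len(entities) - 1
```

With these assumed thresholds, an observation sharing only a name with an existing entity is placed near it but kept separate, while one sharing both a name and an identifier is connected.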
15
One Form of Context Is “Expert Counting”
Is it 5 people each with 1 account … or is it 1 person with 5 accounts?
Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?
If one cannot count … one cannot estimate vector or velocity (direction and speed).
Without vector and velocity … prediction is nearly impossible.
16
Counting: Degrees of Difficulty
Difficulty               First record        Second record
Exactly Same             Bob Jones 123455    Bob Jones 123455
Fuzzy                    Bob Jones 123455    Robert T Jonnes 000123455
Incompatible Features    Bob Jones 123455    bjones@hotmail
Deceit                   Bob Jones 123455    Ken Wells 550119
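The counting question (five people, or one person with five accounts?) can be sketched as grouping records that share a key feature. The version below is a minimal sketch, assuming records are already normalized into sets of (feature, value) pairs and that any single shared pair is enough to merge. That is a simplification: a system that favors the false negative would want stronger evidence. The function name and record layout are hypothetical.

```python
def count_entities(records):
    """Count distinct entities among records, merging any that share a feature."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (feature, value) -> index of first record carrying it
    for i, record in enumerate(records):
        for feature in record:
            if feature in first_seen:
                parent[find(i)] = find(first_seen[feature])
            else:
                first_seen[feature] = i

    return len({find(i) for i in range(len(records))})


# Illustrative records loosely based on the slide; the fuzzy ID 000123455 is
# assumed to have been normalized to 123455 beforehand.
accounts = [
    {("name", "bob jones"), ("id", "123455")},
    {("name", "robert t jonnes"), ("id", "123455")},
    {("id", "123455"), ("email", "bjones@hotmail")},
    {("email", "bjones@hotmail")},
    {("name", "ken wells"), ("id", "550119")},
]
print(len(accounts), "records,", count_entities(accounts), "entities")
```

Exact feature matching handles the first rows of the slide only after normalization, and it cannot detect the deceit case: the Ken Wells record stays a separate entity unless some other shared feature ties it to Bob Jones.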
17
“Key Features” Enable Expert Counting
People           Cars                 Router
Name             Make                 Device ID
Address          Model                Make
Date of Birth    Year                 Model
Phone            License Plate No.    Firmware Vers.
Passport         VIN                  Asset ID
Nationality      Owner                Etc.
Biometric        Etc.
Etc.
18
Consider Lying Identical Twins
PASSPORT #123 Sue 3/3/84 Uberstan Exp 2011
PASSPORT #123 Sue 3/3/84 Uberstan Exp 2011
Fingerprint
DNA
Most Trusted Authority
“Same person – trust me.”
Most Trusted Authority
19
The same thing cannot be in two places … at the same time.
Two different things cannot occupy the same space … at the same time.
20
Space & Time Enables Absolute Disambiguation
People           Cars                 Router
Name             Make                 Device ID
Address          Model                Make
Date of Birth    Year                 Model
Phone            License Plate No.    Firmware Vers.
Passport         VIN                  Asset ID
Nationality      Owner                Etc.
Biometric        Etc.
Etc.
When             When                 When
Where            Where                Where
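The first space-time rule (the same thing cannot be in two places at the same time) can be turned into a simple feasibility check. This is a sketch under stated assumptions: observations are `(unix_seconds, latitude, longitude)` tuples, distance is great-circle distance on a spherical Earth, and the rule is loosened from "same time" to "too far apart to travel in the elapsed time" using an assumed `max_speed_kmh`. None of these names or thresholds come from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def could_be_same_entity(obs_a, obs_b, max_speed_kmh=1000.0, same_place_km=0.05):
    """Return False when one entity could not have produced both observations."""
    (t_a, lat_a, lon_a), (t_b, lat_b, lon_b) = obs_a, obs_b
    distance = haversine_km(lat_a, lon_a, lat_b, lon_b)
    if distance <= same_place_km:
        return True  # effectively the same place
    hours = abs(t_b - t_a) / 3600.0
    if hours == 0:
        return False  # two places at the same instant
    return distance / hours <= max_speed_kmh
```

A `True` result only means the pair is not ruled out; it is not evidence that the two observations belong to the same entity. The second rule on the previous slide (two different things cannot occupy the same space at the same time) is not covered by this sketch.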
21
“Life Arcs” Are Also Telling
Bill Smith 4/13/67 Salem, Oregon
Bill Smith 4/13/67 Seattle, Washington
Address History: Tampa, FL 2008-2008; Biloxi, MS 2005-2008; NY, NY 1996-2005; Tampa, FL 1984-1996
Address History: San Diego, CA 2005-2009; San Fran, CA 2005-2005; Phoenix, AZ 1990-2005; San Jose, CA 1982-1990
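One way to read this slide is that two records with the same name and birth date can still be told apart when their address histories place them in different cities during the same years. The sketch below counts those conflicting years. It assumes each history is a list of `(city, start_year, end_year)` tuples and treats a shared boundary year as a move, not a conflict; the function name and that convention are illustrative choices, not something the slides specify.

```python
def conflicting_years(arc_a, arc_b):
    """Sum the years in which two address histories put a person in different cities."""
    total = 0
    for city_a, start_a, end_a in arc_a:
        for city_b, start_b, end_b in arc_b:
            if city_a == city_b:
                continue  # same city in the same period is not a conflict
            overlap = min(end_a, end_b) - max(start_a, start_b)
            if overlap > 0:  # touching at a single boundary year does not count
                total += overlap
    return total


# The two address histories from the slide.
arc_a = [("Tampa, FL", 2008, 2008), ("Biloxi, MS", 2005, 2008),
         ("NY, NY", 1996, 2005), ("Tampa, FL", 1984, 1996)]
arc_b = [("San Diego, CA", 2005, 2009), ("San Fran, CA", 2005, 2005),
         ("Phoenix, AZ", 1990, 2005), ("San Jose, CA", 1982, 1990)]

print(conflicting_years(arc_a, arc_b))
```

A large nonzero result suggests two different people despite the matching name and birth date. A result of zero does not prove the records describe the same person.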
22
Space-Time-Travel
23
Space-Time-Travel
Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone
This data is being “de-identified” and shared with third parties – in volume and in real-time
Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours) and who you spend your time with
Re-identification (figuring out who is who) is somewhat trivial
24
Analytic Superfood for Prediction
Route suggestions pushed to drivers, just-in-time, to avert significant traffic events
Search results optimized using personalized life arc forecasts
A nation able to work right through an extreme global pandemic
25
And Other Predictions …
Prediction with 87% certainty where you will be next Thursday at 5:35pm
Names of the top 10 people you co-locate with, not at home and not at work
The Uberstan intelligence service preempts the next mass protest in real-time
A political opponent is crushed and resigns two days after announcing their candidacy
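To make the first prediction on this slide concrete, here is a deliberately naive sketch: count where someone has been seen for each (weekday, hour) slot and report the most frequent place with its observed share. The frequency model, the function names, and the sample data are assumptions for illustration only; the slides do not describe the method behind the 87% figure, and this sketch does not reproduce it.

```python
from collections import Counter, defaultdict
from datetime import datetime


def build_model(pings):
    """Group (datetime, place) pings into per-(weekday, hour) place counts."""
    model = defaultdict(Counter)
    for when, place in pings:
        model[(when.weekday(), when.hour)][place] += 1
    return model


def predict(model, when):
    """Return (most likely place, observed share) for a time slot, or None if unseen."""
    counts = model.get((when.weekday(), when.hour))
    if not counts:
        return None
    place, hits = counts.most_common(1)[0]
    return place, hits / sum(counts.values())


# Hypothetical history: four Thursdays in January 2011, between 5 and 6 pm.
pings = [
    (datetime(2011, 1, 6, 17, 30), "gym"),
    (datetime(2011, 1, 13, 17, 40), "gym"),
    (datetime(2011, 1, 20, 17, 10), "gym"),
    (datetime(2011, 1, 27, 17, 35), "office"),
]
model = build_model(pings)
print(predict(model, datetime(2011, 2, 3, 17, 35)))
```

The share returned is only the fraction of past observations in that slot, not a calibrated certainty.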
26
Consequences
Space-time-travel data is the ultimate biometric
It will enable enormous opportunity
It will unravel one’s secrets
It will challenge existing notions of privacy
And, it’s here now and more to come
27
Surveillance society is irresistible. And you are doing it.
GPS-enhanced search, free email, Facebook, etc.
28
Responsible innovation
Privacy by design
Better data protection
Data anonymization, active audit logs, etc.
29
Closing Thoughts
30
Wish This On The Adversary
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context]
31
Context Accumulation: The Way Forward
[Chart labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context Accumulation]
32
Geospatial-Enabled Intelligence … Today
[Diagram labels: Geospatial Analytics; Geospatial Visualization; Current Focus]
33
Geospatial-Enabled Intelligence … Tomorrow
[Diagram labels: Geospatial Visualization; Geospatial Analytics; Future Focus]
34
Big Data. New Physics.
More Data: Better prediction
– Fewer false positives
– Fewer false negatives
More Data: Bad data good
More Data: Less compute effort
35
Related Blog Posts
Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel
Puzzling: How Observations Are Accumulated Into Context
Big Data. New Physics.
Smart Sensemaking Systems, First and Foremost, Must be Expert Counting Systems
Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food!
Big Data Flows vs. Wicked Leaks
Data Finds Data
“Macro Trends: The Privacy and Civil Liberties Consequences … and Comments on Responsible Innovation” – My DHS DPIAC Testimony, September 2008
36
Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
JeffJonas@us.ibm.com
January 18th, 2011