Open Data – reflections from behind the Big Firewall Or, may you be cursed to live in interesting times.



1 Open Data – reflections from behind the Big Firewall Or, may you be cursed to live in interesting times

2 Open Data …. Why bother?
Open Contributed Content will become a core, strategic, economic resource – and the most accessible & scalable resource we possess.
Mobility, Openness & Connection will matter more than Presence & Rigid Structures.
In 2013 expect generation of >850 Exabytes of Internet data, mostly user contributed content (versus traditional enterprise sources).
Global access to technology is already driving trends like ‘virtual citizenship’, ‘virtual employment’ & ‘social innovation’.
On-demand interaction will increasingly be the norm for a global community of virtual innovators … who expect their user experience to be as simple as ‘using an appliance’.

3 Open Data and Economics or …. ‘Greater Fool Investing’ …..!!
Open data is a potential new 'raw material' for economic growth. It requires effort to produce and maintain.
Unlike traditional raw materials like oil, gas and minerals, its value increases fastest when it is open and shareable.
Bubble … "trade in high volumes at prices that are considerably at variance with intrinsic values".
Open Data alone does not generate direct economic benefit sufficient to offset production & operational costs … the question is … can it generate sufficient ‘value’ to be sustainable?
Incentives must be in place to sustain “economically significant” amounts of Open Data.
Some bright lights … but we need answers before we run out of steam!!

4 How Private is Private?
Privacy is not absolute, it is a balance between Risk and Utility.
Open Data usage is inherently contradictory:
Social media usage -> Maximize Utility + (Largely) Ignore Risk
Enterprise usage -> Maximize Utility + Minimize Risk
Who carries liability in case of dispute?
Uncertainty in usage policies is a substantial form of business risk.
Recognize in policy and legislation that privacy is mutable - based on context.
✔ Available Open Data useful to identify & characterize group behaviors
✖ Negative usage for ‘nuisance’ providers to identify high-value targets
{ ∃ (high value residences)} ∩ { ∃ (long emergency response time)} ∩ { ∃ (many local area crimes)} -> {area where people might buy home security products}
(all available on open data sites near you)
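The set expression on this slide can be sketched as a plain set intersection. This is a minimal illustration, assuming three hypothetical open datasets that each reduce to a set of area identifiers; the area names, variable names and the `likely_target_areas` helper are invented for the example and do not come from any real dataset or from the presentation.

```python
# Hypothetical area identifiers; none of these values come from a real dataset.
high_value_residences = {"area_a", "area_b", "area_c", "area_e"}
long_emergency_response = {"area_b", "area_c", "area_d"}
many_local_crimes = {"area_c", "area_b", "area_f"}


def likely_target_areas(*criteria: set) -> set:
    """Return the areas that appear in every criterion set."""
    if not criteria:
        return set()
    return set.intersection(*criteria)


targets = likely_target_areas(
    high_value_residences, long_emergency_response, many_local_crimes
)
print(sorted(targets))
```

Each input set is unremarkable on its own; it is the combination that singles out areas, which is the risk the slide points to.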

5 A Fun Use Case

6 Challenges for Privacy in an Open Data World
And I haven’t even mentioned Trust, Provenance, Security, ……

7 Research impact: what we have learned so far
Data
– 100’s of datasets, 1000’s of files
– Very open domain(s)
– Very expensive to normalize
– Scaling complexity from high dimensionality
Approach
– Pay-as-you-go approach, only process what you need
– Do not stick to a common model, use any you can find
– Generate interesting views and feed them to “analytics”
Lessons learned
– Multiple models, depending on context
– Need to do things incrementally
– Lightweight generally better than heavyweight
Selected research results:
- Live deployment in Dublin
- Won prize in Semantic Web Challenge
- Paper at ISWC
- Paper at Hypertext
- Invited paper at Journal of Web Semantics
There are plenty of interesting challenges!!
Documents + Metadata, Structure, Entities, Links, Views, Insight …. Pay-as-you-go, Gain-as-you-go
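The pay-as-you-go idea on this slide (only process what you need) can be sketched with lazy evaluation. This is a rough illustration under stated assumptions, not the presenters' implementation: the dataset names, the record contents, and the `normalize` and `build_view` helpers are all hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical raw records; real open datasets would be files in many formats.
RAW_DATASETS: Dict[str, list] = {
    "bus_stops": [{"Name": " Stop 1 "}, {"Name": "Stop 2"}],
    "parking": [{"Name": "Car Park A"}],
    "libraries": [{"Name": "Library X "}],
}

processed_log = []  # records which datasets were actually normalized


def normalize(dataset_id: str) -> Iterator[dict]:
    """Lazily clean one dataset; nothing runs until a caller iterates."""
    processed_log.append(dataset_id)
    for record in RAW_DATASETS[dataset_id]:
        yield {key.lower(): value.strip() for key, value in record.items()}


def build_view(needed: Iterable[str]) -> Dict[str, list]:
    """Process only the datasets a particular view asks for."""
    return {dataset_id: list(normalize(dataset_id)) for dataset_id in needed}


view = build_view(["bus_stops"])
print(view)
print(processed_log)
```

Only the dataset requested by the view is normalized; the other two are never touched, so the cost of cleaning is paid incrementally as new views are needed.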

8 Dublinked - Towards a robust test-bed for Open Data Research
Dublin City Enterprise Platform
– IBM Connections: Social Media & Collaboration
– IBM IOC: Interaction with Industry Solutions
– IBM Enterprise Cloud: Scalable compute, storage & network infrastructure
– Provider 1…N
Open REST Web Services API
– Catalog & Navigation
– Search & Query
– Privacy & Security
– Knowledge Representation & Reasoning
– Publication & Annotation
– Visualization & Analytics
Enterprise, Citizen, IBM Products & Services
Challenges include..
– Robust models to organize and represent resources and their context
– Scalable privacy and security of resources
– Automated assimilation and sharing of resources
– Compose resources for development, mash-up & visualization
– Represent knowledge efficiently for continuous machine reasoning and diagnosis
Key: IBM Research Partners & People
Research Testbed

9 What we do: Learning Systems to Help Diagnose the City
Problem
How can we provide City decision makers with explanations and diagnoses for events by applying machine reasoning techniques to a fusion of massive, rich, complex and dynamic data? How can we move from explanation to prediction?
Challenges
– Identifying relevant data and information
– Capturing and representing anomalies
– Correlating knowledge on heterogeneous data sources
– Advanced fusion of heterogeneous data from multiple sources
Goals
– Identification of the nature and cause of changes
– Explaining logical connection of knowledge across space and time
– Move from explanation to prediction
Anomaly Detected: Delayed buses, congested roads
Detection to Diagnosis?

10 Outline Research Roadmap
Roadmap diagram labels: Use Cases; Technology; Provenance; Privacy; High-volume distributed querying; Wide-scale distributed querying; Distributed Entity Linking; Fine-grain Access Control; Streaming Analytics; Distributed Reasoning; Context Mining; Lightweight Distributed Information Access; Contextual Access; Basic Access Control; Distributed Entity Consolidation; Graph Access; Linked Data Cloud; Context Retrieval; Cross-agency Context Retrieval; Cross-agency Analytics; Cross Web-Enterprise Analytics; Many-agency Analytics; Public Safety Integrator; Life analytics (social/health/public safety); High-risk/time-critical alerting; Cross-agency Alerting; Data Warehouse; Dynamic Distributed Information Analytics

