Provenance in Sensornet Republishing
Unkyu Park and John Heidemann
University of Southern California, Information Sciences Institute
June 18, 2008

Presentation transcript:


Why Sensornet Provenance?
Growing amount of sensornet data
– In isolated sensornets?
– Today, reuse of data and collaboration are rare
Sharing is important
– Use the Internet to share sensor data
– Multiple steps, different users
Provenance for sensornets
– Support tracking data back to its source
– Encourage sharing

Sensor-Internet
Goals
– Share and search across many independently running sensor networks
– Allow users to process and share transformed data
Components (figure labels)
– sensors: sense the environment (mote sensornets, mobile phones, or personal computers)
– sensor store: repository for all data
– republisher: transforms the existing data
– sensor-search: indexes data and supports sensornet discovery
– users, connected over the Internet
[S. Reddy, G. Chen, B. Fulkerson, S. J. Kim, U. Park, N. Yau, J. Cho, M. Hansen, and J. Heidemann. Sensor-Internet Share and Search: Enabling Collaboration of Citizen Scientists. In Data Sharing and Interoperability on the World-wide Sensor Web, IPSN 2007, April 2007]

How Can Sensornet Provenance Help?
Sensor data sharing
Problem: A user detects an abnormality on the map
Q. What causes the problem?
1. Check the transformation
2. Check the input
3. Find an abnormal sensor reading
4. Check the transformation and input
5. Find the image recognition problem (74.3 → 94.3)
Figure labels: Temperature Sensor (Raw 87.1 ? ?), Image Recognition, Digit Repair (Fixing Digits; Corrected 87.1 ± ±0.5), TempMap (Interpolate point data into a complete temperature map)

Building Sensornet Ecosystem
Collaborative processing
– Encourage users who use the same data to collaborate
– Participatory sensing
Search over the provenance
– Exploit the provenance to identify high-quality sensor data
Figure labels: Temperature Sensor (Raw 87.1 ? ?), Digit Repair, TempMap

Challenges in Sensornet Provenance
Sensor data are distributed across many data providers
– Need: distributed data management and authorization
  – Locate the distributed sensor data
  – Support distributed authorization in tracking provenance
Each sensor data item is often small
– Need: efficient provenance storage
  – Scale the provenance storage according to the sensor data size
Sensor data keeps arriving
– Need: stream-aware provenance
  – Record the temporal location in the stream

Sensor Provenance Goals and Contributions
Goals
– End users can follow data back to the original source
– Observe each step of processing
Contributions
– Provenance via new linking scheme (distributed data management)
– User-centric access control (distributed authorization)
– Incremental compression (provenance storage)
– Stream-aware provenance

Outline
Motivation
Sensornet Provenance
Evaluation
– Prototype deployment
– Storage cost
– Compression alternatives
– Ease-of-use provenance

Design Choice of Sensornet Provenance
Representation
– annotation vs. inversion
– content vs. link
Granularity
– tuple-level (fine-grained) vs. table-level
Consistency (stream-aware provenance)
– timestamping to handle sensor data that keeps arriving
Authorization
– The data generator controls data access
– Pass a “letter of reference” to the owner

Predecessor Links
Purpose: locate sensor data across different administrations
Fine-grained, annotation-based, timestamped links
– Source location
  – Location of the source repository
  – Table at that repository
  – Search from the table
– Timestamp
  – To replay a relative query and produce the same result
– Transformation
  – A pointer to a general description, source code, or executable programs
An example
&x="sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData&a1="datetime,temperature"&a2=p_97_temperature&a3=‘sensorid="sum-in"’&a4=0&a5=1 &t=" :00:00”
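The three parts of a predecessor link can be sketched as a small record that serializes to a query-string form. This is only an illustration, not the authors' implementation: the keys x and t echo the example on the slide, while the key f for the transformation, the class name, and the sample values are assumptions.

```python
from dataclasses import dataclass
from urllib.parse import urlencode, parse_qs


@dataclass(frozen=True)
class PredecessorLink:
    """Hypothetical record for one tuple-level predecessor link."""
    source: str          # repository location plus the query selecting the tuple
    timestamp: str       # when the source was read, so the query can be replayed
    transformation: str  # pointer to a description, source code, or executable

    def encode(self) -> str:
        # Percent-encode each field so '&' and '=' inside the source query survive.
        return urlencode({"x": self.source,
                          "t": self.timestamp,
                          "f": self.transformation})  # 'f' is an assumed key name

    @classmethod
    def decode(cls, text: str) -> "PredecessorLink":
        fields = parse_qs(text, strict_parsing=True)
        return cls(fields["x"][0], fields["t"][0], fields["f"][0])


link = PredecessorLink(
    source="sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData&a2=p_97_temperature",
    timestamp="2008-06-18 10:00:00",                  # illustrative value only
    transformation="http://example.org/digit-repair",  # hypothetical pointer
)
restored = PredecessorLink.decode(link.encode())
```

Keeping the timestamp in the link is what lets a later reader replay the same query against a stream that has kept growing.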

Letter of Reference
Purpose: provide easy-to-use authorization
Sensor-store security model
– Public
– Case-by-case basis
Letter of reference
– Contextual information about the data requestor
  – User’s activities: collaboration with others, data sharing activities
  – How the user encountered the provider’s data
Authentication
– Provide this context to inform the data owner
– The owner makes a decision based on it

Outline
Motivation
Sensornet Provenance
Evaluation
– Prototype deployment
– Storage cost
– Compression alternatives
– Ease-of-use provenance

Prototype Deployment
Deployment
– Provenance system
– Sensors
– Sensor-store
Prototype republishers
– Digit repair
– Digit repair with image
– TempMap
Figure labels: West L.A. Temperature (publishing to sensorbase; Raw 87.1 ? ?), Image Recognition, Digit Repair and Repair with image (Fixing Digits; Corrected 87.1 ± ±0.5), TempMap (Interpolate point data into a complete temperature map), each republishing

Storage Alternatives
Alternatives
– copy source
– uncompressed links
– compressed links
Small source and data
– Copying source works well
– Uncompressed link is verbose, larger than the data
– With compression, cost equals copying source
Figure: Digit Repair (small source and republished data)

Benefits Depend on the Size of Source
Copying source is expensive when the source is large
Compressed link works well in all three cases
Figures: Repair with Image (large source and small republished data); TempMap (small source and large republished data)

Link Compression
We showed that link compression is important, so what are the compression alternatives?
Compression alternatives
– No compression
– Per-link
– Incremental
  – Exploit redundancy across predecessor links
  – 83% storage saving compared to no compression
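The difference between per-link and incremental compression can be sketched with Python's zlib. The slides do not say which compressor the prototype used, so zlib is an assumption here, and the generated links (row offsets and timestamps) are made-up data that only mimic the shape of the slide's example. The incremental variant keeps one compressor alive across links and emits a sync flush after each, so later links are encoded against earlier ones while each link remains individually recoverable in order.

```python
import zlib


def make_links(n):
    # Hypothetical predecessor links differing only in row offset and hour.
    template = ('sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData'
                '&a1="datetime,temperature"&a2=p_97_temperature'
                '&a4={row}&a5=1&t="2008-06-18 {hour:02d}:00:00"')
    return [template.format(row=i, hour=i % 24).encode() for i in range(n)]


def size_uncompressed(links):
    return sum(len(link) for link in links)


def size_per_link(links):
    # Each link is compressed on its own; redundancy across links is not used.
    return sum(len(zlib.compress(link)) for link in links)


def compress_incremental(links):
    # One shared compressor; Z_SYNC_FLUSH ends each link on a byte boundary
    # without discarding the history window built from earlier links.
    comp = zlib.compressobj()
    chunks = [comp.compress(link) + comp.flush(zlib.Z_SYNC_FLUSH)
              for link in links]
    chunks.append(comp.flush())  # stream trailer
    return chunks


def decompress_incremental(chunks):
    decomp = zlib.decompressobj()
    # The last chunk is only the trailer and carries no link data.
    return [decomp.decompress(chunk) for chunk in chunks[:-1]]


links = make_links(100)
chunks = compress_incremental(links)
size_incremental = sum(len(chunk) for chunk in chunks)
```

Comparing size_uncompressed(links), size_per_link(links), and size_incremental on your own link data shows how much of the saving comes from redundancy across links rather than within one link; the numbers will differ from the 83% reported on the slide.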

Ease-of-use: Provenance
Provenance extension
– Sensorbase.org
– Predecessor links
Easy source tracking
– A single click tracks the source data
Figure labels: provenance (a list of predecessor links), source data, provenance of the source data
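Source tracking amounts to following predecessor links repeatedly until an item with no predecessors is reached. A minimal sketch, assuming an in-memory dictionary stands in for the distributed sensor stores; the item names and the store layout are hypothetical.

```python
def trace_to_source(store, item_id):
    """Return every item reachable from item_id by following predecessor links."""
    visited, order, stack = set(), [], [item_id]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # an item shared by several republishers is listed once
        visited.add(current)
        order.append(current)
        stack.extend(store[current]["predecessors"])
    return order


def original_sources(store, item_id):
    # Items without predecessors are the raw sensor readings.
    return [i for i in trace_to_source(store, item_id)
            if not store[i]["predecessors"]]


# Hypothetical chain modeled on the talk's running example.
store = {
    "tempmap": {"predecessors": ["digit_repair"]},
    "digit_repair": {"predecessors": ["raw_temperature"]},
    "raw_temperature": {"predecessors": []},
}
path = trace_to_source(store, "tempmap")
```

In a deployed system each lookup in store would be a remote query to the repository named in the link, possibly gated by the authorization step.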

Ease-of-use: Authorization
Easy, user-centric, distributed access control
If accessing source data requires authentication
– Have an account? Yes / No
– Generate a letter of reference (predecessor link, user account, target, user’s activities)
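The letter of reference can be sketched as a record carrying the four fields named on the slide, handed to a decision function supplied by the data owner. Everything beyond those four fields is an assumption: the function names, the sample policy, and the sample values are hypothetical, and the account branch from the figure is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LetterOfReference:
    predecessor_link: str   # how the requestor encountered the data
    user_account: str       # who is asking
    target: str             # the source data being requested
    activities: List[str] = field(default_factory=list)  # sharing history


def request_access(requires_auth: bool,
                   letter: LetterOfReference,
                   owner_decides: Callable[[LetterOfReference], bool]) -> bool:
    # Public data needs no letter; otherwise the owner decides case by case.
    if not requires_auth:
        return True
    return bool(owner_decides(letter))


def sample_policy(letter: LetterOfReference) -> bool:
    # Hypothetical owner policy: grant access to users with some sharing history.
    return len(letter.activities) > 0


letter = LetterOfReference(
    predecessor_link="sb://sensorbase.org/...",   # elided, illustrative only
    user_account="alice",
    target="p_97_temperature",
    activities=["republished TempMap"],
)
granted = request_access(True, letter, sample_policy)
```

The point of the design is that the owner sees context about the requestor rather than only an account name, so the decision stays with the data generator.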

Conclusions
Sensor republishing will become an important means to share sensor data
New provenance for sensornets
– Provenance via new linking scheme
– Easy, user-centric, distributed access control
– Compression makes tuple-level provenance reasonable