Review Lecture DB 15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy Review Lecture Databases Phil Gibbons May 1, 2003.


1 Review Lecture DB 15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy Review Lecture Databases Phil Gibbons May 1, 2003

2 Review Lecture DB 05-01-03 2 DB Lecture #1
Topic: IrisNet System Overview
Paper: “IrisNet: An Architecture for Compute-Intensive Wide-Area Sensor Network Services” – Nath, Deshpande, Ke, Gibbons, Karp, Seshan
Highlights: First exposure to IrisNet, including a demo

3 DB Lecture #2
Topics: Distributed Databases & IrisNet Query Processing
Paper: “Cache-and-Query for Wide Area Sensor Databases” – Deshpande, Nath, Gibbons, Seshan
Highlights:
  Distributed Databases: Homogeneous vs. Heterogeneous, Fragmentation, Replication, Data Transparency, Distributed Transactions & Two Phase Commit, Concurrency Control, Distributed Query Processing (Joins)
  Details on IrisNet’s data partitioning and query processing
Good News: Have spent the entire semester learning about IrisNet. No need to review???
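Two phase commit, listed among the Lecture #2 highlights, can be sketched for the failure-free case as follows. This is a minimal in-memory illustration, not code from the lecture or from IrisNet: the `Participant` class, its `will_commit` flag, and the state names are all hypothetical, and a real protocol would also need logging, timeouts, and recovery.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical participant; will_commit stands in for its local vote."""
    name: str
    will_commit: bool = True
    state: str = "init"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work could be committed.
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def finish(self, commit: bool) -> None:
        # Phase 2: apply the coordinator's global decision.
        self.state = "committed" if commit else "aborted"


def two_phase_commit(participants) -> bool:
    # Phase 1 (voting): the coordinator asks every participant to prepare.
    votes = [p.prepare() for p in participants]
    # The global decision is commit only if every vote was yes.
    decision = all(votes)
    # Phase 2 (decision): the coordinator broadcasts the outcome.
    for p in participants:
        p.finish(decision)
    return decision
```

With two participants that both vote yes, `two_phase_commit` returns `True` and both end in the `committed` state; a single no vote makes every participant end in `aborted`.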

4 DB Lecture #3
Topics: Sensor Databases & Data Stream Systems
Papers:
  “The Design of an Acquisitional Query Processor for Sensor Networks” – Madden, Franklin, Hellerstein, Hong
  “Models and Issues in Data Stream Systems” – Babcock, Babu, Datar, Motwani, Widom
Highlights:
  Learned all about TinyDB. Compare issues and solutions of motes vs. internet-scale resource-rich sensing
  Overview of data streams models, algorithms, and systems. How Data Stream Management Systems differ from traditional DBMS.

5 Acquisitional Query Processing
How does the user control acquisition?
  Rates or lifetimes
  Event-based triggers
How should the query be processed?
  Sampling as an operator, Power-optimal ordering
  Frequent events as joins
Which nodes have relevant data?
  Semantic Routing Tree for effective pruning
  Nodes that are queried together route together
Which samples should be transmitted?
  Pick most “valuable”?
  Adaptive transmission & sampling rates
Adapted from slides ©Sam Madden

6 Attribute Driven Query Propagation
[Figure: routing tree with nodes 1, 2, 3, 4 and attribute intervals [1,10], [7,15], [20,40]]
Query: SELECT … WHERE a > 5 AND a < 12
Precomputed intervals = Semantic Routing Tree (SRT)
Early pruning
Adapted from slides ©Sam Madden
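The early pruning on this slide can be sketched as an interval-overlap test: a node forwards the query only to children whose precomputed interval can contain a matching value. The function names and the child labels "A", "B", "C" below are hypothetical (the slide does not say which interval belongs to which node), and this is an illustration of the idea rather than TinyDB's implementation.

```python
def overlaps(interval, lo, hi):
    """True if the closed interval [a, b] holds some v with lo < v < hi."""
    a, b = interval
    return a < hi and b > lo


def children_to_forward(child_intervals, lo, hi):
    """Return the children whose subtree may hold values in the open range (lo, hi)."""
    return [name for name, iv in child_intervals.items() if overlaps(iv, lo, hi)]


# Intervals from the slide, attached to hypothetical child labels.
srt = {"A": (1, 10), "B": (7, 15), "C": (20, 40)}

# SELECT ... WHERE a > 5 AND a < 12
forward_to = children_to_forward(srt, 5, 12)
```

Here `forward_to` contains "A" and "B"; the subtree with interval [20,40] cannot match the predicate, so the query is never sent there.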

7 Summary: DBMS versus DSMS
DBMS                                                      DSMS
Persistent relations                                      Transient streams
One-time queries                                          Continuous queries
Random access                                             Sequential access
“Unbounded” disk store                                    Bounded main memory
Only current state matters                                History/arrival-order is critical
Passive repository                                        Active stores
Relatively low update rate                                Possibly multi-GB arrival rate
No real-time services                                     Real-time requirements
Assume precise data                                       Data stale/imprecise
Access plan determined by optimizer, physical DB design   Unpredictable/variable data arrival and characteristics
Adapted from slides ©Rajeev Motwani
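Several DSMS properties from this comparison (continuous queries, sequential access, bounded main memory) show up even in a very small example. The sketch below keeps a running average over the most recent values of a stream; the class name and window size are hypothetical choices for illustration, not part of any system discussed in the lecture.

```python
from collections import deque


class SlidingWindowAverage:
    """Continuous query: average of the last window_size stream values."""

    def __init__(self, window_size):
        # Bounded memory: the deque never holds more than window_size items.
        self.window = deque(maxlen=window_size)
        self.total = 0.0

    def update(self, value):
        # Sequential access: each arriving value is seen exactly once.
        if len(self.window) == self.window.maxlen:
            # The oldest value is about to be evicted, so drop it from the sum.
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        # The answer is refreshed on every arrival, not computed once.
        return self.total / len(self.window)
```

Feeding the values 1, 2, 3, 4 into a window of size 3 yields the averages 1.0, 1.5, 2.0, 3.0; once the window is full, older values are forgotten, which a one-time query over a stored relation would not do.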

8 DB Lecture #4
Topic: XML Query Processing
Paper: “Relational Databases for Querying XML Documents: Limitations and Opportunities” – Shanmugasundaram, Tufte, He, Zhang, DeWitt, Naughton
Highlights: Approaches for storing XML in a relational DB, and processing XML queries using a relational DBMS

9 Motivation Recap
XML for sensor systems
+ Good for rich, heterogeneous data
+ Supports on-the-fly schema changes
+ Good for hierarchical data
+ Standard data exchange format
- Query processing is SLOW!
- In contrast, relational DBMS are highly reliable, scalable, optimized for performance, & have advanced functionality
Key research question: Can we store XML in a relational DB, and use a relational database system to process queries?

10 Storing and Querying XML Documents
[Figure: an XML Translation Layer sits above a Relational Database System. Labels: XML Schema, Relational Schema, Translation Information; XML Documents, Tuples; XML Query, SQL Query; Relational Result, XML Result]
Adapted from slides ©Jayavel Shanmugasundaram

11 Relational Schema Generation
[Figure: schema graph with nodes PurchaseOrder (id, customer), Date, Day, Month, Year, Item (name, cost), Quantity, Payment; edges labeled with cardinalities 1, ?, *]
Minimize: Number of joins for simple path expressions (of form /a/b/c)
Satisfy: Tables are normalized
Adapted from slides ©Jayavel Shanmugasundaram

12 Relational Schema Generation and XML Document Shredding
Any XML Schema X can be mapped to a relational schema R, and …
Any XML document XD conforming to X can be converted to tuples in R
Further, XD can be recovered from the tuples in R
What do you think of the approach, for IrisNet?
Exercise: What would the Parking Space Finder relational schema look like? Would there be many or few joins in queries?
Adapted from slides ©Jayavel Shanmugasundaram
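The round trip described here (document to tuples, tuples back to document) can be illustrated with a deliberately simple mapping. The sketch below stores one tuple per element in a single generic "edge" table; this is a swapped-in simplification, not the schema-driven mapping from the Shanmugasundaram et al. paper, and it ignores attributes and mixed content. The function names, the tuple layout, and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return tuples (node_id, parent_id, tag, text), one per element."""
    rows = []

    def visit(elem, parent_id):
        # Preorder numbering: a parent always gets a smaller id than its children.
        node_id = len(rows)
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


def rebuild(rows):
    """Recover the element tree from the tuples produced by shred()."""
    elems = {}
    root = None
    # Ascending node_id restores document order and guarantees parents come first.
    for node_id, parent_id, tag, text in sorted(rows, key=lambda r: r[0]):
        if parent_id is None:
            elem = root = ET.Element(tag)
        else:
            elem = ET.SubElement(elems[parent_id], tag)
        elem.text = text or None
        elems[node_id] = elem
    return root


# Hypothetical document, loosely in the spirit of a parking service.
doc = ("<lot><space><id>1</id><available>yes</available></space>"
       "<space><id>2</id><available>no</available></space></lot>")
rows = shred(doc)
recovered = ET.tostring(rebuild(rows), encoding="unicode")
```

In this generic layout a path expression such as /lot/space/id needs one self-join per step, which is the cost the slide's exercise asks about; a schema-aware mapping aims to reduce those joins.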

13 DB Lecture #5
Topics: XML Query Processing & Historical Synopses
Papers:
  “Updating XML” – Tatarinov, Ives, Halevy, Weld
  “Maintaining Time-Decaying Stream Aggregates” – Cohen, Strauss
Highlights:
  Mapping XML updates to Relational DBMS
  Synopses for Historical Queries: Non-decaying, Sliding window, More general decaying.
  Sampling, Distinct Value counting, Counting samples, Exponential histograms

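14 XML Data Tree
[Figure: XML data tree. bookdb contains a book [publisher=“spi”] and a publisher [ID=“spi”]. The book has a title (“IrisNet Stinks”) and two author elements, each with a name (“Nath”, “Ke”). The publisher has a name (“Student Publishing, Inc”). The book’s publisher attribute is an IDREF to the publisher’s ID.]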

15 Semantic Issues in XML Updates
IDs and IDREFs
  Can’t duplicate XML IDs
  Can’t leave dangling references
Non-deterministic (ambiguous) updates
  There is more than one way (XPath expression) to get to an XML element
  Would like to detect it at compile time
Adapted from slides ©Jayavel Shanmugasundaram
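The two ID/IDREF constraints can be checked after an update with a short scan of the document. The sketch below is an assumption-laden illustration: Python's ElementTree does not read DTD attribute types, so the caller has to say which attribute is the ID and which is the IDREF (the defaults follow the attribute names on the XML Data Tree slide), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_references(xml_text, id_attr="ID", ref_attr="publisher"):
    """Return (duplicate_ids, dangling_refs) for the given document."""
    root = ET.fromstring(xml_text)
    # Count every ID value so duplicates can be reported.
    ids = Counter(e.get(id_attr) for e in root.iter()
                  if e.get(id_attr) is not None)
    duplicates = sorted(i for i, n in ids.items() if n > 1)
    # A reference is dangling when no element carries the ID it points to.
    dangling = sorted({e.get(ref_attr) for e in root.iter()
                       if e.get(ref_attr) is not None
                       and e.get(ref_attr) not in ids})
    return duplicates, dangling
```

An update that deletes the publisher with ID “spi” while a book still refers to it would show up in the second list; inserting a second element with the same ID would show up in the first.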

16 Storing Historical Data
View as a Data Stream of values
Usually, not practical to store ALL past database values
  Want to limit the amount of storage used
Often a detailed, exact answer is not interesting:
  Prefer summarized data (aggregates, samples)
  Prefer to focus primarily on recent data
  Suffices to get the leading digits of aggregates correct
=> Keys to staying within the memory limitations
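One simple way to favor recent data while using constant storage is an exponentially decayed sum, where each value's weight halves after a fixed interval. The sketch below is only an illustration of that one decay function; it is not the algorithm from the Cohen and Strauss paper, which handles more general decay. The class name and `half_life` parameter are hypothetical, and timestamps are assumed to be non-decreasing.

```python
import math


class DecayedSum:
    """Sum of stream values, each weighted by 2 ** (-age / half_life)."""

    def __init__(self, half_life):
        self.rate = math.log(2) / half_life
        self.value = 0.0
        self.last_time = None

    def _advance(self, t):
        # Age the stored aggregate up to time t; no past values are kept.
        if self.last_time is not None:
            self.value *= math.exp(-self.rate * (t - self.last_time))
        self.last_time = t

    def add(self, t, x):
        """Record value x arriving at time t (t must not go backwards)."""
        self._advance(t)
        self.value += x

    def query(self, t):
        """Return the decayed sum as of time t."""
        self._advance(t)
        return self.value
```

With a half-life of 10, a value of 8.0 added at time 0 contributes about 4.0 at time 10 and about 2.0 at time 20, so the aggregate tracks recent history using two stored numbers.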

17 Sample Exam Questions
IrisNet understanding: Describe a circumstance under which IrisNet’s query-routing approach fails to route a query to the lowest common ancestor (LCA) of the query answer
IrisNet in context of distributed DB issues: Sketch what happens in a two phase commit protocol when there are no failures? Why isn’t this protocol needed for IrisNet’s Parking Space Finder service?
TinyDB: What is a Semantic Routing Tree in TinyDB?
Reason about TinyDB vs. IrisNet: Is acquisitional query processing worth incorporating into a sensor system, like IrisNet, where power is not an issue? Explain

18 Sample Exam Questions
XML vs. Relational tradeoffs
XML on Relational DB issues: Briefly describe the Table-based insert algorithm in the Updating XML paper
Data Streams & Historical Queries: Consider a data stream where the past well-represents the future. Briefly explain how query processing can be greatly simplified for such a stream

