Wrapup Amol Deshpande CMSC424. “Inventing the Future” Wednesday at 3:30pm 1115 CSIC Exam.


1 Wrapup Amol Deshpande CMSC424

2 “Inventing the Future”
  Wednesday at 3:30pm, 1115 CSIC
  http://www.cs.umd.edu/projects/ITF/
  Exam

3 DBMS at a glance
Data Models
  • Conceptual representation of the data
Data Retrieval
  • How to ask questions of the database
  • How to answer those questions
Data Storage
  • How/where to store data, how to access it
Data Integrity
  • Manage crashes, concurrency
  • Manage semantic inconsistencies
Not a fully disjoint categorization!

4 DBMS at a glance
Data Models
  • E/R Model, Relational model
  • Very simple and hence effective
  • Easy to make things complicated, very hard to keep them simple
  • No other data model has survived for so long
  • What is the future of XML?

5 DBMS at a glance
Data Retrieval
  • How to ask questions of the database
    – Declarative languages are great
    – Hide complexity from users, can optimize things, can evolve easily
    – SQL: more or less declarative
  • How to answer those questions
    – Parsing --> Optimization --> Processing
    – Operators: hashing, sorting, joins, aggregation
    – Data structures
        Hash indexes: good for equality queries
        Tree indexes: for everything else
    – Optimization: complex, but a key piece of a database system
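The hash-versus-tree distinction on this slide can be illustrated with a small sketch. It is not how a DBMS implements indexes: a Python dict stands in for a hash index, and a sorted list searched with `bisect` stands in for a tree index. The table, its columns, and the function names are hypothetical.

```python
import bisect
from collections import defaultdict

# Hypothetical toy table of (student_id, gpa) rows; purely illustrative.
rows = [(101, 3.2), (102, 3.9), (103, 2.7), (104, 3.9), (105, 3.5)]

# Hash index on gpa: key -> list of row positions. Supports equality only.
hash_index = defaultdict(list)
for pos, (_, gpa) in enumerate(rows):
    hash_index[gpa].append(pos)

# Sorted (key, position) pairs stand in for a tree index: keys are kept in
# order, so both equality and range predicates can use binary search.
sorted_index = sorted((gpa, pos) for pos, (_, gpa) in enumerate(rows))
sorted_keys = [key for key, _ in sorted_index]


def lookup_equal(gpa):
    """Equality query answered by the hash index."""
    return [rows[pos] for pos in hash_index.get(gpa, [])]


def lookup_range(lo, hi):
    """Range query lo <= gpa <= hi answered by the ordered index."""
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return [rows[pos] for _, pos in sorted_index[start:end]]
```

A hash index has no notion of key order, so `lookup_range` cannot be answered from `hash_index` without scanning every key; the ordered structure handles both kinds of query.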

6 DBMS at a glance
Data Storage
  • How/where to store data, how to access it
  • Need to be cognizant of the memory hierarchy
    – Memory is cheap to access, disk is very expensive to access
    – Further, disk is cheap to access sequentially, much more expensive to access randomly
    – Many of our decisions are influenced by this
  • RAID: surviving failures
  • Accessing data: indexes
  • What happens if a new form of storage comes along with different properties (say, holographic storage)?
    – We will need to rethink the tradeoffs, but we now know the approach
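The sequential-versus-random gap can be made concrete with a back-of-the-envelope cost model. The numbers below (10 ms per seek, 100 MB/s transfer rate, 8 KB pages) are assumed round figures for a spinning disk, not measurements from the course, and the function names are hypothetical.

```python
def sequential_read_ms(num_pages, page_kb=8, seek_ms=10.0, transfer_mb_per_s=100.0):
    """One seek to reach the first page, then pure transfer time."""
    transfer_ms = num_pages * page_kb / 1024 / transfer_mb_per_s * 1000
    return seek_ms + transfer_ms


def random_read_ms(num_pages, page_kb=8, seek_ms=10.0, transfer_mb_per_s=100.0):
    """Every page pays a full seek before it is transferred."""
    per_page_transfer_ms = page_kb / 1024 / transfer_mb_per_s * 1000
    return num_pages * (seek_ms + per_page_transfer_ms)


if __name__ == "__main__":
    pages = 1000
    print(f"sequential: {sequential_read_ms(pages):.1f} ms")
    print(f"random:     {random_read_ms(pages):.1f} ms")
```

Under these assumptions, reading 1000 pages at random costs over a hundred times as much as reading them sequentially. A storage medium with a different seek cost would shift this ratio and, with it, the design tradeoffs.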

7 DBMS at a glance
Data Integrity
  • Manage crashes, concurrency
    – Transactions, 2-phase locking
    – Write-ahead logging
    – DBMS is pretty much the last word on concurrency/recovery
    – OSs don’t come close to supporting anything like that
  • Manage semantic inconsistencies
    – Normalization, FDs
    – Not easy to identify tools, but we have learned how to think about them
    – Try to capture them in the E/R diagram as much as possible
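The all-or-nothing behaviour that transactions provide can be seen in a short sketch using Python's standard `sqlite3` module. It shows atomicity only; locking and logging happen inside the engine and are not visible here. The `account` table and the `transfer` and `balances` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
```

The second transfer credits `bob` before the debit on `alice` fails, yet the rollback leaves both balances as they were after the first transfer.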

8 Motivation: Data Overload
We began the first lecture by discussing the data overload
  • Huge amounts of data generated every day
  • Much faster than our ability to process it
  • Increasing ability to capture more enterprise data
  • Web, blogs, RSS feeds, etc.
  • Multimedia
    – Flickr and cellphone cameras have led a revolution in how people take pictures
    – Videos will be next
    – Not hard to imagine capturing every moment of your life
  • Sensor/RFID data
    – Tiny sensors/RFID tags are just beginning to become ubiquitous
    – Billions of these, each generating a tiny amount of data every second, is still too much
  • Biological/scientific data

9 Motivation: Data Overload
Relational databases help for structured data
  • But increasingly not sufficient
  • The things we want to do with data can’t be expressed in SQL
    – E.g. with biological data, web
  • Too much unstructured data
  • Distributed data generation creates additional headaches
    – Almost impossible to try to collect the data in one location
Making sense of this requires not only advances in data processing, but also in data understanding/mining
  • Interdisciplinary efforts

10 Some Lessons from RDBMS
But we can use the lessons learned from developing RDBMSs
  • Data independence / abstraction is good
    – Hide details, even if initially it leads to inefficiency
  • Look for structure
    – Even seemingly highly unstructured data might have structure
  • Look for patterns in usage
    – Relational databases are fast because query processing is predictable
    – Unlike, say, OS workloads, which are very hard to optimize for
    – If you can identify patterns, you can probably optimize them
  • Declarative languages are great
    – Say what you want, not how to get it
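"Say what you want, not how to get it" can be illustrated by computing the same result twice: once as a SQL query, once as a hand-written loop. This is a minimal sketch using Python's standard `sqlite3` module; the `enrolled` table, the student names, and the grades are made up for illustration.

```python
import sqlite3

# Hypothetical enrollment data: (student, course, grade).
data = [
    ("ann", "CMSC424", 4.0),
    ("raj", "CMSC424", 3.0),
    ("ann", "CMSC420", 2.0),
    ("raj", "CMSC420", 4.0),
    ("lee", "CMSC420", 3.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (student TEXT, course TEXT, grade REAL)")
conn.executemany("INSERT INTO enrolled VALUES (?, ?, ?)", data)

# Declarative: state the result wanted; the engine picks the access path,
# grouping strategy, and evaluation order.
declarative = conn.execute(
    "SELECT course, AVG(grade) FROM enrolled GROUP BY course ORDER BY course"
).fetchall()

# Imperative: spell out how to compute it, one step at a time.
totals = {}
for _, course, grade in data:
    running_sum, count = totals.get(course, (0.0, 0))
    totals[course] = (running_sum + grade, count + 1)
imperative = sorted((course, s / n) for course, (s, n) in totals.items())
```

Both versions produce the same per-course averages, but only the SQL version leaves the system free to change how the answer is computed, for example by using an index added later, without the query being rewritten.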

