Semantic Data Caching and Replacement. Outline Motivation Client Caching Architecture Model of Semantic Caching Simulations and Results Conclusion and.

Presentation transcript:

Semantic Data Caching and Replacement

Outline Motivation Client Caching Architecture Model of Semantic Caching Simulations and Results Conclusion and Future Work

Motivation Distributed database. Clients are high-end workstations (fat clients): high computational power, big local storage.

Motivation (Contd.) Effective use of a client is the key to achieving high performance. Less network traffic. Faster response time. Higher server throughput. Better scalability.

Client Caching Architecture Data-Shipping. Clients process query. Data is brought on-demand from servers. Navigational access. Object ID (Tuple ID or Page ID). Can be categorized as tuple-based or page-based Cache Replacement Policies: LRU. MRU.

Client Caching Architecture (Contd.) Data-Shipping. Problem. Applications require associative access to data, that is, as provided by relational query languages.

Client Caching Architecture (Contd.) Query-Shipping. Associative access to data. Problems. Implementations do not support client caching.

Client Caching Architecture (Contd.) Semantic Caching. A model that integrates support for associative access into an architecture based on data-shipping. Advantage. Exploit the semantic information to effectively manage client cache.

Client Caching Architecture (Contd.) Semantic Caching. Cached data is described semantically rather than by record ID or page ID. The description can be used to generate a remainder query to send to the server if the requested tuples are not available locally. Information for replacement is maintained per semantic region: low overhead, insensitive to bad clustering. Cache replacement uses a value function based on the semantic description, not just LRU or MRU.

Architecture | Data Granularity | Missing Data | Cache Replacement
Page Caching | Group | Faulting | Temporal locality (LRU, MRU); Spatial locality (Clustering)
Tuple Caching | Single | Faulting |
Semantic Caching | Dynamically Group | Remainder Queries | Semantic Locality

Model of Semantic Caching Remainder Query Semantic Regions Replacement Issues

Remainder Query Relation Re, query Q, client cache V. Probe query P(Q,V) = Q ∧ V can be answered locally. Remainder query R(Q,V) = Q ∧ (¬V) should be sent to the server. Example: Select * from E where salary > 30,000 and salary < 60,000. The client caches all tuples with salary < 50,000. Q = (salary > 30,000) ∧ (salary < 60,000). V = (salary < 50,000). P = (salary > 30,000) ∧ (salary < 50,000). R = (salary >= 50,000) ∧ (salary < 60,000). [Figure: relation Re with cached region V and query Q, split into probe part P and remainder part R]
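
The split of a query into a probe part and a remainder part can be sketched for the single-attribute case. This is only an illustrative sketch, not the presenters' implementation: the `Range` class and the `probe` and `remainder` functions are hypothetical names, and predicates are assumed to be half-open intervals on one attribute (lower bound inclusive, upper bound exclusive), which differs slightly from a strict lower bound.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Range:
    """Hypothetical predicate: lo <= salary < hi (half-open interval)."""
    lo: float
    hi: float

    def is_empty(self) -> bool:
        return self.lo >= self.hi


def probe(q: Range, v: Range) -> Optional[Range]:
    """P(Q, V) = Q AND V: the part of Q answerable from the cache V."""
    r = Range(max(q.lo, v.lo), min(q.hi, v.hi))
    return None if r.is_empty() else r


def remainder(q: Range, v: Range) -> List[Range]:
    """R(Q, V) = Q AND NOT V: the parts of Q that must go to the server."""
    below = Range(q.lo, min(q.hi, v.lo))   # part of Q left of the cached range
    above = Range(max(q.lo, v.hi), q.hi)   # part of Q right of the cached range
    return [p for p in (below, above) if not p.is_empty()]


# Cache holds salary < 50,000; the query asks for salaries from 30,000 up to 60,000.
V = Range(float("-inf"), 50_000)
Q = Range(30_000, 60_000)
print(probe(Q, V))      # cached part: 30,000 up to 50,000
print(remainder(Q, V))  # server part: 50,000 up to 60,000
```

With these inputs only the interval from 50,000 to 60,000 is requested from the server, which matches the remainder R on the slide.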

Semantic Regions Cache management and replacement unit. Grouped by semantic value. Each semantic region has a single replacement value. Described by a constraint formula. Consideration: semantic region merge (always merge). [Figure: (a) original regions; (b) regions after Q]

Replacement Issues Temporal locality LRU, MRU
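
The two recency-based policies named above differ only in which end of the access order is evicted. The following is a minimal sketch under assumptions: `RecencyCache` is a hypothetical class, and cached items are opaque keys (pages, tuples, or regions) rather than anything specific to the simulated system.

```python
from collections import OrderedDict


class RecencyCache:
    """Fixed-capacity cache that evicts by recency: LRU or MRU (illustrative)."""

    def __init__(self, capacity: int, policy: str = "LRU"):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if policy not in ("LRU", "MRU"):
            raise ValueError("policy must be 'LRU' or 'MRU'")
        self.capacity = capacity
        self.policy = policy
        self._items = OrderedDict()  # ordered from oldest to newest access

    def get(self, key):
        """Return the cached value (or None) and mark the key as just used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Insert a value; return the evicted key, or None if nothing was evicted."""
        if key in self._items:
            self._items[key] = value
            self._items.move_to_end(key)
            return None
        evicted = None
        if len(self._items) >= self.capacity:
            # LRU removes the oldest access (front); MRU removes the newest (back).
            evicted, _ = self._items.popitem(last=(self.policy == "MRU"))
        self._items[key] = value
        return evicted
```

For example, after inserting `a` and `b` into a two-slot cache and then reading `a`, inserting `c` evicts `b` under LRU and `a` under MRU.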

Replacement Issues (Contd.) Semantic locality: Manhattan distance. (Note) Manhattan distance, definition: the distance between two points measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|. [Figure: points p1 and p2 with corner point O, where |p1 p2| = |p2 O| + |p1 O|]
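
A distance-based replacement choice can be sketched as follows. The slide does not say exactly how the distance is turned into a replacement value, so this sketch assumes that each region is represented by a center point and that the region farthest from the most recent query is evicted first; `manhattan` and `choose_victim` are hypothetical names.

```python
from typing import Dict, Hashable, Sequence


def manhattan(p: Sequence[float], q: Sequence[float]) -> float:
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def choose_victim(region_centers: Dict[Hashable, Sequence[float]],
                  last_query_center: Sequence[float]) -> Hashable:
    """Pick the region whose center is farthest from the latest query.

    Assumption: a larger distance means a lower replacement value.
    """
    return max(region_centers,
               key=lambda rid: manhattan(region_centers[rid], last_query_center))


regions = {"r1": (1, 1), "r2": (9, 9), "r3": (4, 2)}
print(manhattan((1, 2), (4, 6)))       # |1 - 4| + |2 - 6| = 7
print(choose_victim(regions, (0, 0)))  # r2 is farthest from the query
```

Under this assumption, regions far from the current focus of the workload lose their place in the cache even if they were touched recently.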

Simulation and Result The relation has three candidate keys: Unique2 is indexed and clustered, Unique1 is indexed and unclustered, Unique3 is unindexed and unclustered.
Parameter | Value | Description
RelSize | 10000 | Relation size (tuples)
TupleSize | 200 | Size of tuple (bytes)
TuplePerPage | 20 | How many tuples per page
QuerySize | 1-10% | % of relation selected by each query
Skew | 90% | % of queries within a hot region
HotSpot | 10% | Size of the hot region (% of relation)
CacheSize | 250 | Client cache size (KB)
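
A few derived quantities help when reading these parameters. The arithmetic below is a rough sketch: it assumes 1 KB = 1024 bytes and ignores any per-tuple, per-page, or per-region cache overhead, neither of which is stated on the slide.

```python
REL_SIZE = 10_000        # tuples in the relation
TUPLE_SIZE = 200         # bytes per tuple
TUPLES_PER_PAGE = 20
HOT_SPOT_PERCENT = 10    # hot region as % of the relation
CACHE_SIZE_KB = 250
BYTES_PER_KB = 1024      # assumption: the slide does not define KB

page_bytes = TUPLE_SIZE * TUPLES_PER_PAGE                    # bytes per page
relation_pages = REL_SIZE // TUPLES_PER_PAGE                 # pages in the relation
relation_bytes = REL_SIZE * TUPLE_SIZE                       # total relation size
cache_tuples = CACHE_SIZE_KB * BYTES_PER_KB // TUPLE_SIZE    # tuples that fit, no overhead
hot_tuples = REL_SIZE * HOT_SPOT_PERCENT // 100              # tuples in the hot region

print(page_bytes, relation_pages, relation_bytes)
print(cache_tuples, hot_tuples, cache_tuples / REL_SIZE)
```

Under these assumptions the cache holds somewhat more tuples than the hot region contains, so a policy that keeps the hot region resident can serve most of the skewed queries locally.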

Simulation and Result (Contd.) Unique2 (Clustered Index). Performance: Almost the same. Page-based is slightly better. Reason: Page-based overhead is smaller.

Simulation and Result (Contd.) Unique1 (Unclustered Index). Performance: Tuple-based and semantic-based are much better. Reason: Page-based is sensitive to clustering.

Simulation and Result (Contd.) Unique3 (Unindexed and Unclustered). Performance: Semantic-based is better. Reason: Remainder queries enable the client and server to process the query in parallel.

Simulation and Result (Contd.) Semantic locality / Manhattan distance on Unique1. Performance: Manhattan distance is better than LRU. Reason: Cold regions will be replaced faster.

Conclusion and Future Work Conclusion. In a simple model with selection queries, semantic caching provides better performance. Future work. Implementation issues for complex queries, updates, deletions, and insertions: concurrency control, consistency, completeness. A Predicate-based caching scheme for client-server database architecture. (Arthur M. Keller and Julie Basu)