Massively Distributed Database Systems Broadcasting - Data on air Spring 2014 Ki-Joune Li Pusan National University.


1 Massively Distributed Database Systems Broadcasting - Data on air Spring 2014 Ki-Joune Li http://isel.cs.pusan.ac.kr/~lik Pusan National University

2 Why Broadcasting? Simple data access pattern: mostly asymmetric. Scalability: well suited to massively distributed environments. Examples: DMB, TPEG.

3 TPEG – Transport Protocol Experts Group: a protocol for broadcasting traffic information.

4 TPEG – Message format

5 TPEG Service Contents Example

6 TPEG Service

7 Air Update – Map Data Update

8 Basic Idea – Broadcast Disks
Disk                  Broadcast Disk
Access Time           Frequency (Broadcasting Period)
Block                 Packet
Memory Hierarchy      Multiple Broadcasting Disks (paper 1)
File Structure        Message Format (paper 2)
Indexing              Indexing Broadcasting (paper 3)
Query Processing      Query Processing for Broadcasting Data (paper 4)

9 Key papers and documents
S. Acharya et al., "Broadcast Disks: Data Management for Asymmetric Communication Environments", ACM SIGMOD 1995, pp. 199-210
T. Imielinski, S. Viswanathan, and B.R. Badrinath, "Data on Air: Organization and Access", IEEE TKDE, Vol. 9, No. 3, 1997, pp. 353-372
J. Xu et al., "Energy Efficient Indexing for Querying Location Dependent Data in Mobile Broadcasting Environments", ICDE 2003, pp. 239-250
B. Zheng et al., "Spatial Queries in Wireless Broadcast Systems", Wireless Networks, Vol. 10, 2004, pp. 723-736
tisa.org, TPEG, http://www.tisa.org/assets/Uploads/Public/TISA14001 TPEGWhatisitallabout2014.pdf

10 Paper #1 – Broadcast Disks, SIGMOD 1995

11 Key Ideas. Broadcasting as a disk: how to organize the broadcast message. A flat message acts as a single disk; a message whose items repeat at different frequencies acts as multiple disks. Two issues: how to organize the message (server side) and how to maintain the cache (client side).

12 Message Format. Given three data items A, B, and C to broadcast with different access probabilities, the message can use a flat format, a skewed format, or a multiple-disks format.
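The three formats can be compared numerically. The sketch below is a minimal illustration, not the slide's own method: it assumes the layouts ABC (flat), AABC (skewed), and ABAC (multiple disks) for items A, B, and C, assumes a request arrives at a uniformly random instant, and measures delay in broadcast slots until the start of the next slot carrying the requested item. The function name `expected_delay` is hypothetical.

```python
def expected_delay(program, probs):
    """Expected wait, in broadcast slots, for a cyclic broadcast program.

    program: sequence of item names, one per slot (the cycle repeats forever).
    probs:   item -> access probability (assumed to sum to 1).
    """
    n = len(program)
    total = 0.0
    for item, p in probs.items():
        slots = [i for i, x in enumerate(program) if x == item]
        # Gaps between consecutive broadcasts of `item`, wrapping around the cycle.
        # An item broadcast once per cycle has a single gap of length n.
        gaps = [(slots[(k + 1) % len(slots)] - s) % n or n
                for k, s in enumerate(slots)]
        # A uniformly random arrival falls in a gap of length g with
        # probability g/n and then waits g/2 on average.
        total += p * sum(g * g for g in gaps) / (2 * n)
    return total


if __name__ == "__main__":
    skewed_access = {"A": 0.5, "B": 0.25, "C": 0.25}
    for name, layout in [("flat", "ABC"), ("skewed", "AABC"), ("multi-disk", "ABAC")]:
        print(name, expected_delay(layout, skewed_access))
```

Under this model the evenly spaced ABAC layout never does worse than the clustered AABC layout, because for a fixed number of repetitions equal gaps minimize the sum of squared gaps.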

13 Performance Measures

14 Message Formatting Method - Server Algorithm
1. Sort and classify pages by access probability.
2. Determine the relative frequency of each disk (page).
3. Partition each disk into a set of chunks.
4. Define the message format with multiple disks.
Example: 4 pages/cycle. Relative frequencies F(T1)=1, F(T2)=2, F(T3)=4; LCM=4 minor cycles; Length(T3)/LCM=2; Major Cycle=S*LCM.
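The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: pages are already grouped into disks (steps 1 and 2), the hottest disk is given the highest relative frequency, and each disk is split into LCM / frequency chunks that are interleaved one chunk per disk in every minor cycle. The function name `broadcast_program` and the 11-page example are assumptions chosen for illustration, not values taken from the slide.

```python
from functools import reduce
from math import gcd


def broadcast_program(disks, rel_freqs):
    """Interleave chunks of several 'disks' into one major broadcast cycle.

    disks:     list of page lists, one list per disk.
    rel_freqs: relative broadcast frequency of each disk (positive integers).
    """
    # Number of minor cycles = least common multiple of the frequencies.
    max_chunks = reduce(lambda a, b: a * b // gcd(a, b), rel_freqs)

    chunked = []
    for pages, freq in zip(disks, rel_freqs):
        num_chunks = max_chunks // freq        # step 3: chunks per disk
        size = -(-len(pages) // num_chunks)    # ceiling division
        chunked.append([pages[k * size:(k + 1) * size] for k in range(num_chunks)])

    program = []
    for i in range(max_chunks):                # step 4: one minor cycle per i
        for chunks in chunked:
            program.extend(chunks[i % len(chunks)])
    return program


if __name__ == "__main__":
    # Hypothetical example: 1 hot page, 2 warm pages, 8 cold pages.
    disks = [[1], [2, 3], [4, 5, 6, 7, 8, 9, 10, 11]]
    print(broadcast_program(disks, [4, 2, 1]))
```

With these inputs each minor cycle carries 4 pages and the major cycle has 4 minor cycles, so the hot page appears four times per major cycle, each warm page twice, and each cold page once.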

15 Caching Policy at Client. Replacement policy: not LRU. Point 1: caching the hottest pages is problematic. If the server considers a page hot, it broadcasts that page frequently, so caching it is not really necessary. Point 2: the server's policy is to minimize the average delay, which is not the same as serving a client's local demands.

16 Caching Policy at Client. For a given item A, we need to consider its broadcasting frequency (X) and its local access probability (P). Replacement is done in terms of PIX (P/X) instead of LRU: the cached page with the lowest P/X is evicted.
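A minimal sketch of PIX victim selection, assuming the client already knows P and X for every cached page (in practice both would have to be estimated). The function name `pix_victim` and the sample values are hypothetical.

```python
def pix_victim(cache):
    """Return the cached page with the smallest P/X ratio.

    cache: page -> (P, X), where P is the local access probability
           and X is the broadcast frequency of the page.
    """
    return min(cache, key=lambda page: cache[page][0] / cache[page][1])


if __name__ == "__main__":
    cache = {
        "A": (0.010, 0.010),  # accessed often, but also broadcast often
        "B": (0.005, 0.001),  # accessed less, but rarely broadcast
    }
    print(pix_victim(cache))
```

Here page A is evicted even though it is accessed more often than B, because A will come around again soon on the broadcast while a miss on B would be expensive.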

17 Paper #2 – Data on Air: Organization and Access, TKDE 9(3), 1997

18 Key Ideas. Disk access is measured by disk access time. Data on air needs two different measures: latency and energy consumption. Tuning time: the amount of time a client spends listening to the channel → power consumption. Latency: the time elapsed from the moment a client requests data to the completion of the download. Tuning time + latency → data access time.

19 Broadcast data format. Each bucket carries a bucket ID, a bcast ptr, an idx ptr, and a bucket type; a bcast is a sequence of buckets. Without an index, we need a full scan of a bcast. Issue: how to organize the index and where to place it, in order to reduce tuning time and latency.

20 Data Access
1. The client joins the broadcast.
2. Wait until the index arrives.
3. Wait until the data bucket arrives.
4. Read the data.
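The access steps above can be simulated on a toy bcast. This is a sketch under simplifying assumptions that are not in the slides: the whole index fits in one bucket, every bucket takes one time slot, the client dozes between the buckets it needs, and the index is repeated m times in front of equal-sized data segments. The names `Bucket`, `build_bcast`, and `access` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bucket:
    bucket_id: int                    # offset from the start of the bcast
    bcast_ptr: int                    # buckets until the next bcast begins
    idx_ptr: int                      # buckets until the next index bucket (0 if this is one)
    bucket_type: str                  # "index" or "data"
    key: Optional[int] = None         # payload key, data buckets only
    directory: Optional[dict] = None  # key -> bucket_id, index buckets only


def build_bcast(keys, m):
    """Lay out keys in m data segments, each preceded by one index bucket."""
    per_seg = -(-len(keys) // m)  # ceiling division
    layout = []
    for s in range(m):
        layout.append(("index", None))
        layout += [("data", k) for k in keys[s * per_seg:(s + 1) * per_seg]]
    n = len(layout)
    directory = {k: i for i, (t, k) in enumerate(layout) if t == "data"}
    index_pos = [i for i, (t, _) in enumerate(layout) if t == "index"]
    return [
        Bucket(i, n - i, min((p - i) % n for p in index_pos), t, k,
               directory if t == "index" else None)
        for i, (t, k) in enumerate(layout)
    ]


def access(bcast, join_at, key):
    """Return (latency, tuning_time) in buckets for a client joining at slot join_at."""
    n = len(bcast)
    first = bcast[join_at % n]
    tuning = 1                               # step 1: read the current bucket
    index_slot = join_at + first.idx_ptr     # step 2: doze until the index
    if first.bucket_type != "index":
        tuning += 1                          # read the index bucket
    index = bcast[index_slot % n]
    wait = (index.directory[key] - index_slot) % n
    data_slot = index_slot + wait            # step 3: doze until the data bucket
    tuning += 1                              # step 4: read the data bucket
    return data_slot - join_at + 1, tuning


if __name__ == "__main__":
    bcast = build_bcast(list(range(8)), m=2)
    print(access(bcast, join_at=1, key=6))
```

The tuning time stays at two or three buckets regardless of where the client joins, while the latency depends on how far away the next index and the requested data bucket are.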

21 Where to place Index. Options: no index, a single index, or a (1,m) index, in which the index is broadcast m times per bcast. What is the difference? The (1,m) index will probably improve performance.
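The trade-off behind (1,m) indexing can be worked out with a simple averaging argument. The sketch below assumes uniformly random join times and uniformly requested data, an index segment of `index_buckets` buckets repeated m times, and `data_buckets` data buckets split evenly between the copies. On average the client waits half an index-to-index distance for the next index, then half a bcast for its data. The function names are hypothetical.

```python
from math import sqrt


def avg_latency_1m(index_buckets, data_buckets, m):
    """Average latency (in buckets) with the index broadcast m times per bcast."""
    # Half the distance between the starts of two consecutive index segments.
    probe_wait = 0.5 * (index_buckets + data_buckets / m)
    # Half the length of the whole bcast, which now contains m index copies.
    bcast_wait = 0.5 * (m * index_buckets + data_buckets)
    return probe_wait + bcast_wait


def best_m(index_buckets, data_buckets):
    """Real-valued m that minimizes avg_latency_1m (round to a nearby integer)."""
    # Setting d/dm of (m*Index + Data/m) to zero gives m = sqrt(Data/Index).
    return sqrt(data_buckets / index_buckets)


if __name__ == "__main__":
    for m in (1, 5, 10, 20):
        print(m, avg_latency_1m(10, 1000, m))
    print("best m:", best_m(10, 1000))
```

More index copies shorten the wait for the next index but lengthen the bcast, so the latency first falls and then rises as m grows.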

22 How to organize: Full Duplication vs. Relevant Duplication

23 No replication

24 Entire Path Replication

25 Distributed Index

