1 Mobile Data Management Sanjay Kumar Madria Department of Computer Science University of Missouri-Rolla Rolla, MO 65401.

2 Wireless Technologies Wireless local area networks (WaveLAN, Aironet) – Possible transmission errors Cellular wireless – Low bandwidth Packet radio (Metricom) – Low bandwidth Satellites (Inmarsat, Iridium) – Long latency

3 Mobility Constraints CPU Power Bandwidth Delay tolerance Physical size Constraints on peripherals and GUIs (modality of interaction) Locations change dynamically

4 Why Mobile Data Mgmt? Wireless connectivity and use of PDAs and handheld computing devices are on the rise Workforces will carry extracts of corporate databases with them Need central database repositories to serve these work groups and keep them fairly up-to-date and consistent

5 Applications Sales Force Automation - especially in pharmaceutical industry, consumer goods, parts Financial Consulting and Planning Insurance and Claim Processing - Auto, General, and Life Insurance Real Estate/Property Management, Maintenance and Building Contracting

6 Data Processing Scenario One server or many servers (corporate data, inventory, HR, orders/billing) Shared Data Some Local Data per client, mostly subset of global data Need for accurate, up-to-date information Limitations Short connect time per session Infrequent connections Clients may remain dormant for extended periods of time Clients not reachable from servers at all times

7 What is Mobility? A device that moves –Between different geographical locations –Between different networks A person who moves –Between different geographical locations –Between different networks –Between different communication devices –Between different applications

8 Device mobility Plug in laptop at home/work on Ethernet –Occasional long breaks in network access –Wired network access only (connected => well-connected) –Network address changes –Only one type of network interface –May want access to information when no network is available: hoard information locally Cell phone with access to cellular network –Continuous connectivity –Phone # remains the same (high-level network address) –Network performance may vary from place to place

9 Device mobility…. Can we achieve best of both worlds? –Continuous connectivity of wireless access –Performance of better networks when available Laptop moves between Ethernet, WaveLAN and Metricom networks –Wired and wireless network access –Potentially continuous connectivity, but may be breaks in service –Network address changes –Radically different network performance on different networks

10 People mobility Phone available at home or at work –Multiple phone numbers to reach me –Breaks in my reachability when I’m not in Cell phone –Only one number to reach me –Continuously reachable –Sometimes poor quality and expensive connectivity Cell phone, networked PDA, etc. –Multiple numbers/addresses for best quality connection –Continuous reachability –Best choice of address may depend on sender’s device or message content

11 Mobility means changes How does it affect the following? Hardware –Lighter –More robust –Lower power Wireless communication –Can’t tune for stationary access Network protocols –Name changes –Delay changes –Error rate changes

12 Changes…... Fidelity –High fidelity may not be possible Data consistency –Strong consistency no longer possible Location/transparency awareness –Transparency not always desirable Names/addresses –Names of endpoints may change Security –Lighter-weight algorithms –Endpoint authentication harder –Devices more vulnerable

13 Changes…... Performance –Network, CPU all constrained –Delay and delay variability Operating systems –New resources to track and manage: energy Applications –Name changes –Changes in connectivity –Changes in quality of resources People –Introduces new complexities, failures, devices

14 Example changes Addresses –Phone numbers, IP addresses Network performance –Bandwidth, delay, bit error rates, cost, connectivity Network interfaces –PPP, eth0, strip Between applications –Different interfaces over phone & laptop Within applications –Loss of bandwidth triggers change from color to B&W Available resources –Files, printers, displays, power, even routing

15 Most RDBMS vendors support the ISDB scenario - but no design and optimization aids Specialized Environments for ISDB apps: Sybase Remote Server Synchrologic iMOBILE Microsoft SQL server - mobile app support Oracle Lite Xtnd-Connect-Server (Extended Technologies) Scoutware (Riverbed Technologies)

16 Personal Communication System (PCS) Wireless Components

17 Personal Communication System (PCS) Mobile cells

18 Personal Communication System (PCS) Mobile cells The entire coverage area is a group of a number of cells. The size of cell depends upon the power of the base stations.

19 Personal Communication System (PCS) Frequency reuse D = distance between cells using the same frequency R = cell radius N = reuse pattern (the cluster size, which is 7). For hexagonal cells, D = R * sqrt(3N). Thus, for a 7-cell group with cell radius R = 3 miles, the frequency reuse distance D is 3 * sqrt(21), about 13.75 miles.
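
The reuse-distance arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes the standard hexagonal-cell relation D = R * sqrt(3N); the function name `reuse_distance` is illustrative and does not come from the slides.

```python
import math


def reuse_distance(cell_radius: float, cluster_size: int) -> float:
    """Co-channel reuse distance D = R * sqrt(3N), assuming hexagonal cells."""
    if cell_radius <= 0 or cluster_size <= 0:
        raise ValueError("radius and cluster size must be positive")
    return cell_radius * math.sqrt(3 * cluster_size)


# The slide's case: a 7-cell cluster with a 3-mile cell radius.
d = reuse_distance(3.0, 7)
```

A larger cluster size N pushes co-channel cells further apart (less interference) at the cost of fewer channels per cell.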

20 Personal Communication System (PCS) Problems with cellular structure – How to locate a mobile unit in the entire coverage area? Solution: Location management – How to maintain continuous communication between two parties in the presence of mobility? Solution: Handoff – How to maintain continuous communication when a mobile unit moves into another provider's service area? Solution: Roaming

21 Personal Communication System (PCS) Handoff A process, which allows users to remain in touch, even while breaking the connection with one BS and establishing connection with another BS.

22 Personal Communication System (PCS) Handoff To keep the conversation going, the Handoff procedure should be completed while the MS (the bus) is in the overlap region.

23 Personal Communication System (PCS) Handoff types with reference to the network – Intra-system handoff or Inter-BS handoff The new and the old BSs are connected to the same MSC.

24 Personal Communication System (PCS) Intra-system handoff or Inter-BS handoff Steps 1. The MU (MS) momentarily suspends conversation and initiates the handoff procedure by signaling on an idle (currently free) channel in the new BS. Then it resumes the conversation on the old BS.

25 Personal Communication System (PCS) Intra-system handoff or Inter-BS handoff 2. Upon receipt of the signal, the MSC transfers the encryption information to the selected idle channel of the new BS and sets up the new conversation path to the MS through that channel. The switch bridges the new path with the old path and informs the MS to transfer from the old channel to the new channel.

26 Personal Communication System (PCS) Intra-system handoff or Inter-BS handoff 3. After the MS has been transferred to the new BS, it signals the network and resumes conversation using the new channel.

27 Personal Communication System (PCS) Intra-system handoff or Inter-BS handoff 4. Upon the receipt of the handoff completion signal, the network removes the bridge from the path and releases resources associated with the old channel.

28 Personal Communication System (PCS) Handoff types with reference to the network – Intersystem handoff or Inter-MSC handoff The new and the old BSs are connected to different MSCs.

29 Personal Communication System (PCS) Roaming Administrative constraints – Billing. – Subscription agreement. – Call transfer charges. – User profile and database sharing. – Any other policy constraints.

30 Personal Communication System (PCS) Roaming Technical constraints – Bandwidth mismatch. For example, the European 900 MHz band may not be available in other parts of the world. – Service providers must be able to communicate with each other. Needs some standard. – Mobile station constraints.

31 Personal Communication System (PCS) Roaming Two basic operations in roaming management are – Registration (Location update): The process of informing a cell of the presence or arrival of an MU. – Location tracking: The process of locating the desired MU.

32 Personal Communication System (PCS) Registration Two-Tier Scheme HLR: Home Location Register An HLR stores the user profile and the geographical location. VLR: Visitor Location Register A VLR stores the user profile and the current location of an MU that is visiting a cell other than its home cell.

33 Personal Communication System (PCS) Registration Two-Tier Scheme steps. MU1 moves to cell 2.

34 Personal Communication System (PCS) Registration Steps 1. MU1 moves to cell 2. The MSC of cell 2 launches a registration query to its VLR (VLR2). 2. VLR2 sends the HLR a registration message containing the MU's identity (MIN), which can be translated to the HLR address. 3. After registration, the HLR sends an acknowledgment back to VLR2. 4. The HLR sends a deregistration message to VLR1 (of cell 1) to delete the now obsolete record of MU1. VLR1 acknowledges the cancellation.

35 Personal Communication System (PCS) Location tracking Steps 1. VLR of cell 2 is searched for MU1’s profile. 2. If it is not found, then HLR is searched. 3. Once the location of MU1 is found, then the information is sent to the base station of cell 1. 4. Cell 1 establishes the communication.
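
The two-tier registration and location-tracking steps can be sketched in Python as follows. This is a simplified illustration only: the class and function names (`HLR`, `VLR`, `register`, `locate`) are hypothetical, acknowledgments are omitted, and real PCS signaling involves considerably more than these dictionary updates.

```python
from dataclasses import dataclass, field


@dataclass
class VLR:
    cell: int
    visitors: dict = field(default_factory=dict)  # MIN -> cached user profile


@dataclass
class HLR:
    profiles: dict = field(default_factory=dict)  # MIN -> user profile
    location: dict = field(default_factory=dict)  # MIN -> cell of the current VLR


def register(min_id, new_vlr, hlr, vlrs):
    """Registration: record the MU at its new VLR and cancel the old record."""
    old_cell = hlr.location.get(min_id)
    # Steps 1-3: the new VLR stores the profile and the HLR records the new location.
    new_vlr.visitors[min_id] = hlr.profiles.get(min_id, {})
    hlr.location[min_id] = new_vlr.cell
    # Step 4: the HLR asks the previous VLR to delete its obsolete record.
    if old_cell is not None and old_cell != new_vlr.cell:
        vlrs[old_cell].visitors.pop(min_id, None)


def locate(min_id, local_vlr, hlr):
    """Location tracking: try the local VLR first, then fall back to the HLR."""
    if min_id in local_vlr.visitors:
        return local_vlr.cell
    return hlr.location.get(min_id)


vlrs = {1: VLR(1), 2: VLR(2)}
hlr = HLR(profiles={"MU1": {"plan": "basic"}})
register("MU1", vlrs[1], hlr, vlrs)  # MU1 starts in cell 1
register("MU1", vlrs[2], hlr, vlrs)  # MU1 moves to cell 2
```

Checking the local VLR first avoids a round trip to the HLR when caller and callee are in the same cell.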

36 Personal Communication System (PCS) Location tracking Two-Tier Scheme steps location search

37 Personal Communication System (PCS) Location tracking Two-Tier Scheme steps location update

38 Mobile Database Systems (MDS) A Reference Architecture (Client-Server model)

39 Data Processing Issues Processing at the Server Processing at the Client Update Propagation and Installation Consistency Management Less Serious: –Concurrent Transactions –Client Data Recovery

41 Database Issues in Mobile Computing Query and Transaction Processing Replication Management Location Management Limitations –Data Distribution, Mobility Management and Scalability –Role of wireless medium in info distribution –Dealing with short battery life –Dealing with prolonged disconnection Periods –Bandwidth Management

42 Mobility Management and Scalability Location management Changing topologies Handoffs Resource finding Replication Resource sharing

43 Bandwidth Management Clients assumed to have weak and/or unreliable communication capabilities Broadcast--scalable but high latency On-demand--less scalable and requires more powerful client, but better response Client caching allows bandwidth conservation

44 Energy Management Battery life expected to increase by only 20% in the next 10 years Reduce the number of messages sent Doze modes Power aware system software Power aware microprocessors Indexing wireless data to reduce tuning time

45 Large Impact Distributed data management Querying wireless data Handling/representing fast-changing data Scale Tariff-driven query optimization Security User interfaces

46 Query Processing New Issues –Energy Efficient Query Processing – Location Dependent Query Processing Old Issues - New Context –Cost Model

47 Location Management New Issues –Tracking Mobile Users Old Issues - New Context –Managing Update Intensive Location Information –Providing Replication to Reduce Latency for Location Queries –Consistent Maintenance of Location Information

48 Transaction Processing New Issues – Recovery of Mobile Transactions – Lock Management in Mobile Transaction Old Issues - New Context Extended Transaction Models – Partitioning Objects while Maintaining Correctness

49 Dissemination-based Data Delivery Using Broadcast Disks

50 Broadcast Disk Proposes a mechanism called Broadcast Disks to provide database access to mobile clients. Server continuously and repeatedly broadcasts data to a mobile client as it goes by. Multiple disks of different sizes are superimposed on the broadcast medium. Exploits the client storage resources for caching data.

51 Server Broadcast Programs Data server must construct a broadcast “program” to meet the needs of the client population. Server would take the union of required items and broadcast the resulting set cyclically. Single additional layer in a client’s memory hierarchy - flat broadcast. In a flat broadcast the expected wait for an item on the broadcast is the same for all items.

52 Server Broadcast Programs

53 Broadcast Disks are an alternative to flat broadcasts. Broadcast is structured as multiple disks of varying sizes, each spinning at a different rate.

54 Server Broadcast Programs [Figure: three example broadcast programs: flat broadcast, skewed broadcast, multi-disk broadcast.]

55 Server Broadcast Programs For uniform access probabilities a flat disk has the best expected performance. For increasingly skewed access probabilities, non-flat disk programs perform better. Multi-disk programs perform better than the skewed programs.

56 Server Broadcast Programs Generating a multi-disk broadcast The number of disks (num_disks) determines the number of different frequencies with which pages will be broadcast. For each disk, the number of pages and the relative frequency of broadcast (rel_freq(i)) are specified.
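
A multi-disk program can be generated from these two inputs roughly as sketched below. The slides name only num_disks and rel_freq(i); the chunking step (split each disk into max_chunks / rel_freq(i) chunks, where max_chunks is the least common multiple of the relative frequencies, then interleave the chunks) is an assumption based on the commonly described Broadcast Disks procedure, and the function name `multidisk_program` is hypothetical.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)


def multidisk_program(disks, rel_freq):
    """Interleave chunks of each disk so disk i repeats rel_freq[i] times per cycle.

    disks: list of page lists, hottest disk first.
    rel_freq: relative broadcast frequency of each disk (positive integers).
    """
    if len(disks) != len(rel_freq) or not disks:
        raise ValueError("need one relative frequency per disk")
    max_chunks = reduce(lcm, rel_freq)
    chunked = []
    for pages, freq in zip(disks, rel_freq):
        num_chunks = max_chunks // freq
        if len(pages) % num_chunks != 0:
            # This sketch does not pad; the caller must supply evenly divisible disks.
            raise ValueError("disk size must be a multiple of its chunk count")
        size = len(pages) // num_chunks
        chunked.append([pages[k * size:(k + 1) * size] for k in range(num_chunks)])
    program = []
    for i in range(max_chunks):  # one minor cycle per iteration
        for chunks in chunked:   # one chunk from every disk, fastest disk first
            program.extend(chunks[i % len(chunks)])
    return program


# Three disks spinning at relative speeds 4 : 2 : 1.
program = multidisk_program([[1], [2, 3], [4, 5, 6, 7, 8, 9, 10, 11]], [4, 2, 1])
```

In the resulting cycle of 16 slots, page 1 appears four times, pages 2 and 3 twice each, and the pages of the slowest disk once, with fixed spacing between repeats of any page.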

57 Server Broadcast Programs

58 Client Cache Management Improving the broadcast for one probability access distribution will hurt the performance of other clients with different access distributions. Therefore the client machines need to cache pages obtained from the broadcast.

59 Client Cache Management With traditional caching, clients cache the data most likely to be accessed in the future. With Broadcast Disks, traditional caching may lead to poor performance if the server’s broadcast is poorly matched to the client’s access distribution.

60 Client Cache Management In the Broadcast Disk system, clients cache the pages for which the local probability of access is higher than the frequency of broadcast. This leads to the need for cost-based page replacement.

61 Client Cache Management One cost-based page replacement strategy, PIX, replaces the page that has the lowest ratio between its probability of access (P) and its frequency of broadcast (X). PIX requires the following: 1. Perfect knowledge of access probabilities. 2. Comparison of PIX values for all cache-resident pages at cache replacement time.
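
The PIX rule amounts to a single minimum over the cache. The sketch below assumes the access probabilities and broadcast frequencies are available as dictionaries; the function name `pix_victim` is illustrative.

```python
def pix_victim(cache, access_prob, broadcast_freq):
    """Return the cached page with the lowest P/X ratio (the PIX victim)."""
    if not cache:
        raise ValueError("cache is empty")
    return min(cache, key=lambda page: access_prob[page] / broadcast_freq[page])


# Page "a" is accessed more often than "b", but it is also broadcast four times as often.
victim = pix_victim({"a", "b"}, {"a": 0.3, "b": 0.1}, {"a": 4, "b": 1})
```

Here "a" is evicted (0.3 / 4 is lower than 0.1 / 1): a hot page that comes around quickly on the broadcast is cheaper to re-fetch than a cooler page on a slow disk.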

62 Client Cache Management Another page replacement strategy adds the frequency of broadcast to an LRU style policy. This policy is known as LIX. LIX maintains a separate list of cache-resident pages for each logical disk. Each list is ordered based on an approximation of the access probability (L) for each page. A LIX value is computed by dividing L by X, the frequency of broadcast. The page with the lowest LIX value is replaced.
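
A rough sketch of LIX follows. The slides do not say how L is approximated; the running estimate used here (a smoothed inverse of the time since the last access, with a hypothetical `smoothing` constant) and the choice to compare only the least recently used page of each per-disk list are assumptions, as are all class and method names.

```python
from collections import OrderedDict


class LixCache:
    """Simplified LIX: one LRU list per logical disk, evict the lowest L/X."""

    def __init__(self, capacity, disk_of, freq_of_disk, smoothing=0.25):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.disk_of = disk_of            # page -> logical disk
        self.freq_of_disk = freq_of_disk  # logical disk -> broadcast frequency X
        self.smoothing = smoothing        # assumed smoothing constant
        # Per-disk LRU list: page -> (estimate L, last access time), oldest first.
        self.lists = {disk: OrderedDict() for disk in freq_of_disk}

    def _estimate(self, old_l, last_time, now):
        gap = max(now - last_time, 1)
        return self.smoothing / gap + (1 - self.smoothing) * old_l

    def _evict(self, now):
        best = None
        for disk, lru in self.lists.items():
            if not lru:
                continue
            page, (old_l, last) = next(iter(lru.items()))  # LRU page of this disk
            lix = self._estimate(old_l, last, now) / self.freq_of_disk[disk]
            if best is None or lix < best[0]:
                best = (lix, disk, page)
        _, disk, page = best
        del self.lists[disk][page]
        return page

    def access(self, page, now):
        """Return True on a cache hit, False on a miss (the page is then cached)."""
        lru = self.lists[self.disk_of[page]]
        if page in lru:
            old_l, last = lru.pop(page)
            lru[page] = (self._estimate(old_l, last, now), now)  # move to MRU end
            return True
        if sum(len(l) for l in self.lists.values()) >= self.capacity:
            self._evict(now)
        lru[page] = (0.0, now)
        return False


cache = LixCache(2, {"a": 0, "b": 1, "c": 1}, {0: 4, 1: 1})
hits = [cache.access("a", 1), cache.access("a", 2),
        cache.access("b", 3), cache.access("c", 4)]
```

Only one candidate per disk is compared at replacement time, which is what makes LIX cheaper than PIX; in the trace above, page "a" is evicted despite its hit because its disk is broadcast four times as often.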

63 Prefetching An alternative approach to obtaining pages from the broadcast. Goal is to improve the response time of clients that access data from the broadcast. Methods of Prefetching: Tag Team Caching Prefetching Heuristic

64 Prefetching Tag Team Caching - Pages continually replace each other in the cache. For example two pages x and y, being broadcast, the client caches x as it arrives on the broadcast. Client drops x and caches y when y arrives on the broadcast.

65 Prefetching Simple Prefetching Heuristic Performs a calculation for each page that arrives on the broadcast based on the probability of access for the page (P) and the amount of time that will elapse before the page will come around again (T). If the PT value of the page being broadcast is higher than the page in cache with the lowest PT value, then the page in cache is replaced.
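
The PT comparison can be sketched as below, assuming a full, non-empty cache and that P and T are available for every page at the moment a page arrives. The function name `pt_prefetch` and the dictionary inputs are illustrative.

```python
def pt_prefetch(cache, arriving, access_prob, time_to_next):
    """Apply the PT heuristic to a page arriving on the broadcast.

    cache: set of cached pages (assumed full and non-empty), modified in place.
    time_to_next: page -> time until the page next appears on the broadcast.
    Returns the evicted page, or None if the cache is left unchanged.
    """
    def pt(page):
        return access_prob[page] * time_to_next[page]

    if arriving in cache or not cache:
        return None
    victim = min(cache, key=pt)
    if pt(arriving) > pt(victim):
        cache.remove(victim)
        cache.add(arriving)
        return victim
    return None


# Tag-team behaviour: x is about to be rebroadcast, y will not return for a while.
cache = {"x"}
evicted = pt_prefetch(cache, "y", {"x": 0.5, "y": 0.5}, {"x": 1, "y": 10})
```

With equal access probabilities the heuristic keeps whichever page is furthest from its next broadcast, which reproduces the tag-team pattern of two pages replacing each other.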

66 Read/Write Case With dynamic broadcast there are three different changes that have to be handled. 1. Changes to the value of the objects being broadcast. 2. Reorganization of the broadcast. 3. Changes to the contents of the broadcast.

67 Conclusion Broadcast Disks project investigates the use of data broadcast and client storage resources to provide improved performance, scalability and availability in networked applications with asymmetric capabilities.

69 Mobility System configuration is no longer static: the center of activity, the topology, the system load, and locality change dynamically need to search to locate objects various forms of heterogeneity

70 Wireless Communications offer less bandwidth more expensive less reliable Consequently, connectivity is weak and often intermittent

71 Portable Devices light and small to be easily carried around Such considerations, in conjunction with a given cost and level of technology => mobile elements with fewer resources (e.g., memory, screen size and disk capacity) reliance on battery can be more easily accidentally damaged, stolen, or lost, thus, less secure and reliable

72 Mobile units are still characterized as: unreliable and prone to hard failures, i.e., theft, loss or accidental damage, resource-poor relative to static hosts. Examples: InfoPad [16] and ParcTab [28] projects

73 Adaptability A mobile system is presented with resources of varying number and quality: Connectivity conditions vary from total disconnections to full connectivity Available resources are not static either, for instance a "docked" mobile computer may have access to a larger display or memory. The location of mobile elements changes and so does the network configuration and the center of computational activity

74 Example: during disconnection, a mobile host may work autonomously, while during periods of strong connectivity, depend heavily on the fixed network sparing its scarce local resources

75 Disconnections: disconnected operation - autonomous operation of a mobile host during disconnection. Weak connectivity: Operation should be tuned for communication environments characterized by low bandwidth, high latency, and high cost. Mobility: Basic support such as establishing new communication links as well as advanced support such as migrating executing processes and database transactions in progress. Failure recovery: Since mobile elements are prone to hard failures, methods for failure handling and recovery are important.

76 Data Dissemination by Broadcast Pull-based data delivery or on demand data delivery: A client explicitly requests data items from the server. Push-based data delivery: The server repetitively broadcasts data to a client population without a specific request. Clients monitor the broadcast and retrieve the data items they need as they arrive.

77 Applications: Dissemination-based: information feeds such as stock quotes and sport tickets, electronic newsletters, mailing lists, traffic and weather information systems, cable TV on the Internet Commercial Products for example: the AirMedia's Live Internet broadcast network [6] Hughes Network Systems' DirectPC [26] Teletext and Videotex systems [11, 28]

78 The Datacycle project [16] at Bellcore: a database circulates on a high bandwidth network (140 Mbps). Users query the database by filtering information via special massively parallel transceivers. The Boston Community Information System (BCIS) [18]: broadcast news and information over an FM channel to clients with personal computers equipped with radio receivers

79 Hybrid Delivery Push vs Pull Push suitable when information is transmitted to a large number of clients with overlapping interests the server saves several messages the server is prevented from being overwhelmed by client requests. Push is scalable: performance does not depend on the number of clients Pull cannot scale beyond the capacity of the server or the network. In push, access is only sequential; Thus, access latency degrades with the volume of data In pull, clients play a more active role

80 Hybrid Delivery clients are provided with an uplink channel, called backchannel, to send messages to the server. Sharing the channel : if the same channel is used for both broadcast delivery and for the transmission of the replies to on demand requests

81 Use of the backchannel - to provide feedback and profile information to the server - to directly request data Which pages? to avoid overwhelming the server Page i not in cache and the number of items scheduled to appear before i on the broadcast is greater than a threshold parameter
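
The threshold test for using the backchannel can be sketched as follows. The representation of the broadcast as a cyclic list and the rule that a page absent from the broadcast is always pulled are assumptions; the function name `should_request` is hypothetical.

```python
def should_request(page, cache, schedule, position, threshold):
    """Decide whether to pull `page` over the backchannel.

    schedule: cyclic list of pages in broadcast order.
    position: index of the slot currently being broadcast.
    threshold: maximum number of items the client is willing to wait through.
    """
    if page in cache:
        return False
    n = len(schedule)
    for items_before in range(n):
        if schedule[(position + items_before) % n] == page:
            return items_before > threshold
    return True  # assumption: a page that is not broadcast at all must be pulled


schedule = list("abcdef")
far = should_request("e", set(), schedule, 0, 3)   # 4 items precede "e"
near = should_request("b", set(), schedule, 0, 3)  # 1 item precedes "b"
```

Raising the threshold shifts load from the backchannel to the broadcast, which protects the server at the cost of longer client waits.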

82 Selective Broadcast Broadcast an appropriately selected subset of items and provide the rest on demand In [25], the broadcast is used as an air-cache for storing frequently requested data. The broadcast content continuously adjusts to match the hot-spot of the database. The hot-spot is calculated by observing broadcast misses indicated by explicit requests for data not on the broadcast. In [19]: the database is partitioned into: a "publication group" that is broadcast and an "on demand" group. The criterion for partitioning is to minimize the backchannel requests while constraining the response time below a predefined upper limit.

83 On Demand Broadcast the server chooses the next item to broadcast on every broadcast tick based on the requests for data it has received Various strategies [28]: broadcast the pages in the order they are requested (FCFS), or the page with the maximum number of pending requests. A parameterized algorithm for large-scale data broadcast based only on the current queue of pending requests [7
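
The two simple strategies named on this slide can be sketched as below. The pending-request queue is modeled as a plain list of page identifiers in arrival order; all function names are illustrative, and ties in the most-requested strategy are broken in favor of the earliest request as an assumption.

```python
from collections import Counter


def next_fcfs(pending):
    """First come, first served: broadcast the oldest requested page."""
    return pending[0] if pending else None


def next_most_requested(pending):
    """Broadcast the page with the most pending requests (earliest request wins ties)."""
    if not pending:
        return None
    counts = Counter(pending)
    return max(counts, key=lambda page: (counts[page], -pending.index(page)))


def broadcast_tick(pending, choose):
    """Pick a page with `choose` and drop every pending request it satisfies."""
    page = choose(pending)
    if page is None:
        return None, pending
    return page, [p for p in pending if p != page]


pending = ["a", "b", "b", "c"]
page, remaining = broadcast_tick(pending, next_most_requested)
```

Because a single broadcast answers every pending request for that page, the most-requested strategy clears more of the queue per tick, while FCFS bounds how long any one request can be overtaken.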

84 Organization of Broadcast Data Access time: average time elapsed from the moment a client expresses its interest to an item to the receipt of the item on the broadcast channel Tuning time: the amount of time spent listening to the broadcast channel Organize the broadcast to minimize access and tuning time

85 Efficient Concurrency Control for Broadcast Environments Jayavel Shanmugasundaram Arvind Nithrakashyap Rajendran Sivasankaran Krithi Ramamritham

86 Outline Broadcast environments Inapplicability of existing techniques Suitable correctness criterion Mechanisms Performance Results Conclusion

87 Why Broadcast Data? Millions of clients that need to see current and consistent data – Server handling all client requests ==> not scalable More scalable solution: – Periodically broadcast all data items – Clients read items off broadcast – Datacycle [Herman], Broadcast Disks [Acharya]

88 Example: eAuctions Numerous potential clients Only a small fraction contact server to offer bids Need access to current and consistent data

89 Broadcast Environment Characteristics Large number of clients Mobile clients with scarce power resource ==> Low client to server bandwidth Plentiful server to client bandwidth ==> Asymmetric communication medium

90 Mutually Consistent Reads [Figure: a client transaction, from TrBegin to TrEnd, performs R(x), R(y), R(z) over time measured in broadcast cycles.] Are x, y, and z mutually consistent?

92 Why Not Traditional CC Techniques? Approach 1: Dynamic conflict resolution –Excessive communication –e.g., locking: acquiring read locks by client transactions server swamped with lock requests client uses precious uplink bandwidth Approach 2: Avoid potential serializability conflicts

93 Schedules [Figure: timeline of a server schedule with operations labeled W2(x), C2, W4(y), C4 and Client A reads labeled R1(y), R1(x).]

94 Serialization Orders [Figure: the same server schedule (W2(x), C2, W4(y), C4) and Client A reads (R1(y), R1(x)), annotated with serialization-order labels T2, T4 and T4, T1, T2.]

95 Serialization Orders [Figure: the same server schedule (W2(x), C2, W4(y), C4) with Client A reads R1(y), R1(x) and Client B reads R3(x), R3(y), annotated with serialization-order labels T2, T4 and T4, T1, T2.] Even if Client B does not exist, Client A will have to abort transaction T1

96 Serializability? Serializability - a global property All read only transactions: –Required to see same serial order of update transactions, even if executing at different clients –Required to be serializable w.r.t. all update transactions, even if updates do not affect values read Inappropriate for broadcast environments

98 Broadcast Data Requirements Mutual consistency –server maintains mutually consistent data –clients read mutually consistent data Currency –clients see data that is current

99 A Sufficient Criterion All update transactions are serializable. Each read-only transaction is serializable with respect to the update transactions it (directly or indirectly) reads from. [Figure: server schedule (W2(x), C2, W4(y), C4) with Client A reads R1(y), R1(x) and Client B reads R3(x), R3(y), annotated with the labels T2 T4, T4 T1 and T2 T3.]

100 A Sufficient Criterion All update transactions are serializable. Each read-only transaction is serializable with respect to the update transactions it (directly or indirectly) reads from. external consistency [Weihl 87] update consistency [Bober and Carey 92]

101 Implications Decoupled correctness criterion –Clients need not communicate with server or other clients Weaker correctness criterion –Reduces unnecessary aborts

103 The Algorithm F-Matrix Server functionality Client functionality Nature of Control Information –broadcast by the server with the data –helps clients determine consistency of reads Client read-only validation protocol

104 Server Functionality Ensures conflict serializability of update transactions Broadcasts during each cycle –Committed values of data items at start of cycle –Control matrix Incrementally maintains control matrix as updates occur

105 Client Functionality Read: consult control information transmitted during that cycle to determine whether the read operation can proceed; if the read operation cannot proceed, the transaction is aborted. Write: performed on a local copy of the data item in the client; no checks are made. Commit: update tr: (write set + values) along with (read set + cycle numbers) sent to server; read tr: commit succeeds.

106 Control Matrix: Intuition [Figure: two schedules. In both, the server operations are labeled W2(x), C2, W4(y), C4 and the client reads are R1(x), R1(y); in the second schedule T4 also performs R4(x).] T is currently reading y. T had read x. Did any tr that affected y change x after T read it?

107 Control Matrix Objects: n objects all initialized at cycle 0 C: n x n control matrix C(x,y) = max( cycle in which T commits ), where T affects the latest committed value of y and also writes to x

108 Precond. for Consistent Reads T previously read x from broadcast cycle b RT = set of (x,b) pairs C is the matrix at the beginning of current cycle read y iff read-condition(y) holds: forall (x,b) in RT, C(x,y) < b i.e., no transaction that affected y wrote x after T read x
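
The control matrix and the read condition can be sketched together in Python. The `read_condition` function follows the slide directly; the `commit` maintenance rule is an assumption derived from the definition of C(x,y) (a committing transaction affects each object it writes, and so does every transaction that affected something it read) and is not spelled out in the slides. All names are illustrative.

```python
class ControlMatrix:
    """n x n matrix: c[x][y] = latest commit cycle of a transaction that
    affects the latest committed value of y and also writes x (0 if none)."""

    def __init__(self, objects):
        self.objects = list(objects)
        self.c = {x: {y: 0 for y in self.objects} for x in self.objects}

    def commit(self, read_set, write_set, cycle):
        """Assumed maintenance rule applied when an update transaction commits."""
        old = {x: dict(row) for x, row in self.c.items()}
        for y in write_set:
            for x in self.objects:
                if x in write_set:
                    self.c[x][y] = cycle
                else:
                    # Inherit from whatever affected the values the transaction read.
                    self.c[x][y] = max((old[x][k] for k in read_set), default=0)


def read_condition(matrix, reads, y):
    """reads: set of (x, b) pairs, meaning x was read from broadcast cycle b."""
    return all(matrix.c[x][y] < b for x, b in reads)


# Second schedule of the intuition slide: T4 reads x (written by T2), then writes y.
m = ControlMatrix(["x", "y"])
m.commit(read_set=set(), write_set={"x"}, cycle=2)   # T2: W2(x), C2
m.commit(read_set={"x"}, write_set={"y"}, cycle=3)   # T4: R4(x), W4(y), C4
can_read_y = read_condition(m, {("x", 1)}, "y")      # T1 read x in cycle 1
```

In this trace `can_read_y` is False, so T1 must abort; if T4 had written y without reading x, C(x,y) would stay 0 and the same read would be allowed.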

109 Smaller Control Matrix Partition objects into groups Control matrix: n x numgroups SC(x,s) = max over y in s of C(x,y) Updating an object in s = update to any object in s Fewer entries to transmit compared to C [Figure: objects partitioned into group 1 and group 2.] read-condition(y), where s is the group of y: forall (x,b) in RT, SC(x,s) < b T is currently reading y; T had read x: no tr that affected any object in y's group changed x after T read it
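
Collapsing the full matrix into the grouped one is a column-wise maximum, as sketched below. The nested-dictionary representation and the names `group_matrix` and `grouped_read_condition` are assumptions made for illustration.

```python
def group_matrix(c, groups):
    """Build SC from C: sc[x][s] = max of c[x][y] over the objects y in group s."""
    return {
        x: {s: max(c[x][y] for y in members) for s, members in groups.items()}
        for x in c
    }


def grouped_read_condition(sc, group_of, reads, y):
    """Read condition using the smaller matrix; s is the group containing y."""
    s = group_of[y]
    return all(sc[x][s] < b for x, b in reads)


# Full matrix for three objects; z was last written by a transaction that also wrote x.
c = {
    "x": {"x": 2, "y": 0, "z": 2},
    "y": {"x": 0, "y": 3, "z": 0},
    "z": {"x": 0, "y": 0, "z": 2},
}
groups = {"g1": ["x"], "g2": ["y", "z"]}
group_of = {"x": "g1", "y": "g2", "z": "g2"}
sc = group_matrix(c, groups)
allowed = grouped_read_condition(sc, group_of, {("x", 1)}, "y")
```

With the full matrix, a transaction that read x in cycle 1 could still read y, since C(x,y) = 0; with grouping the read is rejected because z shares y's group. This is the price paid for transmitting fewer entries.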

110 Group Size Increasing size of group => more unnecessary conflicts Reducing size of group => increased control information overhead. – n groups => F-Matrix – one group => Datacycle achieves serializability Read-condition for Datacycle : no previously read object has been updated

111 R-Matrix To achieve Mutual Consistency Read condition: objects previously read have not been updated by other transactions or the object being read has not been updated since the beginning of the transaction

113 Effect of Client Tr. Length [Figure: plot comparing Datacycle, R-Matrix, F-Matrix and F-Matrix-ideal as client transaction length varies.] F-Matrix has the best performance and scales very well

114 Summary of Results F-Matrix > R-Matrix > Datacycle –Weaker abort condition leads to better response times F-Matrix is highly scalable with respect to –Client/Server transaction length –Server transaction rate –Number of Objects/Size of Objects R-Matrix better only at very small object sizes In many cases F-Matrix is very close to F-Matrix-ideal

115 Conclusion Need for mutual consistency + currency Efficient mechanism - F-matrix R-matrix is a low overhead alternative F-matrix delivers! In Paper: Caching to exploit weak currency requirements

116 Related Work Broadcast Environments –Datacycle [Herman] : supports serializability –Broadcast Disks [Acharya] : consistency not considered –ProMotion [Chrysanthis]: assumes caches

117 Examples Business data, e.g., Vitria, Tibco Election coverage data Stock related data Traffic information Electronic auctions Data Server
