Data Query in Sensor Networks
Carmelissa Valera, Jason Torre
Query Processing Constraints
- computing power
- communication costs
- power consumption
- uncertainty in results
Query Language
- As in SQL, queries in Cougar and TinyDB are made up of SELECT-FROM-WHERE-GROUPBY-HAVING blocks
- Explicit support for windowing and subqueries
- TinyDB supports sampling
Simple Aggregate Query
SELECT AVG(temp) FROM sensors WHERE floor = 4 SAMPLE PERIOD 5S
“Report the average temperature on the fourth floor once every 5 seconds.”
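
To make the query semantics concrete, the sketch below evaluates one sample period of this query over in-memory readings. The `Reading` record and the sample values are hypothetical, and a real TinyDB or Cougar deployment would compute the aggregate inside the network rather than at one central point.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional


@dataclass
class Reading:
    """One sample from one sensor (hypothetical record layout)."""
    node_id: int
    floor: int
    temp: float


def avg_temp_on_floor(readings: Iterable[Reading], floor: int) -> Optional[float]:
    """Evaluate SELECT AVG(temp) FROM sensors WHERE floor = <floor> for one epoch."""
    temps = [r.temp for r in readings if r.floor == floor]  # WHERE clause
    return mean(temps) if temps else None                   # AVG aggregate


# One epoch of made-up readings; SAMPLE PERIOD would repeat this every period.
epoch = [
    Reading(1, 4, 20.0),
    Reading(2, 4, 22.0),
    Reading(3, 3, 30.0),
]
print(avg_temp_on_floor(epoch, 4))
```

The reading from floor 3 is filtered out before averaging, and a floor with no sensors yields `None` instead of raising an error.
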
Simple Aggregate Query
[Figure: query processing for this aggregate query]
Architecture
[Figure: architecture of a sensor query processor. Source: “Query Processing in Sensor Networks,” Gehrke and Madden, 2004]
Query Processing Techniques
- One way to acquire query answers is FullFlood, in which every network node is contacted
- The query originator node broadcasts the query to its neighbors
- In turn, they broadcast to their neighbors until all nodes have received the query
- A node may receive a query several times, but only the first copy is processed
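
The flooding rule above can be sketched as a breadth-first traversal with duplicate suppression. The graph and node names are made up for illustration, and radio effects such as collisions and lost messages are ignored.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]  # node -> neighbors within radio range


def full_flood(graph: Graph, originator: str) -> Tuple[Set[str], int]:
    """Flood a query from `originator`; return (nodes reached, broadcasts sent)."""
    processed = {originator}       # nodes that have already handled the query
    pending = deque([originator])  # nodes that still have to rebroadcast
    broadcasts = 0
    while pending:
        node = pending.popleft()
        broadcasts += 1            # one broadcast reaches every neighbor
        for neighbor in graph[node]:
            if neighbor not in processed:  # later copies are dropped
                processed.add(neighbor)
                pending.append(neighbor)
    return processed, broadcasts


network = {
    "O": ["A", "B"],
    "A": ["O", "B", "C"],
    "B": ["O", "A", "C"],
    "C": ["A", "B"],
}
reached, sent = full_flood(network, "O")
print(sorted(reached), sent)
```

Every node broadcasts exactly once, so the number of broadcasts grows with the size of the whole network, not with the size of the region the query is about.
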
STWin
- STWin: a spatio-temporal framework with two phases
- Phase 1: locating a path from the query originator node to a sensor inside the query's spatial window
- Phase 2: gathering the query answer from the relevant nodes and passing it back to the originator
WinFlood
- Query processing algorithm used in the STWin framework
- GreedyDF is Phase 1 of the framework; WinFlood is Phase 2
- Involves constrained parallel flooding
- If a node is located in the query's spatial window, it broadcasts the query to its neighbors
- Based on breadth-first search
WinFlood
- Constrained flooding begins at the query coordinator node
- It stops when the query reaches nodes outside the spatial window
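
A minimal sketch of the constrained flooding described above, assuming a rectangular spatial window and known node coordinates. The helper names, the topology, and the coordinates are illustrative and are not taken from the STWin papers.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Window = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def in_window(p: Point, w: Window) -> bool:
    """True if point p lies inside the rectangular window w."""
    return w[0] <= p[0] <= w[2] and w[1] <= p[1] <= w[3]


def win_flood(neighbors: Dict[str, List[str]], position: Dict[str, Point],
              window: Window, coordinator: str) -> Tuple[Set[str], Set[str], int]:
    """Return (relevant nodes, exterior nodes that heard the query, broadcasts)."""
    seen = {coordinator}
    relevant: Set[str] = set()
    exterior: Set[str] = set()
    queue = deque([coordinator])
    broadcasts = 0
    while queue:
        node = queue.popleft()
        if not in_window(position[node], window):
            exterior.add(node)   # outside the window: the flood stops here
            continue
        relevant.add(node)
        broadcasts += 1          # nodes inside the window rebroadcast once
        for nb in neighbors[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return relevant, exterior, broadcasts


neighbors = {"C": ["A", "B"], "A": ["C", "X"], "B": ["C"], "X": ["A", "Y"], "Y": ["X"]}
position = {"C": (1, 1), "A": (2, 1), "B": (1, 2), "X": (5, 1), "Y": (6, 1)}
relevant, exterior, broadcasts = win_flood(neighbors, position, (0, 0, 3, 3), "C")
print(sorted(relevant), sorted(exterior), broadcasts)
```

Node X hears the query but does not forward it, so Y is never contacted.
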
WinFlood
[Figure: WinFlood message flow diagram]
WinFlood
Estimating energy costs. Cost is divided into three components:
1. the cost to forward the query to the relevant nodes
2. the cost to return their answers to the coordinator C
3. the cost to send the answers from C to the query originator O
WinFlood
Energy cost formula
- Each relevant node broadcasts the query once and receives the query from all its neighbors
- For small query windows, the rate of increase of the cost is small
- For large query windows, the cost increases substantially
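
The slide's formula is not reproduced here. As a rough illustration of the first cost component only (forwarding the query), the sketch below charges each relevant node one transmission plus one reception per relevant neighbor. The constants `E_TX` and `E_RX` and the topology are assumptions, not values from the source.

```python
from typing import Dict, List, Set

E_TX = 1.0  # assumed energy to transmit one message (arbitrary units)
E_RX = 0.5  # assumed energy to receive one message (arbitrary units)


def forward_cost(neighbors: Dict[str, List[str]], relevant: Set[str]) -> float:
    """Each relevant node broadcasts once and hears each relevant neighbor once."""
    cost = 0.0
    for node in relevant:
        heard = sum(1 for nb in neighbors[node] if nb in relevant)
        cost += E_TX + heard * E_RX
    return cost


neighbors = {"C": ["A", "B"], "A": ["C", "X"], "B": ["C"], "X": ["A"]}
print(forward_cost(neighbors, {"C", "A", "B"}))
```

Under this model the forwarding cost grows with both the number of relevant nodes and their density, which fits the observation that large windows cost substantially more.
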
WinFlood
- Query answers are returned over the shortest path to the coordinator
- Coordinator C sends the collected query answers to the originator O over the path discovered by the GreedyDF algorithm in Phase 1
- The overall cost of the WinFlood algorithm is largely determined by the cost to return the query answers
WinDepth
- Query processing algorithm used in the STWin framework
- An alternative to WinFlood for Phase 2
- Based on a depth-first search policy
- Each node forwards the query only to neighbors located within the query's spatial window
WinDepth
1. A node receives the query and adds its node ID to the query header
2. The node selects a neighbor within the spatial window that has not received the query yet, as determined by the query header
3. The neighbor returns a partial query answer
4. The node checks whether any other neighbors are within the spatial window and have not received the query yet
5. If so, the node forwards the query to such a neighbor and waits for its answer
6. This repeats until all of the node's neighbors within the window have answered the query
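
The steps above can be sketched as a recursive depth-first traversal in which the query header is a shared list of visited node IDs. The rectangular window, the readings, and the message counting (one forward plus one reply per contacted neighbor) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Window = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def in_window(p: Point, w: Window) -> bool:
    """True if point p lies inside the rectangular window w."""
    return w[0] <= p[0] <= w[2] and w[1] <= p[1] <= w[3]


def win_depth(neighbors: Dict[str, List[str]], position: Dict[str, Point],
              window: Window, readings: Dict[str, float], coordinator: str):
    """Return (answers by node, query header, one-to-one messages sent)."""
    header: List[str] = []  # IDs of nodes that have already received the query
    messages = 0

    def visit(node: str) -> Dict[str, float]:
        nonlocal messages
        header.append(node)                  # step 1: add own ID to the header
        partial = {node: readings[node]}
        for nb in neighbors[node]:           # steps 2 to 6
            if nb in header or not in_window(position[nb], window):
                continue
            messages += 2                    # query forwarded, answer returned
            partial.update(visit(nb))        # wait for the partial answer
        return partial

    return visit(coordinator), header, messages


neighbors = {"C": ["A", "B"], "A": ["C", "X"], "B": ["C"], "X": ["A"]}
position = {"C": (1, 1), "A": (2, 1), "B": (1, 2), "X": (5, 1)}
readings = {"C": 20.0, "A": 21.0, "B": 19.5, "X": 25.0}
answers, header, messages = win_depth(neighbors, position, (0, 0, 3, 3), readings, "C")
print(answers, header, messages)
```

Unlike the flooding variant, node X outside the window never receives the query at all.
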
WinDepth
[Figure: WinDepth message flow diagram]
WinFlood vs. WinDepth
- WinFlood uses broadcast messages to forward the query
- WinDepth nodes send individual messages to neighbors located within the spatial window
- Since the cost of one broadcast message is generally less than the cost of several one-to-one messages, it may save energy to use broadcasting and stop the forwarding at the exterior nodes
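
The broadcast versus one-to-one trade-off can be illustrated with a simple per-message energy model. The constants `E_TX` and `E_RX` are assumed values in arbitrary units, not measurements from the source.

```python
E_TX = 1.0  # assumed energy to transmit one message (arbitrary units)
E_RX = 0.5  # assumed energy to receive one message (arbitrary units)


def broadcast_cost(k: int) -> float:
    """One transmission that is heard by all k neighbors in radio range."""
    return E_TX + k * E_RX


def unicast_cost(k: int) -> float:
    """k separate transmissions, each received by a single neighbor."""
    return k * (E_TX + E_RX)


for k in (1, 2, 4):
    print(k, broadcast_cost(k), unicast_cost(k))
```

In this model the two costs are equal for a single neighbor and diverge as `k` grows. A broadcast is also heard by neighbors outside the window, which is why stopping the forwarding at exterior nodes matters.
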
WinFlood vs. WinDepth
- WinFlood
  - is faster than WinDepth for the same number of contacted nodes
  - is likely to be more cost-efficient within a small window due to the use of broadcast messages
- WinDepth
  - contacts fewer nodes, making more nodes available to answer other queries
  - results in less network congestion
  - gives better overall query response time if multiple queries are being processed simultaneously
WinDepth
Estimating energy costs. Cost is divided into three components:
1. the cost to forward the query to the relevant nodes
2. the cost to return their answers to the coordinator C
3. the cost to send the answer from C to the query originator O
WinDepth
Energy cost formula
- To estimate the cost, we assume that the algorithm can route the query and receive the answers along a single path connecting all relevant nodes
WinDepth
- Each relevant node receives and forwards the query twice
- Relevant nodes also help return answers from nodes farther from the coordinator
- Performance depends heavily on the layout of the network formed by the relevant nodes
SPIN
- Sensor Protocols for Information via Negotiation
- Uses information descriptors (meta-data) for negotiation
- Uses a resource manager to regulate energy cost
- Data is “advertised” to other nodes
- If a node is interested, it sends a request message
SPIN Messages
The SPIN protocol uses three primary messages for communication:
1. ADV: when a SPIN node has new data, it sends this message, containing meta-data, to its neighbors
2. REQ: when a SPIN node wishes to receive data, it sends this message
3. DATA: the actual data messages, with a meta-data header
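
The three-message handshake can be sketched with a small in-memory simulation. The `SpinNode` class, the shared log, and the rule that a node is "interested" whenever it does not yet hold the advertised data are simplifying assumptions. The resource manager is left out.

```python
from typing import Dict, List, Tuple

Log = List[Tuple[str, str, str, str]]  # (message type, sender, receiver, meta-data)


class SpinNode:
    """Hypothetical SPIN node that negotiates with meta-data descriptors."""

    def __init__(self, name: str, log: Log) -> None:
        self.name = name
        self.log = log
        self.neighbors: List["SpinNode"] = []
        self.store: Dict[str, bytes] = {}  # meta-data descriptor -> data

    def new_data(self, meta: str, data: bytes) -> None:
        """Store the data, then advertise its descriptor to all neighbors."""
        self.store[meta] = data
        for nb in self.neighbors:
            self.log.append(("ADV", self.name, nb.name, meta))
            nb.on_adv(self, meta)

    def on_adv(self, sender: "SpinNode", meta: str) -> None:
        if meta in self.store:
            return  # already holds this data: no request is sent
        self.log.append(("REQ", self.name, sender.name, meta))
        sender.on_req(self, meta)

    def on_req(self, requester: "SpinNode", meta: str) -> None:
        self.log.append(("DATA", self.name, requester.name, meta))
        requester.on_data(meta, self.store[meta])

    def on_data(self, meta: str, data: bytes) -> None:
        self.new_data(meta, data)  # store it, then advertise it onward


log: Log = []
a, b, c = SpinNode("a", log), SpinNode("b", log), SpinNode("c", log)
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]  # line topology a - b - c
a.new_data("temp/42", b"21.5")
for entry in log:
    print(entry)
```

The payload crosses each link only once. Advertisements that reach a node already holding the data cost one small ADV message and nothing more.
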
SPIN Process
Directed Diffusion
- Data is controlled through a naming scheme of attribute-value pairs
- The types of pairs can be any useful descriptors
- Uses an on-demand type of query
- A sink broadcasts an interest to all neighbors
- An interest entry also contains several gradients
- A gradient is a reply link to a neighbor from which the interest was received
Directed Diffusion
- Each neighbor node compares the interest against its own data to see whether it holds the requested data
- This process continues until the data is found
- The gradients are then used to establish a path from source to sink, with the aim of finding the least-cost path
- Another interest request is sent along the least-cost path to reinforce it
Directed Diffusion
- Once a path is established, the data can be sent from the source to the sink
- Error recovery is possible in this protocol as well
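
A heavily simplified sketch of interest propagation and gradient setup. Each node here keeps only the first gradient it learns, which stands in for the reinforced path, whereas the protocol keeps several gradients per interest. Hop count is used as the cost, and the topology is made up.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def diffuse_interest(graph: Graph, sink: str) -> Dict[str, Optional[str]]:
    """Flood an interest; gradient[n] is the neighbor n first heard it from."""
    gradient: Dict[str, Optional[str]] = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in gradient:
                gradient[nb] = node  # reply link pointing back toward the sink
                queue.append(nb)
    return gradient


def path_to_sink(gradient: Dict[str, Optional[str]], source: str) -> List[str]:
    """Follow the gradients from the data source back to the sink."""
    path = [source]
    while gradient[path[-1]] is not None:
        path.append(gradient[path[-1]])
    return path


# Two routes from source T to sink S: T-A-S (2 hops) and T-D-B-S (3 hops).
graph = {
    "S": ["A", "B"],
    "A": ["S", "T"],
    "B": ["S", "D"],
    "D": ["B", "T"],
    "T": ["A", "D"],
}
gradient = diffuse_interest(graph, "S")
print(path_to_sink(gradient, "T"))
```

Because the interest spreads breadth-first, the first gradient a node learns lies on a fewest-hop route, so data from T follows the shorter path through A.
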
ACQUIRE
- ACtive QUery forwarding In sensoR nEtworks
- Views the sensor network as a distributed database
- Very good at handling complex queries, which often contain sub-queries
- Created to handle one-shot complex queries that require responses from multiple nodes
- Since these are one-time queries, it does not make sense to use flooding techniques
ACQUIRE
The querying mechanism works as follows:
1. The query is forwarded by the sink
2. Each node that receives it tries to answer from pre-cached data and then forwards the query to another node
3. If the cached data is out of date, the node queries its neighbors within d hops
4. Once the query is completely resolved, it is routed back to the sink along the shortest path
ACQUIRE
- The parameter d is dynamically adjusted to allow for the most energy-efficient routing
- If the value is set too large or too small, the network operates inefficiently
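
The effect of the look-ahead d can be shown with a toy model in which the query is a set of sub-queries and each active node resolves what it can from the nodes within d hops. Cache freshness is ignored, and the rule for choosing the next active node (the farthest node in the neighborhood that has not been active yet) is an assumption made only to keep the example deterministic.

```python
from collections import deque
from typing import Dict, List, Tuple

Graph = Dict[int, List[int]]


def within_hops(graph: Graph, start: int, d: int) -> Dict[int, int]:
    """Hop distance of every node at most d hops away from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == d:
            continue  # do not look past d hops
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def acquire(graph: Graph, data: Dict[int, Dict[str, float]], sink: int,
            subqueries: List[str], d: int) -> Tuple[Dict[str, float], List[int]]:
    """Return (resolved answers, list of nodes that acted as active node)."""
    unresolved = set(subqueries)
    answers: Dict[str, float] = {}
    visited: List[int] = []
    active = sink
    while unresolved and active is not None:
        visited.append(active)
        hood = within_hops(graph, active, d)
        for node in hood:
            found = {k: v for k, v in data.get(node, {}).items() if k in unresolved}
            answers.update(found)
            unresolved.difference_update(found)
        candidates = [n for n in hood if n not in visited]
        active = max(candidates, key=lambda n: (hood[n], n)) if candidates else None
    return answers, visited


line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
data = {1: {"temp": 21.0}, 4: {"humidity": 40.0}}
print(acquire(line, data, 0, ["temp", "humidity"], d=2))
print(acquire(line, data, 0, ["temp", "humidity"], d=1))
```

On this six-node line, a look-ahead of 2 resolves the query with two active nodes, while a look-ahead of 1 needs four. A larger d means fewer forwarding steps but a more expensive neighborhood lookup at each one.
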
COUGAR
- Data-centric protocol that sees the network as a huge distributed database
- Uses declarative queries to abstract query processing from the network layer
- This is done through a new query layer between the network and application layers
- The sensor database is set up so that a leader node is selected
- The leader node handles all the reading and calculations
- The leader node also transmits data to the sink
COUGAR
The protocol works like this:
1. The sink is responsible for generating a query plan
2. The query plan is sent to the relevant nodes
3. The protocol gives all nodes computation ability, so any one of them could be the leader
4. The query plan establishes the data that needs to be received and how it should be aggregated by the leader
5. The leader queries the network for the data
6. Data is partially aggregated at each hop until it is received at the leader
7. The leader determines what to do with the data and either initiates another query or sends the data to the sink
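
Partial aggregation at each hop can be sketched for an AVG query by passing (sum, count) pairs up a routing tree toward the leader. The tree, the node names, and the readings are hypothetical, and the sketch assumes the leader contributes a reading of its own.

```python
from typing import Dict, List, Tuple


def partial_aggregate(children: Dict[str, List[str]], readings: Dict[str, float],
                      node: str) -> Tuple[float, int]:
    """Merge this node's reading with the partial (sum, count) of each child."""
    total, count = readings[node], 1
    for child in children.get(node, []):
        child_total, child_count = partial_aggregate(children, readings, child)
        total += child_total
        count += child_count
    return total, count


# Routing tree rooted at the leader: n3 reports through n1.
children = {"leader": ["n1", "n2"], "n1": ["n3"]}
readings = {"leader": 20.0, "n1": 22.0, "n2": 18.0, "n3": 24.0}

total, count = partial_aggregate(children, readings, "leader")
print(total / count)  # the leader finishes the AVG before replying to the sink
```

Each link carries one (sum, count) pair regardless of how many readings lie below it, which is where the communication savings come from.
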
SAFE
- Sinks Accessing data From Environments
- When a node needs data, it sends a query to its closest one-hop neighbor
- The information sent includes location, data type, update rate, and duration
- Each node has two tables:
  - a recent query table
  - a data management table
SAFE
- Each node uses a function to handle an arriving query
SAFE
- When the Pathsetup message is delivered to the sink, all intermediate nodes follow another function
Advantages/Disadvantages: SPIN
Advantages
- Nodes only need to know their one-hop neighbors
- Dissipates 3.5 times less energy than flooding protocols
- Meta-data greatly reduces redundant data
Disadvantages
- The advertising system does not guarantee delivery of data
Advantages/Disadvantages: Directed Diffusion
Advantages
- Only queries if data is available
- No need for an addressing mechanism
- Every node can do aggregation and caching
- Queries are on demand, so only limited power is expended
Disadvantages
- Does not work for applications that require data to be sent constantly
- Since naming schemes are unique to each application, they have to be defined before deployment
- Matching names requires additional overhead in the nodes
Advantages/Disadvantages: ACQUIRE
Advantages
- Very good at complex queries
- The adjustable d parameter allows more versatile query length
- Changing d also allows energy-efficient routing
- Next nodes are selected to maximize query success
Disadvantages
- Transmission costs are not taken into account
- Significant overhead from having to constantly adjust d
Advantages/Disadvantages: COUGAR
Advantages
- A dedicated layer for handling queries increases precision
- Regulates relevant data
- The leader node handles most of the overhead
Disadvantages
- A query plan is needed for every query
- The leader node consumes significant energy, which can lead to nodes dying
- The new layer adds overhead, causing more energy consumption and storage use
Advantages/Disadvantages: SAFE
Advantages
- Different sinks can flexibly query the same source
- Able to quickly change data transfer rates
- Since there are tables at the nodes, it is not always necessary to contact the sink
Disadvantages
- Initial turnaround time is high
- Extra level of overhead
Thank You
Any questions?
Sources
1. “A Survey on Routing Protocols for Wireless Sensor Networks,” Kemal Akkaya and Mohamed Younis
2. “Sensor Networks: An Overview,” Archana Bharathidasan and Vijay Anand Sai Ponduru
3. “Data Dissemination over Wireless Sensor Networks,” Sooyeon Kim, Sang H. Son, John A. Stankovic, and Yanghee Choi
4. “Query Processing in Sensor Networks,” Gehrke and Madden
5. “A Framework for Spatio-Temporal Query Processing Over Wireless Sensor Networks,” Coman, Nascimento, and Sander
6. “An Analysis of Spatio-Temporal Query Processing in Sensor Networks,” Coman, Sander, and Nascimento