Geographical Data Aggregation

Students: Liran Avital, Rachel Mishayev 1/38

Introduction
In larger road networks, for instance a whole city or a network of highways, there is still a huge amount of data to handle: the number of road segments in a city, for example, is high, as is the number of parking places. 2/38

Introduction
We will discuss data aggregation schemes for VANETs:
1. Clustering groups of similar vehicles
2. Distributed data clustering
3. Aggregation over hierarchical areas
4. Comparing and merging hierarchical area aggregates
5. Hierarchical aggregation of travel times in road networks 3/38

1. Clustering groups of similar vehicles
TrafficView: Tamer Nadeem, Sasan Dashtinezhad, Chunyuan Liao, Liviu Iftode. Department of Computer Science, University of Maryland 4/38

Clustering groups of similar vehicles
TrafficView is based on data on the current position and speed of individual vehicles. Locally each car stores a list of vehicle IDs, positions, and speeds. 5/38

Because the amount of data soon exceeds reasonable limits, an adaptive aggregation mechanism is used. 6/38

Locally each car stores a list of vehicle IDs, positions, and speeds. These are transmitted to neighboring vehicles via beacons. Example list (shown for each car in the figure): vehicle IDs: 1, 2, 3; positions: 17, 14, 8; speeds: 47, 52, 48 7/38

An aggregate record in TrafficView consists of one single speed, position, and timestamp value, along with a list of vehicle IDs. This saves the bandwidth otherwise needed to transmit separate values for each individual vehicle. 8/38
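The aggregate record described above can be sketched roughly as follows. This is only an illustration: the class and function names are hypothetical, and combining values by a plain average (with the oldest timestamp kept) is an assumption made here, not the rule stated on the slides. The input values are the ones from the beacon example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VehicleRecord:
    vehicle_id: int
    position: float
    speed: float
    timestamp: float


@dataclass
class AggregateRecord:
    vehicle_ids: List[int]   # all vehicles covered by this aggregate
    position: float          # one single position value
    speed: float             # one single speed value
    timestamp: float         # one single timestamp value


def aggregate(records: List[VehicleRecord]) -> AggregateRecord:
    """Combine several vehicle records into one aggregate record.

    Assumption: position and speed are plain averages and the oldest
    timestamp is kept; the real combination rule may differ.
    """
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    return AggregateRecord(
        vehicle_ids=[r.vehicle_id for r in records],
        position=sum(r.position for r in records) / n,
        speed=sum(r.speed for r in records) / n,
        timestamp=min(r.timestamp for r in records),
    )


# Values from the beacon example: IDs 1, 2, 3; positions 17, 14, 8; speeds 47, 52, 48.
example = [
    VehicleRecord(1, 17, 47, timestamp=0),
    VehicleRecord(2, 14, 52, timestamp=0),
    VehicleRecord(3, 8, 48, timestamp=0),
]
agg = aggregate(example)
```

Three position values and three speed values shrink to one of each, while the ID list still grows with the number of vehicles.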

1. Clustering groups of similar vehicles
CASCADE: Cluster-Based Accurate Syntactic Compression of Aggregated Data in VANETs, by Khaled Ibrahim and Michele C. Weigle 9/38

An aggregated cluster record contains the cluster position and speed, and one ‘compact record’ per vehicle in the cluster, as shown in the Figure. 10/38
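A minimal sketch of such a cluster record is given below. The slides only state that the record holds the cluster position and speed plus one compact record per vehicle; storing each compact record as an offset from the cluster values, using a one-dimensional position, and taking the cluster values as averages are all assumptions made for illustration. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClusterRecord:
    position: float                      # cluster position
    speed: float                         # cluster speed
    compact: List[Tuple[float, float]]   # per vehicle: (position offset, speed offset)


def compress(vehicles: List[Tuple[float, float]]) -> ClusterRecord:
    """Build a cluster record from (position, speed) pairs.

    Assumption: cluster values are averages and compact records are
    differences to them, so the original values stay recoverable.
    """
    n = len(vehicles)
    center_pos = sum(p for p, _ in vehicles) / n
    center_speed = sum(s for _, s in vehicles) / n
    compact = [(p - center_pos, s - center_speed) for p, s in vehicles]
    return ClusterRecord(center_pos, center_speed, compact)


def expand(cluster: ClusterRecord) -> List[Tuple[float, float]]:
    """Recover the per-vehicle (position, speed) pairs from a cluster record."""
    return [(cluster.position + dp, cluster.speed + ds) for dp, ds in cluster.compact]


vehicles = [(17.0, 47.0), (14.0, 52.0), (8.0, 48.0)]
cluster = compress(vehicles)
```

The offsets are small numbers when the vehicles in a cluster are similar, which is where a compact encoding could save space.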

TrafficView and CASCADE show a number of similarities. Both systems collect information on the position and speed of vehicles, and both combine information on vehicles with similar parameter values. 11/38

2. Distributed data clustering
StreetSmart does not exchange any data about individual vehicles or road segments at all. In StreetSmart, each vehicle records samples – essentially position and speed – along its own movement path. Individual position data in StreetSmart is represented by a road ID and an offset (i.e., position) along the road. 12/38

3. Aggregation over hierarchical areas
The approaches discussed so far aggregate data from multiple vehicles, or multiple measurements consecutively made by the same vehicle. They are therefore able to significantly reduce the amount of data to be exchanged. 13/38

Aggregation over hierarchical areas
However, these approaches alone are not sufficient to fully overcome the scalability problem, for two reasons. First, they can sensibly combine only data from the same road and the same driving direction. Second, and probably even more important, the amount of data for a complete picture still increases with the number of vehicles. 14/38

[Figure: TrafficView, CASCADE, StreetSmart; axes: size of the aggregates, number of vehicles] 15/38

We will thus now look at proposals where the size and count of aggregated data representations are independent of the number of measuring vehicles, and where the amount of data does not increase linearly with the covered area (or with the number of roads in the covered area). 16/38

A fixed subdivision of the city is used, defined by a quadtree over the two-dimensional plane. A quadtree is a hierarchical subdivision of the plane into smaller and smaller squares. Within a certain radius, information on all individual parking sites is exchanged in the network. Each aggregate carries the following fields:
- ID
- Hierarchy level of the respective quadtree cell
- Timestamp
- Total number of free parking places 17/38
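The quadtree aggregation above can be sketched as follows. This is an illustration under stated assumptions: the square area size, the (level, column, row) cell ID scheme, and the use of the newest timestamp for the aggregate are hypothetical choices, not details taken from the slides.

```python
from typing import Dict, Iterable, Tuple

CellId = Tuple[int, int, int]  # (hierarchy level, column, row); hypothetical ID scheme


def cell_id(x: float, y: float, level: int, size: float = 1024.0) -> CellId:
    """Return the quadtree cell that contains point (x, y).

    Level 0 is the whole square area of side `size`; each further
    level splits every cell into four smaller squares.
    """
    cells_per_side = 2 ** level
    cell = size / cells_per_side
    col = min(int(x // cell), cells_per_side - 1)
    row = min(int(y // cell), cells_per_side - 1)
    return (level, col, row)


def aggregate_parking(
    sites: Iterable[Tuple[float, float, int, float]],
    level: int,
    size: float = 1024.0,
) -> Dict[CellId, Tuple[int, float]]:
    """Sum free parking places per quadtree cell at the given level.

    `sites` holds (x, y, free_places, timestamp) tuples. Each result maps
    a cell ID to (total free places, timestamp). Assumption: the newest
    contributing timestamp is used as the aggregate's timestamp.
    """
    totals: Dict[CellId, Tuple[int, float]] = {}
    for x, y, free, ts in sites:
        key = cell_id(x, y, level, size)
        total, newest = totals.get(key, (0, ts))
        totals[key] = (total + free, max(newest, ts))
    return totals


sites = [(100, 100, 5, 10), (300, 200, 3, 12), (700, 700, 4, 11)]
fine = aggregate_parking(sites, level=1)    # four cells of side 512
coarse = aggregate_parking(sites, level=0)  # one cell for the whole area
```

The number of aggregates at a given level depends only on the number of cells, not on how many vehicles reported the parking sites.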

4. Comparing and merging hierarchical area aggregates
In the previous section we saw that by using timestamps one can easily keep only the most up-to-date value observed for some parameter, discarding older measurements. This is possible because the up-to-dateness can be compared based on the timestamps. Unfortunately, this cannot be transferred in a straightforward way to aggregates describing more than one parameter. [Figure: 11, 12, 18, 30; label: up-to-date value] 18/38

Comparing and merging hierarchical area aggregates
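[Figure: X holds 10 in A (T1) and 15 in B (T2), aggregate 25; Y holds 19 in B (T2) and 13 in C (T3), aggregate 32; Z receives both aggregates for the area covering A, B, and C: ???] 19/38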

Which one should it keep? Which one is ‘better’? Clearly, the answer to this question is not straightforward, if there is a definite answer at all. The information in the aggregates is partially overlapping, and none of them clearly dominates the other. 20/38

The central insight from this example is that aggregates will not be ‘perfect’ representations of the current situation in the area they cover, but only the best possible approximation based on the current knowledge of the generating node. 21/38

If different nodes generate aggregates for the same area, these aggregates will typically be based on different, but partially overlapping information. The fundamental issue that therefore arises is that the completeness and up-to-dateness of aggregates cannot be expressed through a single timestamp. [Figure: X holds 10 in A (T1) and 15 in B (T2), aggregate 25; Y holds 19 in B (T2) and 13 in C (T3), aggregate 32] 22/38

A data structure describing the set of contained information and the respective timestamps is obviously also not an option, because it would compromise the aim of small aggregates. Summarization and aggregation mechanisms deal differently with this issue, but typically they resort to using some kind of ‘best guess’ heuristic timestamp. 23/38

One approach tackles this problem of non-comparability of aggregates by storing the observations in a duplicate-insensitive probabilistic data structure called ‘soft-state sketches’. Duplicate insensitivity means that information which is already present in both source aggregates will not have a higher impact on the result. 24/38

[Figure: X holds 10 in A (T1) and 15 in B (T2), aggregate 25; Y holds 19 in B (T2) and 13 in C (T3), aggregate 32; merged at Z for the area covering A, B, and C: = 42] 25/38

Soft-state sketches are based on a data structure proposed by Flajolet and Martin. The authors adapted it for VANET applications, so that it can be used to collect and distribute measured parameters. They also devised an extension which allows for the automatic removal (‘soft-state’) of old observations, if they are no longer backed up by recent measurements. The drawback of a sketch-based data representation is that it is probabilistic, i.e., it does not store the exact values of the parameters, but probabilistic approximations. 26/38
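The duplicate insensitivity of a Flajolet-Martin style sketch can be illustrated with the short example below. It is a simplified sketch, not the soft-state variant: it uses a single bitmap, omits the removal of old observations, and encodes a count of n in an area as the n items "area:1" to "area:n", which is an assumption made here for illustration. All function names are hypothetical.

```python
import hashlib

NUM_BITS = 32  # size of the bitmap; an arbitrary choice for this example


def _rho(item: str) -> int:
    """Index of the lowest set bit in a deterministic hash of `item`."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    if h == 0:
        return NUM_BITS - 1
    return min((h & -h).bit_length() - 1, NUM_BITS - 1)


def add_count(sketch: int, area: str, count: int) -> int:
    """Record `count` units observed in `area` (e.g. free parking places).

    Assumption: a count of n is stored as the n items 'area:1' .. 'area:n',
    so repeating or overlapping an observation sets no new kinds of items.
    """
    for i in range(1, count + 1):
        sketch |= 1 << _rho(f"{area}:{i}")
    return sketch


def merge(a: int, b: int) -> int:
    """Merge two sketches; bitwise OR makes the merge duplicate insensitive."""
    return a | b


def estimate(sketch: int) -> float:
    """Probabilistic estimate of the number of distinct items in the sketch."""
    r = 0
    while sketch & (1 << r):
        r += 1
    return (2 ** r) / 0.77351  # correction factor from Flajolet and Martin


# The example from the slides: X knows A and B, Y knows B and C.
x = add_count(add_count(0, "A", 10), "B", 15)
y = add_count(add_count(0, "B", 19), "C", 13)
z = merge(x, y)

# A sketch built directly from 10 in A, 19 in B, 13 in C (42 items in total).
reference = add_count(add_count(add_count(0, "A", 10), "B", 19), "C", 13)
```

Under this encoding the merged sketch `z` is bit-for-bit identical to `reference`, so the overlapping information about B is counted only once, which matches the total of 42 in the example. A single bitmap gives only a coarse estimate, so `estimate(z)` will generally not return exactly 42.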

5. Hierarchical aggregation of travel times in road networks
The aggregation algorithms serve well in the case where the application requires area-based summaries, i.e., when the intention is to obtain an overview of the (total, average, maximum, ...) value of some parameter within larger and larger areas. 27/38

Hierarchical aggregation of travel times in road networks
A parking guidance system – disseminating the total number of free parking places within an area – is indeed a prime example of such an application. 28/38

Road navigation, in contrast, needs travel times along routes rather than area-based summaries. Therefore, for a scalable decentralized road navigation system, it is vital to devise a way to represent travel times through a road network in an aggregated, hierarchically coarser and coarser form. 29/38

One such system disseminates travel times along road segments as a basis for navigation. It proposes the use of hierarchical approximations of the road network that become coarser with increasing distance. Travel times along all links between road junctions are locally observed and are exchanged within the closer vicinity. 30/38

The more important junctions are then used as ‘aggregation landmarks’, and an approximation of the road network is constructed, which consists of these landmarks and a set of virtual links between them. 31/38

Cars that are close enough are provided with detailed information on the travel times between all shown lower-level landmarks (represented by circles in the figure), while cars beyond a certain distance only know that the shortest possible travel time between the high-level landmarks ‘Tour Eiffel’ and ‘Arc de Triomphe’ is 70 s. 32/38

In the true road topology, there may be many possible routes between two landmarks connected by a virtual link. The aggregated value essentially only states that ‘it is possible to drive from landmark A to landmark B within t seconds’. 33/38
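The travel time of a virtual link can be sketched as a shortest-path computation over the detailed road graph between two landmarks. The example below uses Dijkstra's algorithm as one plausible way to obtain it; the junction names `J1` and `J2` and all link travel times are hypothetical, chosen only so that the result matches the 70 s from the example.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: travel time in seconds}


def build_graph(links: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected graph from (junction, junction, travel time) links."""
    graph: Graph = {}
    for a, b, t in links:
        graph.setdefault(a, {})[b] = t
        graph.setdefault(b, {})[a] = t
    return graph


def shortest_travel_time(graph: Graph, source: str, target: str) -> float:
    """Dijkstra's algorithm: smallest total travel time from source to target."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, t in graph.get(node, {}).items():
            nd = d + t
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return float("inf")  # target not reachable


# Hypothetical detailed road graph between the two high-level landmarks.
links = [
    ("Tour Eiffel", "J1", 20),
    ("J1", "Arc de Triomphe", 50),
    ("Tour Eiffel", "J2", 30),
    ("J2", "Arc de Triomphe", 55),
    ("J1", "J2", 10),
]
graph = build_graph(links)
virtual_link = shortest_travel_time(graph, "Tour Eiffel", "Arc de Triomphe")
```

Distant cars would then receive only the single value `virtual_link` instead of the five individual link travel times.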

The car in the center of the figure has detailed information about the individual road segments in the inner circle. 34/38

For regions further away on the map, coarser information in the form of virtual links between higher- and higher-level landmarks is used. The travel times along the higher-level virtual links constitute the respective higher-level aggregates. 35/38

5. Hierarchical aggregation of travel times in road networks
Criteria for selecting the more important junctions: road categories, road lengths, speed limits, ..., i.e., junctions of major roads and expressways 36/38

Conclusion
We saw how multiple observations can be combined into more compact representations, and how aggregated data can be maintained and updated cooperatively in the network. Because of the inherent limitations of wireless multihop networks, such mechanisms are a vital cornerstone of dissemination-based VANET applications. 37/38

Thank you for your attention
The End 38/38
