University of Texas at Dallas

Presentation on theme: "University of Texas at Dallas"— Presentation transcript:

1 University of Texas at Dallas
Small Sensor, Big Data Ding-Zhu Du University of Texas at Dallas UT Tyler 1

2 University of Texas at Dallas
Small Sensor and Big Data Lidong Wu University of Texas at Dallas UT Tyler 2

3 Sensor BigData Digitized World Drowning in Vast Amount of Data
Nowadays we live in a digitized world. Almost everything we do creates a record stored somewhere, whether we are calling a friend, purchasing gas, connecting to the Internet, or taking part in any other social activity. A sensor is a tiny device used to record signals in the “real world” and send them as structured and unstructured data to an intelligent system. The data is widely used on social networks, which are made up of countless social actors. Moreover, the increasing volume and detail of data captured by sensors is so large that it is difficult to process using traditional techniques. The whole process can be divided into two parts: data collection and the attendant data analysis. One problem is how to save resources and improve the performance of the sensor system while the sensors are collecting data. Another problem is how to deal with the giant data in real time, specifically on social networks.

4 Outline Data Collection in Sensor System
Data Analysis on Social Networks Kate Middleton Effect, Search cheap ticket Final Remarks The outline of my talk today focuses mainly on two parts: efficient data collection in sensor systems and the attendant big data analysis on social networks. Explain the list.

5 Outline Data Collection in Sensor System
Data Analysis on Social Networks Kate Middleton Effect, Search cheap ticket Final Remarks

6 Have you watched movie Twister?
tornado In the movie, a team of storm chasers tries to deploy their data-gathering instrument (i.e., a can of hundreds of sensors) into the funnel of a tornado, to study its structure from the inside and eventually design an advanced storm warning system. So we can see that a sensor is an important tool for collecting data. People may think stories in movies are far away from our lives, so you may wonder where the sensors are in the real world. Bucket of sensors sensor

7 Where are all the sensors?
Smartphone with a dozen sensors Firstly, the smartphone that we can’t survive without is equipped with a dozen sensors. Can you name one? Bluetooth, GPS, Wi-Fi, temperature, accelerometer,…

8 Where are all the sensors?
Wearable devices - Google Glass, Apple’s iWatch More and more wearables are being developed, including Google Glass and Apple’s iWatch. For $1500, you can have your agenda and appointment notes ready at a glance, dictate your text messages, and save your time for other tasks.

9 Where are all the sensors?
Buildings In this building, we can also find an array of sensors such as webcams, AC controllers, alarm systems…

10 Where are all the sensors?
Transportation systems, etc. All kinds of transportation systems, including public transit, are part of smart cities. A sensor in a vehicle receives GPS coordinates in real time and sends them to the tracking company via a cellular data service.

11 Sensor Web Large # of simple sensors Usually deployed randomly
Multi-hop wireless link Distributed routing No infrastructure Collect data and send it to base station When lots of sensors are deployed in a certain area to act and coordinate as a whole to monitor the environment, the result is called a sensor web, also known as a sensor network. The wireless sensors communicate over wireless links with distributed routing, and they can self-organize into a connected communication network without infrastructure.

12 Applications of Sensor Web

13 An example of sensor web
observer

14 BUT… What’s Sensor? Small size Large number Tetherless
The technology is improving every day. Sensor size is getting smaller. BUT…

15 What’s limiting the task?
Energy, Sense, Communication scale, CPU... Energy: with the available technology, sensors are usually battery powered, which limits the performance of the network. Each operation brings the sensor closer to death. A sensor has to keep listening to make sure that it does not lose any data, and therefore cannot power down entirely to save energy when it has nothing to do. This can dramatically limit the battery autonomy of the node, demanding that batteries be replaced or recharged regularly. However, in a hostile environment, such as the area surrounding a volcano, battery replacement is impossible. Sense and communication scale: each sensor has a limited sensing range and communication range, and anything out of its reach cannot be sensed by it. CPU: a sensor has limited processing power, so it has to pass data to the base station for analysis.

16 Coverage & Connectivity System is alive!!
Challenge Target is Covered? Sensor system is Connected? Coverage & Connectivity Golden Rule, then we say System is alive!!

17 Coverage & Connectivity
Communication Range d ≤ Rs sensor target communication radius sensing radius Rc Rs Sensing Range

18 Coverage & Connectivity
Communication Range d ≤ Rs d ≤ Rc sensor target communication radius sensing radius Rc Rs Sensing Range
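The two conditions on this slide (a target is covered when d ≤ Rs, two sensors are linked when d ≤ Rc) can be sketched as simple distance tests. This is only an illustration: the function names and the use of 2-D Euclidean coordinates are assumptions, not something the slides specify.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def covers(sensor, target, rs):
    """Coverage test: the target lies within the sensing radius Rs (d <= Rs)."""
    return distance(sensor, target) <= rs


def can_communicate(sensor_a, sensor_b, rc):
    """Connectivity test: two sensors lie within communication radius Rc (d <= Rc)."""
    return distance(sensor_a, sensor_b) <= rc
```

For example, a sensor at (0, 0) with Rs = 5 covers a target at (3, 4), since the distance is exactly 5.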

19 Min-Connected Sensor Cover Problem
A uniform set of sensors, and a target area Find a minimum # of sensors to meet two requirements: [Coverage] cover the target area, and [Connectivity] form a connected communication network. [Resource Saving] sensing disks communication network Figure: Min-CSC Problem.

20 Min-Connected Sensor Cover Problem
It’s NP-hard! Previous Work for PTAS Ο(r ln n) – approximation given by Gupta, Das and Gu [MobiHoc’03, 2003], where n is the number of sensors and r is the link radius of the sensor network.

21 Main Results Random algorithm:
Ο(log³ n log log n)-approximation, n is the number of sensors. Partition algorithm: Ο(r)-approximation, r is the link radius of the network.

22 Algorithm 1 1 2 Connected Sensor Cover with Target Area
Connected Sensor Cover with Target Points Group Steiner Tree 1 2 Min-CSC Min-CTC GST A randomized algorithm that, with probability 1 − ε, produces an Ο(log³ n log log n)-approximation.

23 1 2 Min-CSC Min-CTC GST The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not. 23

24 Min-Connected Sensor Cover Problem
1 2 Min-CSC Min-CTC GST Min-Connected Sensor Cover Problem A uniform set of sensors, and a target area Find a minimum # of sensors to meet two requirements: [Coverage] cover the target area, and [Connectivity] form a connected communication network. The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not. How to map to GST? 24

25 Min-Connected Target Coverage Problem
1 2 Min-CSC Min-CTC GST Min-Connected Target Coverage Problem A uniform set of sensors, and a set of target POINTS Find a minimum # of sensors to meet two requirements: [Coverage] cover the target POINTS, and [Connectivity] form a connected communication network. The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not. How to map to GST?
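The two requirements of Min-CTC can be checked for any candidate set of sensors. The sketch below only verifies feasibility (it does not minimize anything), and it assumes 2-D coordinates, a shared sensing radius `rs`, and a shared communication radius `rc`; the function name is hypothetical.

```python
import math
from collections import deque


def is_connected_target_cover(sensors, targets, rs, rc):
    """Return True if `sensors` cover every target and form a connected network.

    sensors, targets: lists of (x, y) points; rs: sensing radius; rc: communication radius.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    if not sensors:
        return len(targets) == 0

    # [Coverage] every target point lies within Rs of at least one chosen sensor.
    if not all(any(dist(s, t) <= rs for s in sensors) for t in targets):
        return False

    # [Connectivity] breadth-first search over links of length <= Rc reaches all sensors.
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(sensors)):
            if j not in seen and dist(sensors[i], sensors[j]) <= rc:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(sensors)
```

A feasible solution to Min-CTC is any sensor subset for which this check passes; the optimization problem asks for the smallest such subset.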

26 This tree has minimum weight.
1 2 Min-CSC Min-CTC GST Group Steiner Tree: A graph G = (V, E) with positive edge weight c for every edge e ∈ E. A specified vertex r k subsets (or groups) of vertices G1,...,Gk, Gi ⊆ V Find a minimum total weight tree T contains at least one vertex in each Gi. Figure: GST Problem. This tree has minimum weight.

27 1 2 Min-CSC Min-CTC GST Coverage Choose at least one sensor
b2 b6 b3 b4 b1 b5 b7 b3 b1 b2 b6 b5 b4 S1 S2 S3 S4 b7 S1 S2 S1 S3 S1 S2 S3 S2 S3 S2 S4 S3 S4 * Gi contains all sensors covering bi. S2 S3 S4 Choose at least one sensor from each group. Coverage

28 1 2 Min-CSC Min-CTC GST Connectivity Consider communication network.
b2 b6 b3 b4 b1 b5 b7 S1 S2 S1 S3 S1 S2 S3 S2 S3 S2 S4 S3 S4 * Gi contains all sensors covering bi. S2 S3 S4 b3 b1 b2 b6 b5 b4 S1 S2 S3 S4 b7 Consider communication network. Connectivity

29 1 2 Min-CSC Min-CTC GST Min- Coverage & Connectivity
b2 b6 b3 b4 b1 b5 b7 S1 S2 S1 S3 S1 S2 S3 S2 S3 S2 S4 S3 S4 * Gi contains all sensors covering bi. S2 S3 S4 b3 b1 b2 b6 b5 b4 S1 S2 S3 S4 b7 Find a group Steiner tree in communication network. Min- Coverage & Connectivity Use minimum # of sensors to build a tree which contains at least one sensor from each group. This is exactly a GST problem. 29
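The reduction on these slides builds one group Gi per target bi, containing all sensors that cover bi, and then asks for a tree in the communication network that touches every group. A minimal sketch of the group construction follows; the coordinates in the usage example are made up for illustration and are not the ones in the slide figure.

```python
import math


def build_groups(sensors, targets, rs):
    """Map each target name to the set of sensor names covering it (the group Gi).

    sensors, targets: dicts from name to (x, y); rs: sensing radius.
    """
    groups = {}
    for t_name, t_pos in targets.items():
        groups[t_name] = {
            s_name
            for s_name, s_pos in sensors.items()
            if math.hypot(s_pos[0] - t_pos[0], s_pos[1] - t_pos[1]) <= rs
        }
    return groups


def hits_every_group(chosen, groups):
    """Coverage holds when the chosen sensors include at least one member of each group."""
    return all(group & set(chosen) for group in groups.values())


# Hypothetical layout: three sensors on a line, three targets nearby.
sensors = {"S1": (0, 0), "S2": (2, 0), "S3": (4, 0)}
targets = {"b1": (1, 0), "b2": (3, 0), "b3": (4, 1)}
groups = build_groups(sensors, targets, rs=1)
```

With these groups in hand, connectivity is handled separately by requiring the chosen sensors to form a tree in the communication network, which is what makes the combined problem a Group Steiner Tree instance.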

30 1 2 Min-CSC Min-CTC GST Garg, Konjevod and Ravi [SODA, 2000] showed that, with probability 1 − ε, an approximate solution of GROUP STEINER TREE on a tree metric T is within a factor of Ο(log² n log log n log k) of optimal. Triangle inequality.

31 What Is Link Radius? Communication disk Sensing disk

32 Algorithm 2 2 1 Connected Sensor Cover with Target Area
Connected Sensor Cover with Target Points 1 2 Min-CSC Min-CTC Min-TC Connect the output of Min-TC into a solution of Min-CTC. This gives an Ο(r)-approximation. Refer to my paper [INFOCOM 2013].

33 There exists a polynomial-time (1 + ε)- approximation for MIN-TC.
Step 2 Target Coverage There exists a polynomial-time (1 + ε)- approximation for MIN-TC. Green is an opt (TC), Orange is an approx (TC). # < (1+ε) · opt (TC), < (1+ε) · opt (CTC)

34 Step 2 Network Steiner Tree
Let S′ ⊆ S be a (1 + ε)-approximation for MIN-TC. Assign weight one to every edge of G. Interconnect the sensors in S′ by computing a Steiner tree T as a network Steiner minimum tree. Byrka et al. [6] showed there exists a polynomial-time 1.39-approximation for Network Steiner Minimum Tree. Red is an approx (TC). Green is an opt (Network ST). All sensors on the tree form an approx for min CTC. # nodes (approx for min CTC) = # edges + 1 (approx for Network ST) < 1.39 · opt (Network ST) + 1 < 1.39 · ??? · opt (CTC) + 1

35 Step 2 Network Steiner Tree
Yellow is an approx (TC). Green is an opt (CTC). Each orange line has distance < r. opt (Network ST) < opt (CTC) + r · # = opt (CTC) · O(r). Note: # < (1+ε) · opt (CTC)
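The two-step structure of Algorithm 2 (first cover the targets, then interconnect the chosen sensors with unit-weight edges) can be illustrated with much simpler building blocks. The sketch below is not the algorithm from the slides: it swaps in a greedy set cover for the (1 + ε)-approximation of Min-TC, and a shortest-path heuristic for the 1.39-approximate Steiner tree, so it carries none of the stated guarantees. All names are hypothetical.

```python
import math
from collections import deque


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_target_cover(sensors, targets, rs):
    """Step 1 (stand-in): greedily pick the sensor covering the most uncovered targets."""
    uncovered = set(range(len(targets)))
    chosen = []
    while uncovered:
        best, best_hit = None, set()
        for i, s in enumerate(sensors):
            hit = {t for t in uncovered if _dist(s, targets[t]) <= rs}
            if len(hit) > len(best_hit):
                best, best_hit = i, hit
        if best is None:
            raise ValueError("some target cannot be covered by any sensor")
        chosen.append(best)
        uncovered -= best_hit
    return chosen


def connect(sensors, chosen, rc):
    """Step 2 (stand-in): join chosen sensors via fewest-hop paths to the growing tree."""
    if not chosen:
        return set()
    n = len(sensors)
    adj = [[j for j in range(n) if j != i and _dist(sensors[i], sensors[j]) <= rc]
           for i in range(n)]
    tree = {chosen[0]}
    for goal in chosen[1:]:
        if goal in tree:
            continue
        # Multi-source BFS from every node already in the tree (each edge has weight one).
        parent = {v: None for v in tree}
        queue = deque(tree)
        while queue and goal not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        if goal not in parent:
            raise ValueError("communication graph is disconnected")
        v = goal
        while v not in tree:  # walk back until the path meets the existing tree
            tree.add(v)
            v = parent[v]
    return tree


def connected_target_cover(sensors, targets, rs, rc):
    """Return sorted indices of sensors that cover all targets and are connected."""
    return sorted(connect(sensors, greedy_target_cover(sensors, targets, rs), rc))
```

In the usage case of five sensors spaced one unit apart on a line, with targets only near the two ends, step 1 picks the two end sensors and step 2 adds the three relay sensors between them.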

36 n is the number of sensors.
Future Works Ο(log³ n log log n), n is the number of sensors. Ο(r), r is the link radius. 1. Unknown relationship? 2. Constant-approximation for Min-CSC? We want provable solution quality and provable run-time bounds. The approximation is optimal up to a small constant factor.

37 What I have done? Publications on Optimization
“Constant-Approximations for Target Coverage Problem in Wireless Sensor Networks” INFOCOM2012 (with Weili Wu, et al.) “Approximations for Minimum Connected Sensor Cover” INFOCOM2013 (with Weili Wu, et al.) “PTAS for Routing-Cost Constrained Minimum Connected Dominating Sets …” Journal of Combinatorial Optimization, (with Weili Wu, et al.) “An Approximation Algorithm for Client Assignment …” INFOCOM2014 (with Weili Wu, et al.)

38 NSF Support Above work was supported under the following grants CCF : Reliable Spatial-Temporal Coverage with Minimum Cost in Wireless Sensor Network Deployments CNS : Undersea Sensor Networks for Intrusion Detection: Foundations and Practice CNS : Throughput Optimization in Wireless Mesh Sensor Networks Dr. Pardalos serves as Distinguished Professor of Industrial and Systems Engineering at the University of Florida. Let’s move on to … 38

39 Outline Data Collection in Sensor System
Data Analysis On Social Networks Kate Middleton Effect, Search cheap ticket Final Remarks

40 “The small world network
is a type of mathematical graph in which most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps.” In this talk, we will discuss two projects: effective data collection in the sensor web and the attendant big data revolution in data networks.

41 Social Network: A New Frontier
Most social networks are large small-world networks. A small-world network is a type of mathematical graph in which most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps.

42 Six Steps of Separation
Milgram (1967) The experiment: Random people from Nebraska were asked to send a letter (via intermediaries) to a stock broker in Boston. They could only send it to someone they knew. Among the letters that found the target, the average number of steps was six. Six degrees of separation is the theory that everyone and everything is six or fewer steps away. Many experiments have been done on it. Stanley Milgram It’s a small world after all!!!

43 Six Steps of Separation
Friend Friend Roommate Family Friend Interviewer Supervisor Friend But, you may have a shorter route to XX, which would shrink my distance from this rich guy. A chain of "a friend of a friend" statements can connect any two people in a maximum of six steps. Family Friend 43
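The "friend of a friend" chain length between two people is the hop distance in the friendship graph, which breadth-first search computes directly. A minimal sketch, assuming the network is given as a dictionary from each person to a list of acquaintances (the names in any example data are made up):

```python
from collections import deque


def degrees_of_separation(friends, source, target):
    """Return the fewest hops from source to target, or None if no chain exists.

    friends: dict mapping a person to an iterable of people they know.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == target:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None
```

Because breadth-first search explores people in order of distance, the first time the target is found is along a shortest chain, so a newly discovered shortcut can only shrink the reported distance.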

44 Social Networks in Life

45 Increasing Popularity
There is a lot of significant research on data analysis of social networks.

46 Usage Example 1 “Kate Middleton Effect”
“The trend effect that Kate, Duchess of Cambridge, has on others, from cosmetic surgery for brides to sales of coral-colored jeans.” Kate, Duchess of Cambridge, is a fashion icon who leads in fashion circles.

47 Hike in Sales of Special Products
According to Newsweek, "The Kate Effect may be worth £1 billion to the UK fashion industry." Tony DiMasso, L. K. Bennett’s US president, stated in 2012, "...when she does wear something, it always seems to go on a waiting list."

48 How to Find Kate? Influential Person
Kate is one of the people who have many friends in this social network. Finding more Kates is not as easy as you might think!

49 Find More Kate? Challenge: an overall consideration of influence
For example, Positive Influence, Influence Maximization, Influence Minimization When there are more Kates, an overall consideration about influence may be required. Therefore, the following concept was proposed. Different Models, different problems… 49

50 Influence Maximization
Given k, find k seeds (Kates) to maximize the number of influenced persons.

51 Influence Maximization
# of influenced nodes is 6. 51

52 Influence Maximization
# of influenced nodes is 6. # of influenced nodes is 16. 52
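A common heuristic for choosing seeds is to add, one at a time, the person whose selection influences the most people not yet reached. The slides do not say which diffusion model is used, so the sketch below assumes the simplest possible one: a seed influences itself and its direct friends, deterministically. Real influence models are usually probabilistic, and the function names here are hypothetical.

```python
def influenced(graph, seeds):
    """People reached under the assumed one-hop model: seeds plus their direct friends."""
    reached = set(seeds)
    for seed in seeds:
        reached.update(graph.get(seed, ()))
    return reached


def greedy_seeds(graph, k):
    """Pick up to k seeds, each time taking the node with the largest marginal gain."""
    seeds = []
    for _ in range(k):
        current = influenced(graph, seeds)
        best, best_gain = None, 0
        for node in sorted(graph):  # sorted only to make tie-breaking deterministic
            if node in seeds:
                continue
            gain = len(influenced(graph, seeds + [node]) - current)
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # nobody adds any new influenced person
            break
        seeds.append(best)
    return seeds
```

As on the slides, different seed choices can lead to very different numbers of influenced nodes, which is why the choice is treated as an optimization problem.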

53 Ongoing Research Initial Result
“Better Approximations for Influence Maximization in Online Social Networks” Journal of Combinatorial Optimization, 2013 (with Weili Wu, et al.) Hire a certain number of Kates so that the total influence is greater… Semidefinite programming…

54 Usage Example 2 Search Cheap Ticket
There are about 28,537 commercial flights in the sky in the U.S. on any given day. The fashion choices of the Duchess of Cambridge, Kate Middleton, have already brought an astounding $1.5 billion into the British economy.

55 How to find cheap ticket?
It is a shortest path problem in a big data network.

56 Cheap ticket-Graph AA123 AA456 Chicago Dallas AA789

57 Cheap ticket-Graph 8am 8am 9am 9am 1pm 1pm 3pm 3pm Dallas Each city has a set of startpoints and a set of endpoints. They are connected into a bipartite graph based on certain rules.

58 Cheap Ticket-Graph Dallas 8am 8am 9am 9am 1pm 1pm 3pm 3pm

59 Cheap Ticket-Graph 8am 8am 9am 9am 1pm 1pm 3pm 3pm Dallas
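The cheap-ticket search can be sketched as a shortest-path computation in which each state is a city together with the time from which the traveler can depart again. The connection rule is not spelled out on the slides, so the sketch assumes a flight can be taken when it departs no earlier than the previous arrival plus a minimum layover, with all times given as hours within one day. Flight data in any example is invented.

```python
import heapq


def cheapest_itinerary(flights, origin, destination, min_layover=1):
    """Dijkstra over (city, ready_time) states; returns (total_price, [flight ids]) or None.

    flights: list of (flight_id, from_city, to_city, depart_hour, arrive_hour, price).
    """
    heap = [(0, 0, origin, ())]  # (cost so far, earliest departure time, city, path)
    visited = set()
    while heap:
        cost, ready, city, path = heapq.heappop(heap)
        if city == destination:
            return cost, list(path)
        if (city, ready) in visited:
            continue
        visited.add((city, ready))
        for flight_id, src, dst, depart, arrive, price in flights:
            # Assumed rule: a connection is feasible if the flight leaves at or after `ready`.
            if src == city and depart >= ready:
                heapq.heappush(
                    heap, (cost + price, arrive + min_layover, dst, path + (flight_id,))
                )
    return None
```

Enlarging the set of flights considered can only lower the best price but makes the state graph bigger, which is the time versus price trade-off.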

60 Challenge Time VS Price
If the search area is larger, then searching needs more time, but the ticket price may be cheaper. Hard to do it in real time Better software is needed The main issue here is the conflict between search time and ticket price. Better software leads to the success of a business.

61 Ongoing Research Initial Result
“Social Network Path Analysis Based on HBase” CSoNet 2013 (with Weili Wu, et al.)

62 Outline Data Collection in Sensor System
Data Analysis On Social Networks Kate Middleton Effect, Search cheap ticket Final Remarks

63 NSF Grant Possibilities
In SSS Program & Big Data Program SSS (Sensor and Sensing Systems): sensor networks with application in industrial engineering. Big Data Program (Critical Techniques and Technologies for Advancing Big Data Science & Engineering): technological means of managing, analyzing, visualizing, and extracting useful information from large, diverse, distributed and heterogeneous data sets.

64 NSF Grant Possibilities
In REU Program The Research Experiences for Undergraduates (REU) program supports active research participation by undergraduate students in any area funded by NSF. REU : Verification and Validation for Software Safety (co-PI: Weili Wu)

65 THANK YOU!

68 Mathematical Model (II)
Positive Influence Dominating Set Problem Given a graph, find a positive influence dominating set with minimum cardinality. Theorem

69 Mathematical Model (I)
Dominating set Positive influence dominating set
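The difference between the two notions on this slide can be made concrete with two membership checks. The slides do not state the definitions, so the sketch assumes the usual ones: a dominating set leaves every outside node with at least one neighbor in the set, and a positive influence dominating set requires every node to have at least half of its neighbors in the set.

```python
def is_dominating_set(graph, chosen):
    """Every node is in `chosen` or has at least one neighbor in `chosen`."""
    return all(
        v in chosen or any(u in chosen for u in graph[v])
        for v in graph
    )


def is_positive_influence_dominating_set(graph, chosen):
    """Assumed definition: every node has at least half of its neighbors in `chosen`."""
    return all(
        2 * sum(u in chosen for u in graph[v]) >= len(graph[v])
        for v in graph
    )
```

On the path a - b - c, the set {b} is a dominating set but not a positive influence dominating set under this definition, because b itself has no neighbor in the set; {a, b} satisfies both.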
