Presentation on theme: "University of Texas at Dallas"— Presentation transcript:
1 University of Texas at Dallas. Small Sensor, Big Data. Ding-Zhu Du, University of Texas at Dallas. UT Tyler.
2 University of Texas at Dallas. Small Sensor and Big Data. Lidong Wu, University of Texas at Dallas. UT Tyler.
3 Sensor, Big Data. Digitized World: Drowning in a Vast Amount of Data. Nowadays we live in a digitized world. Almost everything we do creates a record stored somewhere, whether we are calling a friend, purchasing gas, connecting to the Internet, or engaging in any other social activity. A sensor is a tiny device used to record signals in the “real world” and send them as structured and unstructured data to an intelligent system. The data is widely used on social networks, which are made up of countless social actors. Moreover, the volume and detail of data captured by sensors is growing so large that it is difficult to process with traditional techniques. The whole process can be divided into two parts: data collection and the attendant data analysis. One problem is how to save resources and improve the performance of the sensor system while sensors are collecting data. Another problem is how to deal with this giant volume of data in real time, specifically on social networks.
4 Outline. Data Collection in Sensor System. Data Analysis on Social Networks: Kate Middleton Effect, Search cheap ticket. Final Remarks. The outline of my talk today focuses mainly on two parts: efficient data collection in sensor systems and the attendant big data analysis on social networks. Explain the list.
5 Outline. Data Collection in Sensor System. Data Analysis on Social Networks: Kate Middleton Effect, Search cheap ticket. Final Remarks.
6 Have you watched the movie Twister? In the movie, a team of storm chasers tries to deploy their data-gathering instrument (i.e., a can of hundreds of sensors) into the funnel of a tornado, to study its structure from the inside and eventually design an advanced storm warning system. So we can see that a sensor is an important tool for collecting data. People may think stories in movies are far removed from our lives, so you may wonder where the sensors are in the real world. [Figure labels: tornado, bucket of sensors, sensor.]
7 Where are all the sensors? Smartphone with a dozen sensors. Firstly, the smartphone that we can’t survive without is equipped with a dozen sensors. Can you name one? Bluetooth, GPS, Wi-Fi, temperature, accelerometer, …
8 Where are all the sensors? Wearable devices: Google Glass, Apple’s iWatch. More and more wearables are being developed, including Google Glass and Apple’s iWatch. For $1500, you can have your agenda and appointment notes ready at a glance, dictate your text messages, and save your time for other tasks.
9 Where are all the sensors? Buildings. In this building, we can also find an array of sensors such as webcams, AC controllers, alarm systems, …
10 Where are all the sensors? Transportation systems, etc. All kinds of transportation systems, including public transportation, as part of smart cities. A sensor in a vehicle receives GPS coordinates in real time and sends them to the tracking company via a cellular data service.
11 Sensor Web. Large # of simple sensors. Usually deployed randomly. Multi-hop wireless links. Distributed routing. No infrastructure. Collect data and send it to the base station. When many sensors are deployed in a certain area to act and coordinate as a whole to monitor the environment, the result is called a sensor web, or sensor network. Wireless sensors communicate over wireless links with distributed routing, and they can self-organize into a connected communication network without infrastructure.
14 What’s a Sensor? Small size. Large number. Tetherless. The technology is improving every day. Sensor size is getting smaller. BUT…
15 What’s limiting the task? Energy, sensing, communication scale, CPU... Energy: with the available technology, sensors are usually battery powered, which limits the performance of the network. Each operation brings the sensor closer to death. A sensor has to keep listening to make sure that it does not lose any data, and therefore cannot power down entirely to save energy when it has nothing to do. This can dramatically limit the battery life of the node, demanding that batteries be replaced or recharged regularly. However, in some hostile environments, such as the area surrounding a volcano, battery replacement is impossible. Sensing and communication: each sensor has a limited sensing range and communication range. Anything out of its reach cannot be sensed by it. CPU: a sensor has to pass data to the base station for analysis.
16 Coverage & Connectivity. Challenge: Is the target covered? Is the sensor system connected? Coverage & Connectivity is the golden rule; when both hold, we say the system is alive!!
17 Coverage & Connectivity. Sensing Range: d ≤ Rs. Communication Range. [Figure labels: sensor, target, communication radius Rc, sensing radius Rs.]
18 Coverage & Connectivity. Sensing Range: d ≤ Rs. Communication Range: d ≤ Rc. [Figure labels: sensor, target, communication radius Rc, sensing radius Rs.]
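The two conditions on these slides (d ≤ Rs for sensing, d ≤ Rc for communication) can be checked directly. The following is only a minimal sketch, assuming sensors and targets are points in the plane and all sensors share the same radii; the function names `all_targets_covered`, `is_connected` and `system_alive` are hypothetical and do not come from the talk.

```python
import math
from collections import deque

def all_targets_covered(sensors, targets, rs):
    """Coverage: every target lies within sensing radius rs of some sensor."""
    return all(
        any(math.dist(s, t) <= rs for s in sensors)
        for t in targets
    )

def is_connected(sensors, rc):
    """Connectivity: the graph linking sensors at distance <= rc is connected."""
    if not sensors:
        return True
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search from sensor 0
        i = queue.popleft()
        for j in range(len(sensors)):
            if j not in seen and math.dist(sensors[i], sensors[j]) <= rc:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(sensors)

def system_alive(sensors, targets, rs, rc):
    """The 'golden rule': targets are covered AND the sensors are connected."""
    return all_targets_covered(sensors, targets, rs) and is_connected(sensors, rc)
```

For example, three sensors spaced one unit apart on a line with Rs = Rc = 1 cover nearby targets and stay connected, while shrinking Rc to 0.5 breaks connectivity even though coverage is unchanged.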
19 Min-Connected Sensor Cover Problem. Given a uniform set of sensors and a target area, find a minimum # of sensors to meet two requirements: [Coverage] cover the target area, and [Connectivity] form a connected communication network. [Resource Saving] Figure: Min-CSC Problem (sensing disks, communication network).
20 Min-Connected Sensor Cover Problem. It’s NP-hard! Previous Work for PTAS: O(r ln n)-approximation given by Gupta, Das and Gu [MobiHoc’03, 2003], where n is the number of sensors and r is the link radius of the sensor network.
21 Main Results. Random algorithm: O(log³ n log log n)-approximation, where n is the number of sensors. Partition algorithm: O(r)-approximation, where r is the link radius of the network.
22 Algorithm 1. Chain: Min-CSC (Connected Sensor Cover with Target Area) → Min-CTC (Connected Sensor Cover with Target Points) → GST (Group Steiner Tree). The result is a random algorithm which, with probability 1 - ε, produces an O(log³ n log log n)-approximation.
23 Min-CSC → Min-CTC → GST. The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not.
24 Min-Connected Sensor Cover Problem. Min-CSC → Min-CTC → GST. Given a uniform set of sensors and a target area, find a minimum # of sensors to meet two requirements: [Coverage] cover the target area, and [Connectivity] form a connected communication network. The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not. How to map to GST?
25 Min-Connected Target Coverage Problem. Min-CSC → Min-CTC → GST. Given a uniform set of sensors and a set of target POINTS, find a minimum # of sensors to meet two requirements: [Coverage] cover the target POINTS, and [Connectivity] form a connected communication network. The following theorem indicates that those intersection points can be used to indicate whether the area Ω is covered or not. How to map to GST?
26 Min-CSC → Min-CTC → GST. Group Steiner Tree: Given a graph G = (V, E) with positive edge weight c for every edge e ∈ E, a specified vertex r, and k subsets (or groups) of vertices G1, ..., Gk, Gi ⊆ V, find a minimum total weight tree T that contains at least one vertex in each Gi. Figure: GST Problem. This tree has minimum weight.
27 Min-CSC → Min-CTC → GST. Coverage. [Figure: target points b1 to b7, sensors S1 to S4, and the groups {S1, S2}, {S1, S3}, {S1, S2, S3}, {S2, S3}, {S2, S4}, {S3, S4}, {S2, S3, S4}.] * Gi contains all sensors covering bi. Choose at least one sensor from each group.
28 Min-CSC → Min-CTC → GST. Connectivity. [Figure: target points b1 to b7, sensors S1 to S4, and the groups {S1, S2}, {S1, S3}, {S1, S2, S3}, {S2, S3}, {S2, S4}, {S3, S4}, {S2, S3, S4}.] * Gi contains all sensors covering bi. Consider the communication network.
29 Min-CSC → Min-CTC → GST. Min-Coverage & Connectivity. [Figure: target points b1 to b7, sensors S1 to S4, and the groups {S1, S2}, {S1, S3}, {S1, S2, S3}, {S2, S3}, {S2, S4}, {S3, S4}, {S2, S3, S4}.] * Gi contains all sensors covering bi. Find a group Steiner tree in the communication network. Use a minimum # of sensors to build a tree which contains at least one sensor from each group. This is exactly a GST problem.
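The grouping rule on these slides ("Gi contains all sensors covering bi") can be sketched as follows. This is only an illustration under assumptions that are not in the talk: sensors and target points are given as named points in the plane, a sensor covers a point when their distance is at most Rs, and the helper names `build_groups` and `hits_every_group` are hypothetical. It builds the groups and checks the coverage half of the GST instance; it does not solve the Group Steiner Tree problem itself.

```python
import math

def build_groups(sensors, targets, rs):
    """For each target point b_i, collect the group G_i of sensors covering it."""
    groups = {}
    for b, target_pos in targets.items():
        groups[b] = {
            name
            for name, sensor_pos in sensors.items()
            if math.dist(sensor_pos, target_pos) <= rs
        }
    return groups

def hits_every_group(chosen, groups):
    """Coverage requirement: the chosen sensors include one from every group."""
    return all(bool(chosen & group) for group in groups.values())

# Hypothetical toy instance (coordinates are made up for illustration).
sensors = {"S1": (0, 0), "S2": (2, 0)}
targets = {"b1": (0, 1), "b2": (1, 0), "b3": (2, 1)}
groups = build_groups(sensors, targets, rs=1)
```

A feasible Min-CTC solution must both pass `hits_every_group` and induce a connected subgraph of the communication network, which is what the tree in the GST formulation enforces.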
30 Min-CSC → Min-CTC → GST. Garg, Konjevod and Ravi [SODA, 2000] showed that, with probability 1 - ε, an approximate solution of GROUP STEINER TREE on tree metric T is within a factor of O(log² n log log n log k) of optimal. Triangle inequality.
31 What Is Link Radius? [Figure labels: communication disk, sensing disk.]
32 Algorithm 2. Chain: Min-CSC (Connected Sensor Cover with Target Area) → Min-CTC (Connected Sensor Cover with Target Points) → Min-TC. Connect the output of Min-TC into Min-CTC. It can be done with an O(r)-approximation. Refer to my paper [INFOCOM 2013].
33 Step 2: Target Coverage. There exists a polynomial-time (1 + ε)-approximation for MIN-TC. [Figure: green is an opt (TC), orange is an approx (TC).] # < (1+ε) · opt (TC) < (1+ε) · opt (CTC).
34 Step 2: Network Steiner Tree. Let S′ ⊆ S be a (1 + ε)-approximation for MIN-TC. Assign weight one to every edge of G. Interconnect the sensors in S′ by computing a Steiner tree T as a network Steiner minimum tree. Byrka et al. showed that there exists a polynomial-time 1.39-approximation for Network Steiner Minimum Tree. [Figure: red is an approx (TC), green is an opt (Network ST).] All sensors on the tree form an approximation for min CTC. # nodes (approx for min CTC) = # edges + 1 (approx for Network ST) < 1.39 · opt (Network ST) + 1 < 1.39 · ??? · opt (CTC) + 1.
35 Step 2: Network Steiner Tree. [Figure: yellow is an approx (TC), green is an opt (CTC); each orange line has distance < r.] opt (Network ST) < opt (CTC) + r · # = opt (CTC) · O(r). Note: # < (1+ε) · opt (CTC).
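The bounds on slides 33 to 35 can be chained into one derivation. The block below is a sketch of that chain, not the proof from the paper. It assumes the slide-35 bound reads opt(Network ST) ≤ opt(CTC) + r · |S′|, where |S′| is the quantity written "#" on the slides, and it uses ≤ throughout since none of the bounds needs to be strict.

```latex
\begin{align*}
|S'| &\le (1+\varepsilon)\,\mathrm{opt}(\mathrm{TC})
      \le (1+\varepsilon)\,\mathrm{opt}(\mathrm{CTC}), \\
\mathrm{opt}(\text{Network ST}) &\le \mathrm{opt}(\mathrm{CTC}) + r\,|S'|
      \le \bigl(1 + (1+\varepsilon)\,r\bigr)\,\mathrm{opt}(\mathrm{CTC}), \\
\#\text{nodes}(T) &= \#\text{edges}(T) + 1
      \le 1.39\,\mathrm{opt}(\text{Network ST}) + 1 \\
     &\le 1.39\,\bigl(1 + (1+\varepsilon)\,r\bigr)\,\mathrm{opt}(\mathrm{CTC}) + 1
      = O(r)\cdot\mathrm{opt}(\mathrm{CTC}).
\end{align*}
```

Under this reading, the factor left open as "???" on slide 34 is the O(r) term obtained on slide 35.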
36 Future Works. O(log³ n log log n), where n is the number of sensors. O(r), where r is the link radius. 1. Unknown relationship? 2. Constant-approximation for Min-CSC? One wants provable solution quality and provable run-time bounds. The approximation is optimal up to a small constant factor.
37 What have I done? Publications on Optimization: “Constant-Approximations for Target Coverage Problem in Wireless Sensor Networks” INFOCOM 2012 (with Weili Wu, et al.). “Approximations for Minimum Connected Sensor Cover” INFOCOM 2013 (with Weili Wu, et al.). “PTAS for Routing-Cost Constrained Minimum Connected Dominating Sets …” Journal of Combinatorial Optimization (with Weili Wu, et al.). “An Approximation Algorithm for Client Assignment …” INFOCOM 2014 (with Weili Wu, et al.).
38 NSF Support. The above work was supported under the following grants. CCF : Reliable Spatial-Temporal Coverage with Minimum Cost in Wireless Sensor Network Deployments. CNS : Undersea Sensor Networks for Intrusion Detection: Foundations and Practice. CNS : Throughput Optimization in Wireless Mesh Sensor Networks. Dr. Pardalos serves as Distinguished Professor of Industrial and Systems Engineering at the University of Florida. Let’s move on to …
39 Outline. Data Collection in Sensor System. Data Analysis on Social Networks: Kate Middleton Effect, Search cheap ticket. Final Remarks.
40 “The small world network is a type of mathematical graph in which most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps.” In this talk, we will discuss two projects: effective data collection in the sensor web and the attendant big data revolution in data networks.
41 Social Network: A New Frontier. Most social networks are small world networks of large size. A small world network is a type of mathematical graph in which most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps.
42 Six Steps of Separation. Milgram (1967). The experiment: Random people from Nebraska were to send a letter (via intermediaries) to a stock broker in Boston. They could only send it to someone they knew. Among the letters that found the target, the average number of steps was six. Six degrees of separation is the theory that everyone and everything is six or fewer steps away. People have done a lot of experimental research on it. Stanley Milgram ( ). It’s a small world after all!!!
43 Six Steps of Separation. [Figure: a chain of people linked by relationships labeled Friend, Roommate, Family, Interviewer, Supervisor.] But you may have a shorter route to XX, which would shrink my distance from this rich guy. A chain of "a friend of a friend" statements can connect any two people in a maximum of six steps.
45 Increasing Popularity. There is a lot of significant research on data analysis on social networks.
46 Usage Example 1: “Kate Middleton Effect”. “The trend effect that Kate, Duchess of Cambridge has on others, from cosmetic surgery for brides, to sales of coral-colored jeans.” Kate, Duchess of Cambridge, is a fashion icon who leads in fashion circles.
47 Hike in Sales of Special Products. According to Newsweek, "The Kate Effect may be worth £1 billion to the UK fashion industry." Tony DiMasso, L. K. Bennett’s US president, stated in 2012, "...when she does wear something, it always seems to go on a waiting list."
48 How to Find Kate? Influential Person. Kate is one of the people who have many friends in this social network. Finding more Kates is not as easy as you might think!
49 Find More Kates? Challenge: an overall consideration of influence. For example: Positive Influence, Influence Maximization, Influence Minimization. When there are more Kates, an overall consideration of influence may be required. Therefore, the following concept was proposed. Different models, different problems…
50 Influence Maximization. Given k, find k seeds (Kates) to maximize the number of influenced persons.
51 Influence Maximization. # of influenced nodes is 6.
52 Influence Maximization. # of influenced nodes is 6. # of influenced nodes is 16.
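To make the "find k seeds" formulation concrete, here is a minimal greedy sketch. It is not the method of the talk or of the cited paper. It assumes a deliberately simple influence model that is not in the slides: a seed influences itself and its direct neighbors only. The names `influenced` and `greedy_seeds` and the toy graph are hypothetical.

```python
def influenced(graph, seeds):
    """Assumed model: a seed influences itself and its direct neighbors."""
    result = set(seeds)
    for s in seeds:
        result |= set(graph.get(s, ()))
    return result

def greedy_seeds(graph, k):
    """Pick k seeds one at a time, each maximizing the marginal gain."""
    seeds = []
    for _ in range(k):
        current = influenced(graph, seeds)
        best, best_gain = None, -1
        for v in sorted(graph):  # sorted for a deterministic tie-break
            if v in seeds:
                continue
            gain = len(influenced(graph, seeds + [v]) - current)
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # fewer than k nodes available
            break
        seeds.append(best)
    return seeds

# Hypothetical toy network: a hub "a" with three friends, plus a separate pair.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a"], "c": ["a"], "d": ["a"],
    "e": ["f"], "f": ["e"],
}
```

With other diffusion models the objective changes and so does the problem, which is the point of the "different models, different problems" remark earlier in the talk.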
53 Ongoing Research. Initial Result: “Better Approximations for Influence Maximization in Online Social Networks” Journal of Combinatorial Optimization, 2013 (with Weili Wu, et al.). Hire a certain number of Kates, and the total influence is greater... Semidefinite programming….
54 Usage Example 2: Search Cheap Ticket. There are about 28,537 commercial flights in the sky in the U.S. on any given day. The fashion choices of the Duchess of Cambridge, Kate Middleton, have already brought an astounding $1.5 billion into the British economy.
55 How to find a cheap ticket? It is a shortest path problem in a big data network.
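As an illustration of the shortest-path view of ticket search, the sketch below runs Dijkstra's algorithm on a small flight graph whose edge weights are fares. The airports, prices, and the function name `cheapest_fare` are made up for the example; the talk does not say which algorithm or data its system uses.

```python
import heapq

def cheapest_fare(flights, origin, destination):
    """Dijkstra's algorithm: return (total_price, route) or None if unreachable.

    flights maps an airport to a list of (next_airport, price) pairs,
    with non-negative prices.
    """
    best = {origin: 0}
    heap = [(0, origin, [origin])]
    while heap:
        cost, airport, route = heapq.heappop(heap)
        if airport == destination:
            return cost, route
        if cost > best.get(airport, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nxt, price in flights.get(airport, []):
            new_cost = cost + price
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, route + [nxt]))
    return None

# Hypothetical fares: the one-stop route is cheaper than the direct flight.
flights = {
    "DFW": [("ORD", 200), ("JFK", 450)],
    "ORD": [("JFK", 150)],
}
```

Enlarging the graph (more airports, more connections) can uncover cheaper routes but also increases search time, which is the time-versus-price conflict raised on the next slide.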
60 Challenge: Time vs. Price. If the search area is larger, then searching needs more time, but the ticket price may be cheaper. Hard to do in real time. Better software is needed. The main issue here is the conflict between search time and ticket price. Better software leads to the success of a business.
61 Ongoing Research. Initial Result: “Social Network Path Analysis Based on HBase” CSoNet 2013 (with Weili Wu, et al.)
62 Outline. Data Collection in Sensor System. Data Analysis on Social Networks: Kate Middleton Effect, Search cheap ticket. Final Remarks.
63 NSF Grant Possibilities in SSS Program & Big Data Program. SSS (Sensor and Sensing Systems): sensor networks with application in industrial engineering. Big Data Program: Critical Techniques and Technologies for Advancing Big Data Science & Engineering; technological means of managing, analyzing, visualizing, and extracting useful information from large, diverse, distributed and heterogeneous data sets.
64 NSF Grant Possibilities in REU Program. The Research Experiences for Undergraduates (REU) program supports active research participation by undergraduate students in any area funded by NSF. REU : Verification and Validation for Software Safety (co-PI: Weili Wu).