When Urban Air Quality Meets Big Data

Presentation transcript:

When Urban Air Quality Meets Big Data
Yu Zheng, Lead Researcher, Microsoft Research

Background
- Air quality
  - NO2, SO2
  - Aerosols: PM2.5, PM10
- Why it matters
  - Healthcare
  - Pollution control and dispersal
- Reality
  - Building a measurement station is not easy
  - A limited number of stations (poor coverage)
  - Beijing has only 15 air quality monitoring stations in its urban area (50 km × 40 km)

2PM, June 17, 2013

Challenges
- Air quality varies non-linearly by location
- Affected by many factors: weather, traffic, land use…
- Hard to model with a clear formula

Example stations: S11: Dongchengdongsi, S12: Nongzhanguan. Proportion > 35%.

Statistics. The two stations are expected to be similar. However, according to the data of the past year, more than 35% of the time the PM2.5 deviation between the two stations is larger than 100, which is about a two-level difference. That is, when one station is moderate, the other could be unhealthy. It is not surprising…. So, we cannot do a simple interpolation.
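The statistic in the notes above can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the PM2.5 values are AQI index readings, that the levels follow the six US EPA AQI categories, and the two hourly series are made up for illustration.

```python
from bisect import bisect_left

# Upper bounds of the six US EPA AQI categories (assumed labelling scheme).
AQI_UPPER_BOUNDS = [50, 100, 150, 200, 300, 500]
AQI_LEVELS = ["Good", "Moderate", "Unhealthy for sensitive groups",
              "Unhealthy", "Very unhealthy", "Hazardous"]

def aqi_level(aqi):
    """Return the category index (0 = Good ... 5 = Hazardous) for an AQI value."""
    return min(bisect_left(AQI_UPPER_BOUNDS, aqi), len(AQI_LEVELS) - 1)

def large_deviation_share(series_a, series_b, threshold=100):
    """Fraction of paired hourly readings whose absolute gap exceeds threshold."""
    pairs = list(zip(series_a, series_b))
    if not pairs:
        raise ValueError("need at least one paired reading")
    return sum(abs(a - b) > threshold for a, b in pairs) / len(pairs)

# Made-up hourly PM2.5 AQI readings for two nearby stations (illustrative only).
station_a = [80, 95, 60, 210, 130, 75, 90, 160]
station_b = [190, 100, 65, 90, 140, 185, 95, 170]

share = large_deviation_share(station_a, station_b)
print(f"share of hours with a gap above 100: {share:.1%}")
print(AQI_LEVELS[aqi_level(station_a[0])], "vs", AQI_LEVELS[aqi_level(station_b[0])])
```

In the first hour of the sample data one station reads Moderate while the other reads Unhealthy, the kind of two-level gap that rules out simple interpolation between stations.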

We do not really know the air quality of a location without a monitoring station!
30,000+ USD, 10 µg/m³, 202 × 85 × 168 mm

Inferring real-time and fine-grained air quality throughout a city using big data
- Meteorology
- Traffic
- POIs
- Road networks
- Human mobility
- Historical air quality data
- Real-time air quality reports
Finer-granularity inference

http://www.uairquality.com/

Applications
- Location-based air quality awareness
  - Fine-grained pollution alerts
  - Routing based on air quality
- Identifying candidate locations for setting up new monitoring stations
- A step towards identifying the root cause of air pollution

Cloud + Client
Cloud: MS Azure
Clients: http://urbanair.msra.cn/

Our system uses a framework that combines the cloud with clients. The cloud continuously collects real-time data, processes the data, and provides online services. Users can access the air quality information through either a Windows Phone app or a website. The mobile client allows users to query the air quality of any location, anytime and anywhere. You can download the app by scanning the 2D barcode. The website provides a broader view and more comprehensive information about air quality throughout a city.

Difficulties
- Incorporating multiple heterogeneous data sources into a learning model
  - Spatially-related data: POIs, road networks
  - Temporally-related data: traffic, meteorology, human mobility
- Data sparseness (little training data)
  - Limited number of stations
  - Many places to infer
- Efficiency requirements
  - Massive data
  - Answering instant queries

Methodology Overview
- Partition a city into disjoint grids
- Extract features for each grid from its impacting region
  - Meteorological features
  - Traffic features
  - Human mobility features
  - POI features
  - Road network features
- Co-training-based semi-supervised learning model for each pollutant
  - Predict the AQI labels
  - Data sparsity
  - Two classifiers
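The first two steps of the overview can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the 1 km cell size is inferred from the dataset slide (50×50 km corresponds to 2500 grids), while the corner coordinates, the one-cell radius of the impacting region, and the function names are hypothetical.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate kilometres per degree of latitude

# Hypothetical 50 x 50 km study area; the corner coordinates are illustrative only.
SOUTH, WEST, N_ROWS, N_COLS = 39.70, 116.10, 50, 50
KM_PER_DEG_LON = KM_PER_DEG_LAT * math.cos(math.radians(SOUTH))

def grid_index(lat, lon, south=SOUTH, west=WEST, cell_km=1.0):
    """Map a coordinate to (row, col), counting cells from the south-west corner."""
    row = math.floor((lat - south) * KM_PER_DEG_LAT / cell_km)
    col = math.floor((lon - west) * KM_PER_DEG_LON / cell_km)
    return row, col

def impacting_region(row, col, n_rows=N_ROWS, n_cols=N_COLS, radius=1):
    """Cells within `radius` cells of (row, col), clipped to the grid bounds (assumed shape)."""
    return [(r, c)
            for r in range(max(0, row - radius), min(n_rows, row + radius + 1))
            for c in range(max(0, col - radius), min(n_cols, col + radius + 1))]

def poi_count_feature(poi_coords, region):
    """Number of POIs whose grid cell falls inside the given region."""
    cells = set(region)
    return sum(grid_index(lat, lon) in cells for lat, lon in poi_coords)

def to_latlon(km_north, km_east):
    """Helper for the demo: a coordinate at a given offset from the corner."""
    return SOUTH + km_north / KM_PER_DEG_LAT, WEST + km_east / KM_PER_DEG_LON

# Three made-up POIs, given as kilometre offsets from the south-west corner.
pois = [to_latlon(2.5, 3.5), to_latlon(2.2, 4.1), to_latlon(10.5, 10.5)]
target = grid_index(*to_latlon(2.5, 3.5))
feature = poi_count_feature(pois, impacting_region(*target))
print(target, feature)
```

The same pattern extends to the other feature families: each one aggregates its raw records over the cells of the impacting region rather than over the target cell alone.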

Semi-Supervised Learning Model
- Philosophy of the model
  - States of air quality
    - Temporal dependency in a location
    - Geo-correlation between locations
  - Generation of air pollutants
    - Emission from a location
    - Propagation among locations
- Two sets of features
  - Spatially-related
  - Temporally-related
- Co-training: spatial classifier and temporal classifier
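The co-training idea can be sketched in a few lines: two classifiers, each seeing only one feature view, take turns labelling the unlabeled sample they are most confident about and adding it to the shared training pool. The slide does not describe the classifiers, so this sketch swaps in simple nearest-centroid classifiers as stand-ins; the feature values, labels, and function names are all hypothetical.

```python
import math

def centroids(samples):
    """Mean feature vector per label; samples are (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    """Return (label, confidence); confidence is the gap between the two nearest centroids."""
    ranked = sorted((math.dist(x, c), y) for y, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin

def co_train(labeled, unlabeled, rounds=10):
    """labeled: (spatial, temporal, label) triples; unlabeled: (spatial, temporal) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):  # 0 = spatial view, 1 = temporal view
            if not pool:
                break
            model = centroids([(s[view], s[2]) for s in labeled])
            # Move the sample this view is most confident about into the labeled set.
            scored = [(predict(model, u[view]), i) for i, u in enumerate(pool)]
            (label, _), best = max(scored, key=lambda t: t[0][1])
            spatial, temporal = pool.pop(best)
            labeled.append((spatial, temporal, label))
    return (centroids([(s[0], s[2]) for s in labeled]),
            centroids([(s[1], s[2]) for s in labeled]))

def infer(spatial_model, temporal_model, spatial, temporal):
    """Pick the label with the smallest summed distance across both views."""
    def cost(y):
        return math.dist(spatial, spatial_model[y]) + math.dist(temporal, temporal_model[y])
    return min(spatial_model.keys() & temporal_model.keys(), key=cost)

# Made-up grids: two labeled (with stations) and four unlabeled (without stations).
labeled = [([0.1, 0.2], [0.9, 0.1], "Good"), ([0.9, 0.8], [0.2, 0.9], "Unhealthy")]
unlabeled = [([0.2, 0.1], [0.8, 0.2]), ([0.8, 0.9], [0.1, 0.8]),
             ([0.15, 0.25], [0.85, 0.15]), ([0.85, 0.75], [0.3, 0.8])]

spatial_model, temporal_model = co_train(labeled, unlabeled)
print(infer(spatial_model, temporal_model, [0.2, 0.2], [0.9, 0.2]))
```

Letting each view label data for the other is what allows the model to grow its training set beyond the few grids that contain stations.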

Evaluation: Datasets

Data sources: Beijing | Shanghai | Shenzhen | Wuhan
POI
  2012 Q1: 271,634 | 321,529 | 107,061 | 102,467
  2012 Q3: 272,109 | 317,829 | 107,171 | 104,634
Road
  # Segments: 162,246 | 171,191 | 45,231 | 38,477
  Highways: 1,497 km | 1,963 km | 256 km | 1,193 km
  Roads: 18,525 km | 25,530 km | 6,100 km | 9,691 km
  # Intersections: 49,981 | 70,293 | 32,112 | 25,359
AQI
  # Stations: 22 | 10 | 9
  Hours: 23,300 | 8,588 | 6,489 | 6,741
  Time spans: 8/24/2012-3/8/2013 | 1/19/2013-3/8/2013 | 2/4/2013-3/8/2013 | 2/4/2013-3/8/2013
Urban size (grids): 50×50 km (2500) | 57×45 km (2565) | 45×25 km (1165)

Evaluation
- Overall performance of the co-training (accuracy)

Status
http://urbanair.msra.cn/
- Publication at KDD 2013: U-Air: When Urban Air Quality Inference Meets Big Data
- Website is publicly available via Azure
- A mobile client, "Urban Air", in the WP App store
- A component of Urban Air is in the CityNext platform
- On Bing Map China now
- Working on prediction

Thanks! Yu Zheng yuzheng@microsoft.com Homepage