Origin-Destination matrix estimation in Sri Lanka using mobile network big data Danaja Maldeniya, Sriganesh Lokanathan and Amal Kumarage (Phd) 13th International.

Presentation transcript:

Origin-Destination matrix estimation in Sri Lanka using mobile network big data
Danaja Maldeniya, Sriganesh Lokanathan and Amal Kumarage (PhD)
13th International Conference of IFIP working group nd May 2015
This work was carried out with the aid of a grant from the International Development Research Centre, Canada and the Department for International Development, UK.

Big data
An all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process using traditional data processing applications.
The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations.
Examples:
– 100 million Call Detail Records a day generated by Sri Lanka companies
– 45 terabytes of data from the Hubble Telescope

Why big data? Why now?
Proximate causes
– Increased “datafication”: very large sets of schema-less (unstructured, but processable) data are now available
– Advances in memory technology: it is no longer necessary to archive most data and work with a small subset
– Advances in software: MapReduce, Hadoop

If we want comprehensive coverage of the population, what are the sources of big data in developing economies?
Administrative data
– E.g., digitized medical records, insurance records, tax records
Commercial transactions (transaction-generated data)
– E.g., stock exchange data, bank transactions, credit card records, supermarket transactions connected by loyalty card number
Sensors and tracking devices
– E.g., road and traffic sensors, climate sensors, equipment & infrastructure sensors, mobile phones communicating with base stations, satellite/GPS devices
Online activities/social media
– E.g., online search activity, online page views, blogs/FB/Twitter posts

Currently only mobile network big data has broad population coverage

Country       Mobile SIMs/100   Internet users/100   Facebook users/100
Myanmar       13                1                    4
Bangladesh    67                7                    6
Pakistan      70                11                   8
India         71                15                   9
Sri Lanka
Philippines
Indonesia
Thailand

Source: ITU Measuring Information Society 2014; Facebook advantage portal

What is mobile network big data?
Visitor Location Register (VLR)
– A record is generated every time a mobile phone comes within range of, and makes contact with, a base station. No user intervention is required.
– Because of the extremely large volumes, Sri Lankan operators flush these records periodically.
Call Detail Records (CDR)
– A record is generated every time an individual uses a mobile phone to make or receive a call, use the internet or send a text.
– Used by operators for billing purposes.

Data used in the research
Multiple mobile operators in Sri Lanka have provided four different types of meta-data
– Call Detail Records (CDRs): records of calls, SMS, Internet access
– Airtime recharge records
– No Visitor Location Register (VLR) data
Data sets do not include any Personally Identifiable Information
– All phone numbers are pseudonymized
– LIRNEasia does not maintain any mappings of identifiers to original phone numbers
Cover 50-60% of users; very high coverage in the Western (where Colombo, the capital city, is located) and Northern (most affected by civil conflict) Provinces, based on correlation with census data

What does a CDR look like?
Call Direction: 1
Calling Party Number: A24BC1571X
Called Party Number: B321SG141X
Cell ID:
Call Time: :42:14
Call Duration: 00:03:35
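
As a rough illustration of the record layout on this slide, the sketch below models one CDR as a Python dataclass and parses it from a comma-separated line. The field names, the comma-separated input format, the cell ID `C1042`, the date in the sample row and the meaning of the direction code are all assumptions made for illustration; the slides do not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallDetailRecord:
    direction: int        # call direction code (meaning assumed, not given in the slides)
    calling_party: str    # pseudonymized identifier of the caller
    called_party: str     # pseudonymized identifier of the callee
    cell_id: str          # base station (cell) that served the event
    call_time: datetime   # start time of the event
    duration: timedelta   # call duration


def parse_cdr(line: str) -> CallDetailRecord:
    """Parse one comma-separated CDR line (hypothetical layout)."""
    direction, caller, callee, cell_id, start, length = line.strip().split(",")
    hours, minutes, seconds = (int(part) for part in length.split(":"))
    return CallDetailRecord(
        direction=int(direction),
        calling_party=caller,
        called_party=callee,
        cell_id=cell_id,
        call_time=datetime.strptime(start, "%Y-%m-%d %H:%M:%S"),
        duration=timedelta(hours=hours, minutes=minutes, seconds=seconds),
    )


# Hypothetical sample row: the cell ID and the date are made up for illustration.
record = parse_cdr("1,A24BC1571X,B321SG141X,C1042,2013-01-01 10:42:14,00:03:35")
```

Keeping the timestamp and duration as `datetime` and `timedelta` values, rather than strings, makes the time-window checks used in the later mobility steps straightforward.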

Mobile network big data + other data → rich, timely insights
Mobile network big data (CDRs, Internet access usage, airtime recharge records)
Construct behavioral variables
(i) Mobility variables
(ii) Social variables
(iii) Consumption variables
Other data sources
(i) Census data
(ii) HIES data
(iii) Survey maps
(iv) Transportation schedules
(v) ++++
Analytics
Insights
(i) Urban & transportation planning
(ii) Socio-economic monitoring
(iii) Crisis management & DRR
(iv) Health monitoring & planning
(v) Financial inclusion

Congestion: an emerging issue in Sri Lanka
Congestion is expensive. It results in
– Wasted fuel
– Wasted time for commuters
– Loss of productivity for businesses (delivery/production)
In 2011, the cost of congestion in the Western Province was approximately LKR 32 billion (USD 285 million), an average of LKR 10,000 per person per year
– Source: Kumarage (2011)

Traditional Transport Forecasting
Focus on data from travel and land use surveys and census
A set of steps that converts this data into predictive transport models for forecasting using analytical methods (Four Step Approach)
Four Step Approach
– Trip Generation: predict the volume of trip generations and attractions at a traffic analysis zone.
– Trip Distribution: predict the volume of trips between pairs of traffic analysis zones.
– Mode Choice: predict the likelihoods of trips between locations being undertaken by different modes of travel, such as private/public transport.
– Route Assignment: assign estimated trips to the road network for traffic estimation.

MNBD has numerous advantages over traditional data collection
– Inexpensive
– Supports frequent updates to forecasts
– Greater spatial/temporal detail
– Capable of providing insights at multiple stages of traditional forecasting
– Forecasts for areas currently not covered under traditional approaches at no additional cost

Estimating Origin-Destination Matrices with MNBD
O-D matrices are the output of the “Trip Distribution” step (2nd step) of the traditional forecasting approach.
An O-D matrix is a matrix of person flows between traffic analysis zones.
In this research we used two approaches for capturing human mobility with MNBD
– Stay based approach
– Transient approach

Stay based O-D estimation
The daily trajectory of an individual can be imagined as composed of trips between locations where he/she is stationary for some meaningful amount of time.
We use the CDRs of each individual to identify stays, each consisting of a geographical location associated with a specific time period during which the individual was stationary.

Stay based O-D estimation (cont.)
In terms of the CDRs for an individual, a stay is identified by a continuous series of records such that
– Two contiguous records in the series are less than a distance D apart, where D = 1 km.
– Two contiguous records are separated by a time interval T_interval such that 10 minutes ≤ T_interval ≤ 1 hour.
Each pair of consecutive stays for an individual for a day is considered the origin and destination of a trip.
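
The stay rule on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each record carries the coordinates of its serving base station, uses the haversine great-circle distance, and treats a stay as any chain of two or more records. The `Event` type, the function names and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

MAX_DISTANCE_KM = 1.0              # D from the slide
MIN_GAP = timedelta(minutes=10)    # lower bound on T_interval
MAX_GAP = timedelta(hours=1)       # upper bound on T_interval


@dataclass(frozen=True)
class Event:
    time: datetime
    lat: float   # latitude of the serving base station (assumed to be known)
    lon: float   # longitude of the serving base station (assumed to be known)


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def same_stay(a: Event, b: Event) -> bool:
    """True if two contiguous records satisfy both constraints from the slide."""
    return haversine_km(a, b) < MAX_DISTANCE_KM and MIN_GAP <= b.time - a.time <= MAX_GAP


def find_stays(events):
    """Group the time-sorted events of one user for one day into stays."""
    stays, current = [], []
    for event in events:
        if current and not same_stay(current[-1], event):
            if len(current) >= 2:      # a single record is not treated as a stay (assumption)
                stays.append(current)
            current = []
        current.append(event)
    if len(current) >= 2:
        stays.append(current)
    return stays


def stays_to_trips(stays):
    """Each pair of consecutive stays is the origin and destination of one trip."""
    return list(zip(stays, stays[1:]))


# Hypothetical day: two records at one tower, then two records at a distant tower.
day = [
    Event(datetime(2013, 1, 1, 8, 0), 6.9271, 79.8612),
    Event(datetime(2013, 1, 1, 8, 20), 6.9271, 79.8612),
    Event(datetime(2013, 1, 1, 9, 30), 7.0000, 80.0000),
    Event(datetime(2013, 1, 1, 9, 50), 7.0000, 80.0000),
]
stays = find_stays(day)
trips = stays_to_trips(stays)
```

How a stay is reduced to a single location (for example the centroid of its towers or the most frequent tower) is not stated in the slides and is left open here.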

Transient O-D estimation
● We identify individual “trips” by considering consecutive call/GPRS/SMS events with spatio-temporal constraints
● Trips are transient, i.e., they don’t necessarily represent actual trips of a person, but at the very least capture segments

Transient O-D estimation
● A trip is a pair of consecutive events for a user that involve a displacement and are more than 10 minutes and less than 1 hour apart
● This approximates actual trips
● This also minimizes false positives where consecutive events are served by different neighbouring towers while the user is stationary
● Trips are aggregated with base stations as origins and destinations
● Trips are aggregated daily and hourly as Origin-Destination (OD) matrices

Stay based approach vs. Transient O-D approach
Stay based approach
– Trips more aligned with actual travel
– Tighter spatio-temporal constraints result in a relatively small number of trips captured
– Spatial constraints partially mitigate localization errors
Transient O-D approach
– Trips are transient; results more appropriate for flow analysis
– More mobility information extracted from MNBD
– Results are subject to localization errors, effective estimates for short trips

Validation
Flows estimated from 1 month of data for nearly 10 million SIMs were compared with the best validation data available at the time (trip generations for the Western Province)
Adjusted R²
– Stay based approach = 0.82
– Transient O-D approach = 0.85
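
For readers unfamiliar with the statistic, the sketch below shows one common way to compute an adjusted R². The slides do not state the regression form, so a simple linear regression of the validation values on the MNBD estimates is assumed here; the function name and the numbers are hypothetical.

```python
def adjusted_r_squared(observed, estimated, n_predictors=1):
    """Adjusted R² of a least-squares line fitted from estimated to observed values."""
    n = len(observed)
    mean_x = sum(estimated) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in estimated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimated, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(estimated, observed))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    r_squared = 1 - ss_res / ss_tot
    # Penalize for the number of predictors relative to the sample size.
    return 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)


# Hypothetical zone-level values, not data from the study.
survey_trip_generations = [2.0, 4.0, 7.0, 8.0]
mnbd_trip_estimates = [1.0, 2.0, 3.0, 4.0]
score = adjusted_r_squared(survey_trip_generations, mnbd_trip_estimates)
```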

Circadian rhythm of transport demand: weekly (Transient O-D approach)

Circadian rhythm of transport demand: daily (Transient O-D approach)

Mobility in the Western Province on a regular weekday

Mobility visualization for Colombo District identifies transport corridors
Source: COMTRANS report, 2013, Ministry of Transport

Transport forecasting with MNBD: challenges and possible solutions
Localization errors due to the same area being served by multiple base stations at different times
– Decrease spatial resolution
– Identify and filter unrealistic movement
Data sparsity (90% of subscribers have fewer than 25 records daily)
– Interpolation techniques
– Probabilistic models of mobility (e.g., HMM)
Variation of base station densities by region
– Adjust resolution of analysis; this limits the types of analyses possible by region
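
One simple way to "identify and filter unrealistic movement" is a speed check between consecutive records, sketched below. The slides do not say which filter was used; the 150 km/h threshold, the function names and the sample coordinates are assumptions chosen only for illustration.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

MAX_SPEED_KMH = 150.0   # assumed plausibility threshold, not taken from the slides


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def filter_unrealistic(events):
    """events: time-sorted (timestamp, lat, lon) tuples for one user.

    Drops any event that would require travelling faster than MAX_SPEED_KMH
    from the last event that was kept.
    """
    kept = []
    for event in events:
        if kept:
            t0, lat0, lon0 = kept[-1]
            t1, lat1, lon1 = event
            hours = (t1 - t0).total_seconds() / 3600
            distance = haversine_km(lat0, lon0, lat1, lon1)
            if distance > 0 and (hours <= 0 or distance / hours > MAX_SPEED_KMH):
                continue
        kept.append(event)
    return kept


# Hypothetical records: the middle one implies a jump of roughly 90 km in one minute.
raw = [
    (datetime(2013, 1, 1, 8, 0), 6.9271, 79.8612),
    (datetime(2013, 1, 1, 8, 1), 7.2906, 80.6337),
    (datetime(2013, 1, 1, 8, 2), 6.9271, 79.8612),
]
cleaned = filter_unrealistic(raw)
```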

Transport forecasting with MNBD: challenges and possible solutions
Sampling bias: bias towards activity, mobile penetration
– Adjusting for mobile penetration by using resident mobile users and census population
– Survey relating mobility to calling behavior (Shibasaki lab., U of Tokyo)
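
The penetration adjustment mentioned on this slide can be illustrated with per-zone expansion factors, as in the sketch below. It assumes that each user has already been assigned a home zone and that every trip is weighted by the ratio of census population to resident mobile users in that zone; the slides do not give the exact procedure, and all names and numbers here are hypothetical.

```python
from collections import defaultdict


def expansion_factors(census_population, resident_users):
    """Census population divided by resident mobile users, per zone."""
    return {
        zone: census_population[zone] / users
        for zone, users in resident_users.items()
        if users > 0
    }


def weighted_od(trips, home_zone, factors):
    """trips: iterable of (user_id, origin_zone, destination_zone).

    Each trip counts as many persons as its user represents in the user's
    home zone (assumed weighting scheme).
    """
    od = defaultdict(float)
    for user, origin, destination in trips:
        od[(origin, destination)] += factors[home_zone[user]]
    return dict(od)


# Hypothetical inputs.
census = {"Z1": 1000, "Z2": 500}
residents = {"Z1": 500, "Z2": 100}
homes = {"u1": "Z1", "u2": "Z2"}
observed = [("u1", "Z1", "Z2"), ("u2", "Z1", "Z2"), ("u2", "Z2", "Z1")]

factors = expansion_factors(census, residents)
scaled = weighted_od(observed, homes, factors)
```

This corrects only for uneven mobile penetration across zones; the bias towards more active callers noted on the slide would need a separate adjustment.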

Reproducibility
MNBD is private data owned by operators who have concerns about sharing
– Competitive value
– Privacy of subscribers
Data used in research
– Provided with strict non-disclosure conditions
– No demographic information
– Bare bones CDR
In developed countries,
– Sharing/selling normalized aggregate mobility and other information (based on analysis) generated in house or by an intermediary (e.g., Airsage)

Current and future work
– Estimating O-D based on regular mobility
– Predicting travel motives using land usage data
– Understanding the impact of shocks and crises on mobility
– Predicting infectious disease propagation