Presentation on theme: "Statewide Real-time Data Hub Update Presented by Marullus Williams"— Presentation transcript:
1 Statewide Real-time Data Hub Update Presented by Marullus Williams April 19, 2012
2 Background
The transit community in Virginia is looking at transit traveler information and is discussing standards, trends and applications.
ITS Virginia, along with the Virginia Department of Rail and Public Transportation (DRPT), led an effort to create a technology community for transit operators statewide.
To accomplish the task, a working group was formed to:
- Discuss, develop and promote the use of transit technology standards
- Act as a resource for the exchange of ideas and general technology discourse
- Promote the systems engineering process for the development, procurement and deployment of transit ITS projects
The working group is an open group for anyone interested in transit technology.
3 2011 ITS Program Update Survey
ITS participants rated the needs of transit operators.
*Survey conducted in 2010 and published in the 2011 ITS Program Update.
4 Background
Transit traveler information is primarily in two forms:
- Static data: most static transit data is now available in electronic form, provided in:
  - Trip planners via the web
  - Standard formats like GTFS
- Real-time information: some larger transit providers now provide it in:
  - Various forms via web tools and applications
  - Various formats
There are nearly 30 transit operators supported by State funds in Virginia.
There are over a dozen urban, small urban and rural transit agencies engaged in this traveler information discussion in Virginia.
Virginia has an active community discussing transit ITS issues through the support of ITSVA and the Commonwealth.
5 Approach
The working group is interested in making real-time and historical data available to the public and to third-party developers in order to:
- Improve passenger information
- Improve government transparency
- Improve multimodal transportation options
A reasonable approach is to use the standards working group to define Virginia transit traveler information goals and to leverage the work and approach undertaken by WMATA, Blacksburg and other national leaders.
The potential benefits of this approach include:
- Strengthening of standards-based sharing
- Out-of-the-box interoperability
- Cost efficiencies to agencies by leveraging the existing investment
6 Progress
The working group has met numerous times over a number of months to discuss:
- Agreements and potential public policy
- Standard formats, including the possible creation of a standard real-time data format for Virginia transit agencies to follow
- Lead and participating agencies
- Hosting locations for static and real-time data
Lead efforts:
- Recently, WMATA released real-time and historical data to the public through a standards-based application programming interface (API) built on very inexpensive, commercially available cloud computing technologies. This API has been extremely well received by the transit, software, and passenger communities.
- Blacksburg Transit, in the Virginia Tech community, has also provided its real-time transit data through an open API.
In August 2011, DRPT sponsored the development of a Concept of Operations (ConOps) to guide the development of the Statewide Real-time Data Hub.
The ConOps was completed in November 2011.
Future implementation plans are yet to be determined.
8 ConOps Study Team
Washington, DC-based SIRC created the ConOps.
Team members:
- Jamey Harvey, Subject Matter Expert
- Marullus Williams, Project Manager
- Kunmi Ayanbule, Technical Architect
9 Important Notes
- These recommendations are currently being reviewed by DRPT.
- No decisions have been made on how to proceed with these recommendations.
- This presentation is only an update on the findings presented to DRPT by the ConOps Team.
10 The Regional API Concept
A decision was needed on the approach to the “Regional” API.
Is the goal to facilitate each transit agency’s ability to publish its own standards-based API, or is the idea to have all of the regional data fed into a regional API?
If each local agency publishes its own API, the regional responsibility would be to ensure that each agency is truly standards-based and interoperable. There would be a regional directory of agency API feeds, but each agency would be responsible for building and maintaining its own API.
There could be a hybrid approach in which the Regional API would aggregate data from each local API. The Regional API would in essence become a consumer of each local API’s data and in turn provide that information to the public.
8/8/2011
11 Four Regional Approaches
[Diagram labels: Local Transit Agency A, Local Transit Agency B, API, Regional Directory of API Feeds, Regional Data Store and API, Developer Community / Public Data Consumers / Transit Agencies]
1: Direct from Local API to Developer with Regional Directory
2: Local API to Regional API
3: All data aggregated and served at regional level
4: Hybrid approach of 1-3
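To make the "Local API to Regional API" idea more concrete, here is a minimal Python sketch of a regional layer acting as a consumer of local feeds. It is only an illustration under stated assumptions: the `LocalFeed` type, the `aggregate_feeds` function, the record fields and the two stand-in feeds are all hypothetical and do not come from the ConOps.

```python
from typing import Callable, Dict, List

# Hypothetical type: a local feed is any callable that returns a list of records.
LocalFeed = Callable[[], List[dict]]


def aggregate_feeds(local_feeds: Dict[str, LocalFeed]) -> List[dict]:
    """Pull records from each local agency feed and tag them with the agency name."""
    combined: List[dict] = []
    for agency, fetch in local_feeds.items():
        try:
            records = fetch()
        except Exception:
            # One unavailable local API should not take down the regional feed.
            continue
        for record in records:
            combined.append({**record, "agency": agency})
    return combined


# Stand-in feeds (assumptions); a real deployment would call each agency's API.
def agency_a_feed() -> List[dict]:
    return [{"vehicle_id": "A-101", "route": "1"}]


def agency_b_feed() -> List[dict]:
    raise ConnectionError("local API unavailable")


regional = aggregate_feeds({"Agency A": agency_a_feed, "Agency B": agency_b_feed})
print(regional)
```

In this sketch a failing local feed is skipped so the regional response still includes the agencies that did answer; whether a real hub should skip, retry or serve stale data is a policy choice the slides leave open.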
12 Stakeholder Interview Topics
- Describe your agency’s organizational structure.
- Provide details on your transportation system:
  - Modes of travel
  - Types of service schedules
  - Number of riders
  - Number and types of incidents
  - Geographic range
  - Interconnections with other systems
- Describe your current IT environment.
- Describe your current transportation technology.
- What data does your organization currently make available to the public (including paper-based info)?
- Which data elements would be easiest to provide to the public via the API?
- What is your role in providing real-time data for the API?
13 Agencies Interviewed
- Fredericksburg: Arnold Levine
- PRTC/Omniride: Doris Chism, Ryan Jones and Eric Marx
- Williamsburg/James-City: Kevan Danker
- Blacksburg: Tim Witten and Aneil Samuel
- Arlington: Bee Buergler and Tom Scherer
- VDOT: Scott Cowherd and Noah Goodall
- Loudoun County: Scott Gross
- University of MD (RITIS): Michael Pack
14 Key Interview Findings
- All local agencies have data available to create static GTFS feeds.
- With the exception of WMATA and Hampton Roads, all agencies have bus service only.
- Most agencies do not have dedicated information technology departments. They are heavily dependent upon city/county IT resources or outside contractors.
- Many agencies have recently procured AVL technology, are currently evaluating it, or will soon issue RFPs for it.
- All agencies are interested in participating in the real-time API.
- The three most critical issues facing local agencies in providing real-time data:
  - Integrating information from the many disparate transit systems that are in place within each agency
  - Encouraging vendors to provide data in an open, standards-based format
  - Obtaining technical help given the lack of IT resources within most transit agencies
- Agencies need guidance from the real-time API team on how to ensure AVL vendors provide data in the proper format.
- Real-time and static data collection regionally is needed as much for transit planning purposes as for creation of public-facing applications. The scope of this project is to develop the real-time API, not the data warehouse specifications.
- RITIS is an important stakeholder in the API development. Most agencies underscored the importance of ensuring that providing data to the API and providing data to RITIS are as similar as possible.
18 Rolling Out a Successful API Project
- Agreeing on the data sets to be published
- Implementing a standards-based approach
- Connecting all required data elements to the API
- Creating a fast, reliable infrastructure by leveraging cloud services and API-specific solutions like Mashery
- Publicizing the API
- Communicating regularly with the developer community
- Building an API forum / community using tools such as Facebook and Twitter
- Managing updates to the API. Good documentation is key.
- Identifying and managing all legal, policy and security risks
- Monitoring the use of transit data by developers and the public
19 Local Agency Data Collection
It is the responsibility of the participating local agencies to integrate the required data and provide a location (or locations) within each agency’s infrastructure for retrieving the data required for the API.
For the data to be collected consistently and uniformly from each local agency, all local data must be formatted as defined in the API specification.
The data must be made available via CSV files, XML files or Excel spreadsheets. Depending on the type of data contained in each file, the data will be updated by the local transit agency and provided to the VTA at varying frequencies.
The data retrieval layer will be built within the VTA infrastructure (whether cloud or on-premises).
To support agencies with varying technology infrastructures, the data retrieval layer will offer both a push and a pull service.
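As a rough illustration of the push and pull services sharing one validation path, the Python sketch below parses an agency CSV file and checks it against a list of required columns. The column names in `REQUIRED_FIELDS`, the function names and the sample row are assumptions made for this example; the actual required fields would be whatever the API specification defines.

```python
import csv
import io
from typing import Callable, Iterable, List

# Hypothetical required columns; the real list would come from the API specification.
REQUIRED_FIELDS = ("vehicle_id", "route_id", "latitude", "longitude", "timestamp")


def parse_agency_csv(text: str, required: Iterable[str] = REQUIRED_FIELDS) -> List[dict]:
    """Parse CSV text and reject files that lack any required column."""
    reader = csv.DictReader(io.StringIO(text))
    present = reader.fieldnames or []
    missing = [name for name in required if name not in present]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return [dict(row) for row in reader]


def handle_push(payload: str) -> List[dict]:
    """Push service: the agency sends its file to the hub."""
    return parse_agency_csv(payload)


def handle_pull(fetch: Callable[[], str]) -> List[dict]:
    """Pull service: the hub retrieves the file from a location the agency provides."""
    return parse_agency_csv(fetch())


sample = (
    "vehicle_id,route_id,latitude,longitude,timestamp\n"
    "A-101,1,37.2296,-80.4139,1334836800\n"
)
pushed = handle_push(sample)
pulled = handle_pull(lambda: sample)
```

Routing both services through the same parser is one way to keep the data uniform regardless of how an agency chooses to deliver it.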
21 Database Considerations
Data from the local transit agencies must be stored in the VTA database. The ConOps OV-1 illustrates the need for Data Translation and Integration to accommodate any semantic or syntactic differences in data collected from the regional transit agencies.
The VTA Database is intended to be temporary storage holding current data and limited historical data. For example, the Database can keep four hours of transit data, after which that data will be pushed to a data warehouse.
The Database will be based on a real-time database system or an in-memory data store. To improve scalability, several traffic management, rate-limiting and time-sensitive data caching strategies will be implemented. Caching will reduce the latency between HTTP requests to the application server and the fetching of data feeds.
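The retention and caching ideas above can be sketched in a few lines of Python. This is a sketch under assumptions, not the ConOps design: `RollingStore`, `TTLCache`, the list used as a stand-in warehouse and the injectable clock are all hypothetical names chosen for illustration, and the four-hour figure is simply the example from the slide.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple

RETENTION_SECONDS = 4 * 60 * 60  # the four-hour example from the slide


class RollingStore:
    """Keeps recent records in memory and hands older ones to a warehouse sink."""

    def __init__(self, retention: int = RETENTION_SECONDS) -> None:
        self.retention = retention
        self.records: Deque[Tuple[float, dict]] = deque()
        self.warehouse: List[dict] = []  # stand-in for a real data warehouse

    def add(self, timestamp: float, record: dict) -> None:
        # Assumes records arrive in timestamp order.
        self.records.append((timestamp, record))
        self.evict(now=timestamp)

    def evict(self, now: float) -> None:
        cutoff = now - self.retention
        while self.records and self.records[0][0] < cutoff:
            _, old = self.records.popleft()
            self.warehouse.append(old)

    def current(self) -> List[dict]:
        return [record for _, record in self.records]


class TTLCache:
    """Serves a stored value until it is older than ttl_seconds, then reloads it."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = loader()
        self._entries[key] = (now, value)
        return value


store = RollingStore()
store.add(0, {"id": 1})
store.add(3600, {"id": 2})
store.add(5 * 3600, {"id": 3})  # record 1 is now older than four hours
```

After the third `add`, the first record has moved to the warehouse list while the other two remain current. A short time-to-live on the cache trades a little staleness for fewer fetches, which is the latency benefit the slide describes.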
23 API Assembly Layer
The Feed Assembly Layer packages data that will be provided to Data Consumers. The interface to this layer will be an HTTP-based REST protocol, which will respond in one of the supported output formats: SIRI and GTFS/GTFS-RT.
This layer will have specific modules for converting to XML, Protobuf and JSON formats depending on the request. Protocol Buffers (Protobuf) is a binary format used by GTFS-RT and is a flexible, efficient, automated mechanism for serializing structured data. It is smaller, faster, and simpler than XML. JSON is also a small-footprint format that is simpler and less verbose than XML and is widely used by application developers.
JSON is not natively supported by SIRI and GTFS-RT, but the API Assembly Layer will be able to produce JSON-formatted responses based on the structure of GTFS-RT.
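A minimal Python sketch of format-specific output modules follows. It builds one vehicle position as a nested dictionary whose keys follow GTFS-RT message field names (`header`, `entity`, `vehicle`, `trip`, `position`), then renders it as JSON or as generic XML. The function names, the sample values and the `FeedMessage` root element are assumptions for illustration; the XML produced here is not SIRI-conformant, and Protobuf output is left out because it would need the compiled GTFS-RT bindings.

```python
import json
import xml.etree.ElementTree as ET


def build_feed(vehicle_id: str, trip_id: str, route_id: str,
               lat: float, lon: float, timestamp: int) -> dict:
    """Build a dictionary shaped like a GTFS-RT feed with one vehicle position."""
    return {
        "header": {"gtfs_realtime_version": "2.0", "timestamp": timestamp},
        "entity": [{
            "id": vehicle_id,
            "vehicle": {
                "trip": {"trip_id": trip_id, "route_id": route_id},
                "position": {"latitude": lat, "longitude": lon},
                "timestamp": timestamp,
                "vehicle": {"id": vehicle_id},
            },
        }],
    }


def _append(parent: ET.Element, key: str, value) -> None:
    """Recursively add dictionaries, lists and scalars as child elements."""
    if isinstance(value, dict):
        child = ET.SubElement(parent, key)
        for child_key, child_value in value.items():
            _append(child, child_key, child_value)
    elif isinstance(value, list):
        for item in value:
            _append(parent, key, item)
    else:
        ET.SubElement(parent, key).text = str(value)


def to_xml(feed: dict) -> str:
    root = ET.Element("FeedMessage")
    for key, value in feed.items():
        _append(root, key, value)
    return ET.tostring(root, encoding="unicode")


def render(feed: dict, fmt: str) -> str:
    """Pick the output module based on the requested format."""
    if fmt == "json":
        return json.dumps(feed)
    if fmt == "xml":
        return to_xml(feed)
    raise ValueError(f"unsupported format: {fmt}")


feed = build_feed("A-101", "trip-7", "1", 37.2296, -80.4139, 1334836800)
as_json = render(feed, "json")
as_xml = render(feed, "xml")
```

Keeping a single internal structure and converting at the edge means adding another output format only requires one more branch in `render`.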
24 API Management
An API management tool like Mashery would provide the following benefits to the API:
- Eliminates the need to internally develop API gatekeeping functionality
- Well supported and currently employed by WMATA, Best Buy, Netflix, CNET and others to support publication of APIs for third-party developer use
- Provides API registration, access and self-service provisioning
- Provides key issuance and credential management
- Allows usage control: throttling and limiting tied to key, user, method or group
- Caches frequently used calls
- Supports business rules configuration based on filters, parameters, and methods
- Provides real-time insight into all activity, with data export available for independent analysis
- Provides reports that measure uptime, track errors, and show cache activity
- Provides API usage information including call volumes, top method calls, and top user activity
- Includes content management, versioning and documentation change control
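To illustrate what "throttling and limiting tied to key" means in practice, here is a small Python sketch of a fixed-window limiter keyed by API key. It is not how Mashery or any other product implements throttling; the class name, the limits and the injectable clock are assumptions made for this example.

```python
import time
from typing import Callable, Dict, Tuple


class KeyRateLimiter:
    """Fixed-window limiter: at most `limit` calls per API key per window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # Maps each API key to (window start time, calls counted in that window).
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, api_key: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(api_key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[api_key] = (start, count)
            return False
        self._windows[api_key] = (start, count + 1)
        return True


# Usage with a fake clock so the behaviour is easy to follow.
fake_now = [0.0]
limiter = KeyRateLimiter(limit=2, window_seconds=60, clock=lambda: fake_now[0])
results = [limiter.allow("key-1") for _ in range(3)]  # third call is refused
other_key = limiter.allow("key-2")                    # separate key, separate budget
fake_now[0] = 60.0
after_window = limiter.allow("key-1")                 # new window, allowed again
```

Because limits are tracked per key, one heavy consumer exhausts only its own budget and other developers are unaffected.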
25 Portal
The Portal must provide information and documentation for Data Consumers.
The term “Data Consumers” refers to computer applications (and the users of those applications) that retrieve data via the VTA. The most popular applications that will use the API data can generally be divided into the following two types:
- Traveler applications built for desktop, web and mobile platforms
- Transit agency operations and planning applications that leverage the data to improve the safety, efficiency, and customer satisfaction of transit operations
26 Third Party Developer Portal
The User Community
- 400+ developers have registered
- 380,000+ successful API calls per week
27 WMATA Signboard Example The User Community – Window Unit
28 Data Set Definition
- Standard data sets foster subsystem and multi-agency communication.
- Proprietary formats can be restrictive or cost-prohibitive to convert to a non-proprietary format.
- The national trend is for transit agencies and others to make static and real-time information openly available to developers at no charge.
- Information clearinghouses like the Regional Integrated Transportation Information System (RITIS) and VA 511 can also be data receivers.
- Google's transit data standard, the General Transit Feed Specification (GTFS), has emerged as a national standard for static information and for the most part is the standard in Virginia.
- Real-time data standards have yet to formally emerge.
- The working group reviewed existing local data formats, including:
  - Washington Metropolitan Area Transit Authority (WMATA) real-time data format
  - SIRI: transit-specific, highly extensible
  - Virginia Tech Bus Tracker
29 GTFS
The General Transit Feed Specification (GTFS) defines a common format for public transportation schedules and associated geographic information. GTFS-RT is a feed specification that allows public transportation agencies to provide real-time updates about their fleet to application developers.
GTFS advantages:
- Supported by Google. Google provides significant marketing resources for publicizing the availability of agencies’ GTFS data feeds.
- Easy for agencies to adopt the standard and quickly display data via the popular Google Maps service
- Robust online documentation and forums to provide support to transit agencies
- Free to connect to GTFS
- Many transit technology vendors have adopted GTFS
- There is a large community of developers familiar with Google’s API specifications
GTFS disadvantages:
- Completely dependent upon Google’s support; if Google ceases support for GTFS, the standard would be in jeopardy of obsolescence
- Google does not provide access to raw data that it collects from agencies
- Must agree to Google’s inflexible legal terms regarding indemnification
30 SIRI
SIRI (Service Interface for Real Time Information) is managed by a CEN Working Group, TC278 WG3 SG7. SIRI allows pairs of server computers to exchange structured real-time information about schedules, vehicles, and connections, together with general informational messages related to the operation of the services. The information can be used for many different purposes, for example:
- To provide real-time departure-from-stop information for display on stops, internet and mobile delivery systems
- To provide real-time progress information about individual vehicles
- To manage the movement of buses roaming between areas covered by different servers
- To manage the synchronization of guaranteed connections between fetcher and feeder services
- To exchange planned and real-time timetable updates
- To distribute status messages about the operation of the services
- To provide performance information to operational history and other management systems
SIRI advantages:
- Vendor-neutral standard
- Supports significantly more data elements than GTFS
- Widely used internationally
- Extensible; agencies can create their own custom data fields
SIRI disadvantages:
- Complex to implement
- Not used as much in the US as in Europe
31 Proposed API Technical Specification
The API will provide data access via three interfaces: SIRI, GTFS-RT and GTFS.
Only data elements that are part of a standard can be delivered via that standard’s interface.
The goal is to have SIRI provide access to all data elements.
Data elements are organized by:
- Mode of Transportation (Bus, Rail)
- Information Type (Static, Real-time, Support)
- Data Category (groups similar information, e.g., Agency Information, Stop Information, Route Information)
- Data Element (defines individual data elements available via the API)
The following information is provided for each VTA Data Element:
- VTA Name: The unique name assigned by VTA for each Data Element. Participating local agencies will provide data to the API using the VTA names.
- Description: Explains the information provided
- VTA Data Type: The data type required by VTA for local agencies to provide the Data Element
- Transmodel/SIRI Equivalent: The SIRI name that Data Consumers will use to access the Data Element
- Transmodel/SIRI Module Source: The SIRI module in which Data Consumers will find the Data Element
- GTFS-RT Equivalent: The GTFS-RT name that Data Consumers will use to access the Data Element
- GTFS-RT Module Source: The GTFS-RT module in which Data Consumers will find the Data Element
- GTFS Equivalent: The GTFS name that Data Consumers will use to access the Data Element
- GTFS Module Source: The GTFS module in which Data Consumers will find the Data Element
32 Datapoints
Mode: Bus and Rail
Static / RT: Static
Datapoints by category:
- Agency Information: Agency Identifier, Agency Name, Agency URL, Timezone, Fare URL
- Stop Information: Stop identifier, Stop code, Name of stop, Stop description, Latitude, Longitude, Zone identifier, Stop URL, Type of location, Parent station identifier
- Route Information: Route Identifier, Short name, Long name, Description, Type of route
- Trip Information: Service Identifier, Trip identifier, Headsign text, Direction, Block, Shape
- Stop times: Arrival time, Departure time, Stop sequence, Pickup type, Drop off type, Shape distance traveled
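33 Category: Agency Information
Table columns: Datapoint, Data Type, TransModel/SIRI Equivalent, TransModel/SIRI Module Source, GTFS-RT Equivalent, GTFS-RT Module Source, GTFS Equivalent, GTFS Module Source, Description
Agency Identifier
  Data Type: uint64
  TransModel/SIRI Equivalent: authorityID (Module Source: Authority)
  GTFS-RT Equivalent: -
  GTFS Equivalent: agency_id (Module Source: Agency)
  Description: This field is an ID that uniquely identifies a transit agency.
Agency Name
  Data Type: String
  TransModel/SIRI Equivalent: authorityName
  GTFS Equivalent: agency_name
  Description: The full name of the transit agency
Agency URL
  TransModel/SIRI Equivalent: authorityURL
  GTFS Equivalent: agency_url
  Description: This field contains the URL of the transit agency. Example:
Timezone
  TransModel/SIRI Equivalent: authorityTimezone
  GTFS Equivalent: agency_timezone
  Description: The timezone where the transit agency is located. Example: UTC+02
Language
  TransModel/SIRI Equivalent: authorityLang
  GTFS Equivalent: agency_lang
  Description: This field contains a two-letter ISO code for the primary language used by this transit agency. Example: EN
Phone Number
  TransModel/SIRI Equivalent: authorityPhone
  GTFS Equivalent: agency_phone
  Description: The agency's phone number
Fare URL
  GTFS Equivalent: agency_fare_url
  Description: This specifies the URL of a web page that allows a rider to purchase tickets or other fare instruments for that agency online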
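One way to picture how such a mapping could drive the API is a small lookup table that renames VTA-named fields to the names of a given interface. The Python sketch below uses the Agency Information names shown on this slide; the `DataElement` class, the `project` function and the sample record values are hypothetical, and the sketch treats a missing equivalent as "not deliverable via that interface", which is an interpretation of the rule that only elements belonging to a standard can be served through it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataElement:
    vta_name: str
    siri_name: Optional[str]  # None when the slide lists no SIRI equivalent
    gtfs_name: Optional[str]  # None when the slide lists no GTFS equivalent


# Names taken from the Agency Information table on this slide.
AGENCY_ELEMENTS: List[DataElement] = [
    DataElement("Agency Identifier", "authorityID", "agency_id"),
    DataElement("Agency Name", "authorityName", "agency_name"),
    DataElement("Agency URL", "authorityURL", "agency_url"),
    DataElement("Timezone", "authorityTimezone", "agency_timezone"),
    DataElement("Language", "authorityLang", "agency_lang"),
    DataElement("Phone Number", "authorityPhone", "agency_phone"),
    DataElement("Fare URL", None, "agency_fare_url"),
]


def project(record: Dict[str, str], interface: str) -> Dict[str, str]:
    """Rename VTA-named fields for one interface, dropping elements it lacks."""
    attribute = {"siri": "siri_name", "gtfs": "gtfs_name"}[interface]
    projected: Dict[str, str] = {}
    for element in AGENCY_ELEMENTS:
        target = getattr(element, attribute)
        if target is not None and element.vta_name in record:
            projected[target] = record[element.vta_name]
    return projected


# Sample record with made-up values, keyed by VTA names.
record = {
    "Agency Identifier": "42",
    "Agency Name": "Example Transit",
    "Fare URL": "https://example.org/fares",
}
siri_view = project(record, "siri")
gtfs_view = project(record, "gtfs")
```

Here the SIRI view omits the fare URL because the table shows no SIRI equivalent for it, while the GTFS view carries it as `agency_fare_url`.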
34 Future Considerations
- Finalize phased plan for rollout of the real-time data hub.
- Who will build and manage the infrastructure?
- What type of governance will be implemented?
- How will local agencies obtain the funding and technical support required to connect to the data hub?