Artificial Realistic Data (ARD)


1 Artificial Realistic Data (ARD)
Daniel Carter UNC Highway Safety Research Center Presented at the Traffic Records Forum, August 6, 2017 New Orleans, LA

2 Background
Accurate Crash Modification Factors (CMFs) are very important in roadway safety programs
Used in determining whether or not to apply a given treatment to a given roadway segment or location
What is a CMF? A multiplicative factor used to compute the expected number of crashes after implementing a given countermeasure at a specific site.
CMF for treatment = 0.8
Expected crashes at site without treatment = 100
Expected crashes at site with treatment = 100 x 0.8 = 80
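The arithmetic on this slide can be written as a short Python sketch. The function name is illustrative and not part of the presentation; the numbers are the ones on the slide.

```python
def expected_crashes_with_treatment(expected_without: float, cmf: float) -> float:
    """Apply a multiplicative CMF to the expected crash count without treatment."""
    if cmf < 0:
        raise ValueError("a CMF cannot be negative")
    return expected_without * cmf


# Slide values: CMF = 0.8, expected crashes without treatment = 100.
after = expected_crashes_with_treatment(100, 0.8)
print(f"Expected crashes with treatment: {after:.0f}")
```

A CMF below 1 means the treatment is expected to reduce crashes, and a CMF above 1 means it is expected to increase them.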

3 Difficulties in Developing CMFs
Before/after Cross sectional

4 Difficulties in Developing CMFs
Before-after studies: Generally agreed to provide reliable CMFs
Cross-sectional studies: Use regression models: y = exp(b0 + b1·Lanes + b2·SW + b3·MW + …)
Good model will predict outcome (“y”) well
But can each coefficient be used to develop a good CMF? Unfortunately, “Maybe/Probably not.”
And we don’t know how good it is! We have no knowledge of “underlying true relationships” – no “truth” to compare to.
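To make the question on this slide concrete, here is a minimal sketch of how a coefficient from a log-linear model is commonly turned into a CMF: changing one variable by some amount multiplies the prediction by exp(coefficient × change). All coefficient and feature values below are hypothetical, and the variable names Lanes, SW, and MW are simply carried over from the slide. Whether a CMF obtained this way reflects the true effect is exactly what the ARD is meant to test.

```python
import math


def predicted_crashes(coeffs: dict, features: dict) -> float:
    """Evaluate y = exp(b0 + sum(b_i * x_i)) for one segment."""
    linear = coeffs["intercept"] + sum(coeffs[name] * value for name, value in features.items())
    return math.exp(linear)


def implied_cmf(coefficient: float, change: float) -> float:
    """CMF implied by a log-linear coefficient for a given change in that variable."""
    return math.exp(coefficient * change)


# Hypothetical coefficients, for illustration only (not from the presentation).
coeffs = {"intercept": -1.0, "Lanes": 0.10, "SW": -0.05, "MW": -0.02}
base = {"Lanes": 2, "SW": 4, "MW": 0}
wider = dict(base, SW=6)  # same segment with SW increased by 2 units

ratio = predicted_crashes(coeffs, wider) / predicted_crashes(coeffs, base)
print(ratio, implied_cmf(coeffs["SW"], 2))  # the two values agree
```

The ratio of the two predictions equals the implied CMF because every other term cancels. That identity holds for the fitted model, not necessarily for the real world.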

5 Solution – ARD!
Build a dataset where “underlying true relationships” are known (i.e., built into the dataset)
Keep underlying relationships secret so that users don’t know what “truth” is
Build in randomness and realistic confounders
Researchers apply cross-sectional methods using the dataset and submit their results to be graded
How well does the method capture “underlying truth”? Which method(s) are better?

6 Solution – ARD! Thus, overall goal is to improve cross-sectional methods so that more reliable CMFs can be obtained The ARD is a tool (database) that, when used by analysts, will help us reach this goal

7 What can the ARD do?
ARD can be used to determine which cross-sectional procedures best define the relationship between a given roadway descriptor and the level of safety
Example: Which statistical technique best defines the effect on crash risk of increasing lane width?

8 What can the ARD NOT do? ARD cannot be used to learn more about the actual relationship between a roadway feature and the level of safety Example: What is the effect on crash frequency of increasing lane width from 11 to 12 feet?

9 In other words, not so much this…
Source: National Park Service

10 …as this Source:

11 ARD Components
Roadway generator: Generates homogeneous road segments with realistic characteristics
Crash generator: Embedded (secret) causal “rules” defining realistic assumptions about relationships between roadway characteristics and crashes
Computerized system to put the two together: Generate roadway file, attach crash counts to each segment, send combined file to user

12 ARD Components (cont.)
Method grading procedure: To determine how well the results capture the “true” relationships
Pilot test: To see if the computerized ARD system works and if common statistical approaches can be developed and graded

13 Roadway Generator
2400 miles of 0.02 mile segments
Based on data from Washington DOT (HSIS) inventory system plus additional data from other sources:
usRAP database
Reviews of Google Earth® data
Information from AASHTO Green Book
Weather data from NOAA
Speed relationships in the Highway Capacity Manual

14 Roadway Generator (cont.)
Assignment of a characteristic in ARD is often based on the Markov Chain principle, e.g., shoulder width on a given segment has some (high) probability of being the same as on the preceding segment, but there is also some (lower) probability that it will change.
Bottom line – a realistic set of characteristics was generated
Source: Florida DOT
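The Markov chain idea described above can be illustrated with a minimal Python sketch. The candidate widths, the stay probability of 0.95, and the function name are all assumptions chosen for illustration; the presentation does not give the actual transition probabilities used in the ARD.

```python
import random


def generate_shoulder_widths(n_segments, widths=(2, 4, 6, 8), p_stay=0.95, seed=0):
    """Generate one shoulder width per consecutive segment with a simple Markov chain.

    Each segment keeps the previous segment's width with probability p_stay,
    otherwise it switches to a different width chosen uniformly at random.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    rng = random.Random(seed)
    current = rng.choice(widths)
    profile = [current]
    for _ in range(n_segments - 1):
        if rng.random() >= p_stay:  # the lower-probability branch: width changes
            current = rng.choice([w for w in widths if w != current])
        profile.append(current)
    return profile


profile = generate_shoulder_widths(1000)
changes = sum(a != b for a, b in zip(profile, profile[1:]))
print(f"{changes} width changes across {len(profile)} segments")
```

Because the width usually carries over from one segment to the next, the output forms long runs of constant width. That resembles a real road more closely than drawing each segment independently.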

15 Crash Generator
Estimates crashes for four basic crash types:
“Genuine” single-vehicle (drift-from-lane) crashes
Single-vehicle crashes preceded by a multivehicle conflict
Multi-vehicle same-direction crashes (e.g., rear-end)
Multi-vehicle opposite-direction crashes (e.g., head-on)
Randomness in crashes per segment generated for each user’s data

16 How Are Crash Counts Generated?
For each crash type, a set of (secret) causal “rules” was defined, specifying how each roadway descriptor would affect a given crash type
E.g., how would driveway count affect the probability of a genuine SV crash?
Crash generator produces estimates of average expected crashes given the combination of roadway factors on a segment

17 Example of Crash Generation
Crash Type: Drift-from-lane Single Vehicle Crashes
Crash generator applies one of many causal “rules” (embedded realistic assumptions)
More Driveways → More Driver Attentiveness → Lower Probability of Drifting From Lane → Fewer Drift-from-Lane Crashes
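As a rough illustration of how a rule like this could feed a crash count, the sketch below computes an expected number of drift-from-lane crashes for a segment and then draws a random count around it. Every specific here is an assumption: the actual ARD rules are secret, so the multiplicative driveway factor, the base rate, and the use of a Poisson draw for the randomness are stand-ins chosen only to show the mechanics.

```python
import math
import random


def expected_drift_crashes(base_rate_per_mile, length_miles, driveways, driveway_factor=0.97):
    """Hypothetical rule: each driveway multiplies the expected count by a factor below 1."""
    return base_rate_per_mile * length_miles * driveway_factor ** driveways


def sample_poisson(mean, rng):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(42)  # a different seed gives each user different random counts
# Hypothetical 0.02-mile segment with 3 driveways and a base rate of 0.5 crashes per mile.
segment_mean = expected_drift_crashes(base_rate_per_mile=0.5, length_miles=0.02, driveways=3)
segment_count = sample_poisson(segment_mean, rng)
print(segment_mean, segment_count)
```

Separating the expected value (the "truth") from the random draw is what lets two users receive different crash counts for the same underlying relationships.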

18 Another Example Crash Type: Conflict-related Single Vehicle Crashes
Different sets of causal rules were developed for boxes A, B, D, and F. In summary, ARD is a very complex system. But based on realistic assumptions (causal rules).

19 Method Grading Procedure
Compare the results from the analysis with the safety relationships that were input into the system
Degree of success would be judged by summary statistics such as ‘bias’ or ‘variance’
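One plausible way to compute such summary statistics is sketched below. The presentation names only "bias" and "variance", so the exact definitions, the function name, and the example numbers are assumptions. The sketch supposes an analyst submits several CMF estimates (for example, one per randomized copy of the dataset) and that the built-in true CMF is known to the grader.

```python
from statistics import mean, pvariance


def grade_estimates(estimated_cmfs, true_cmf):
    """Summarize how closely a set of CMF estimates matches the known true CMF."""
    bias = mean(e - true_cmf for e in estimated_cmfs)  # average signed error
    variance = pvariance(estimated_cmfs)               # spread of the estimates
    return {"bias": bias, "variance": variance, "mse": bias ** 2 + variance}


# Hypothetical submissions for a treatment whose built-in true CMF is 0.88.
result = grade_estimates([0.85, 0.90, 0.95], true_cmf=0.88)
print(result)
```

The mean squared error combines the two measures, so a method can be penalized either for being systematically off or for being unstable across datasets.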

20 Pilot Test
One experienced modeler will use the 2,400 miles of data to develop regression models relating Genuine SV crashes to roadway characteristics
Their goal is to develop CMFs for lane and shoulder width
They are being asked to build at least one model
The model results will be graded

21 Possible Future Efforts?
Additional pilot testing using genuine SV crashes
Additional modeling with other crash types
Combination of modeling results into published information on “best cross-sectional procedures” (e.g., “best modeling practices”)
Expansion of ARD into other crash types (angle crashes) on multi-lane roads or at intersections…
Develop ARD for before-after study designs

22 Research Team
Co-PIs – Forrest Council and Raghavan Srinivasan (UNC HSRC)
Dr. Ezra Hauer (Univ of Toronto, retired) – Safety relationships
Doug Harwood (MRIGlobal) – Inventory data
Dr. Bo Lan (UNC HSRC) – SAS database
Anushapatel Nujjetty (LENDIS, HSIS) – data support
David Harkey (UNC HSRC) – Technical inputs

23 Questions and Discussion Contact: Raghavan Srinivasan srini@hsrc. unc

