Photo by Oran Viriyincy (CC-BY-NC-ND)
When a traffic collision occurs, the lives and families of those involved are changed forever by what follows: injuries, medical bills, property damage, insurance claims or, in the worst cases, loss of life. While Seattle has seen a 30 percent decline in traffic fatalities over the last decade, traffic collisions are still a leading cause of death for Seattle residents ages 5 to 24.
The city has made a commitment to Vision Zero, a global movement to reduce traffic fatalities and severe injuries to zero, and is now looking to determine the most effective policies and interventions to make streets safer. Building off our recent work with New York City in partnership with Microsoft Tech and Civic Engagement, we’re conducting an in-depth pedestrian and bicyclist safety study for the City of Seattle to empower local decision makers with potentially life-saving information.
To lay the groundwork for this study, over 20 data scientists volunteered at a DataDive in May, sponsored by Microsoft and hosted by the University of Washington’s eScience Institute. Using open collision, roadway and land use data, the volunteers worked with representatives from Seattle’s Department of Transportation on exploratory data analysis and modeling to estimate the volume of vehicles on roads, identify collision hotspots and explore predictors of future collisions.
Of the 10 models developed at the DataDive, this was one of the best performing at estimating traffic volume by street, a critical variable in this study and in traffic studies generally. This view shows the difference between the model’s predictions and actual measurements.
To determine how safe a given location in the city is for pedestrians and cyclists, it’s important to understand the relationship between crashes and the number of automobiles present at specific locations. Measuring exactly how many vehicles pass along each street would be very expensive, but we can get a good estimate by building models that use existing measurements, along with built environment and road network characteristics, to predict it.
The volunteers created 10 different models for estimating vehicle volumes, or “exposure,” using several different modeling techniques. Above you can see the output of one of the best performing models, a random forest regression. Being able to accurately estimate traffic volume on each street will not only inform our analysis in Seattle but will also be hugely valuable for our work in other cities nationwide.
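To make the comparison between predictions and measurements concrete, here is a minimal sketch of how that difference can be summarized for any volume model. The segment counts are hypothetical values chosen for illustration, not DataDive results, and mean absolute error is just one common summary, not necessarily the one the volunteers used.

```python
# Hypothetical daily vehicle counts for four street segments (not real data).
measured = [1200, 450, 3000, 800]
predicted = [1100, 500, 3300, 780]

# Residual = predicted minus measured; a positive value means the model overestimates.
residuals = [p - m for p, m in zip(predicted, measured)]

# Mean absolute error: the average size of the miss, ignoring direction.
mean_abs_error = sum(abs(r) for r in residuals) / len(residuals)

print(residuals)
print(mean_abs_error)
```

Mapping the residuals by street, as in the view above, shows where a model tends to over- or underestimate volume.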
Being able to determine the probability of a crash, even at locations where a crash has yet to occur, can help policy makers and engineers better target their efforts. Using collision data, the estimated vehicle volumes calculated above, road network characteristics and built environment characteristics, the volunteers created heatmaps showing where their models predict collisions are most likely to occur. These initial models will be further refined for local decision makers to use.
By looking at specific characteristics of past crashes, such as the time of day they occurred, the weather at the time, lighting conditions and other behavioral factors, the volunteers sought to identify which variables are the greatest predictors of collisions of varying severity. This information could be useful from both engineering and enforcement perspectives, highlighting which locations and situations could benefit from increased attention.
Exploratory analysis of the relationship between road conditions and the number and severity of collisions. Note: this raw data does not take into account exposure by road condition.
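The exposure caveat matters because raw counts favor whichever road condition is most common. The sketch below shows one way to adjust for it: divide each count by the amount of driving that happened under that condition. Every number here is made up for illustration, and the choice of vehicle-miles traveled as the exposure measure is an assumption, not something taken from the Seattle data.

```python
# All numbers below are invented for illustration; they are not Seattle data.
collisions = {"dry": 800, "wet": 300, "ice": 20}

# Hypothetical exposure: millions of vehicle-miles traveled under each condition.
exposure_million_vmt = {"dry": 400.0, "wet": 100.0, "ice": 2.0}


def collisions_per_million_vmt(counts, exposure):
    """Divide each raw count by its exposure to get a comparable rate."""
    return {cond: counts[cond] / exposure[cond] for cond in counts}


rates = collisions_per_million_vmt(collisions, exposure_million_vmt)

# Rank conditions by raw count and by exposure-adjusted rate.
ranked_by_count = sorted(collisions, key=collisions.get, reverse=True)
ranked_by_rate = sorted(rates, key=rates.get, reverse=True)

print(ranked_by_count)
print(ranked_by_rate)
```

With these invented figures, dry roads have the most collisions but the lowest rate per mile driven, so the two rankings come out reversed. Raw counts by condition are therefore best read as descriptive, not as a measure of risk.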
This is, of course, only the beginning of our work in Seattle and, thanks to the DataDive volunteers who joined us and Microsoft’s support, we have hit the ground running.
A big huge thank you! The DataDive was an awesome event and we really enjoyed working with your team and the amazing group of volunteers. We’re really looking forward to the work ahead.
-Jim Curtin, Seattle Department of Transportation
We couldn’t have said it better ourselves. We were blown away by the talent and commitment of the volunteers who helped out, bringing perspectives from wide-ranging fields like real estate, bioinformatics, transportation engineering, astronomy/physics, medicine, machine learning and psychology. The range of approaches they applied and adapted has saved us an enormous amount of time as we begin work in Seattle and has given us a solid foundation for our analysis there and nationwide.