UK Road Safety Analysis

Project Introduction

This project analyzes the circumstances of nearly 150,000 personal injury road accidents in Great Britain in 2012.

The statistics relate only to personal injury accidents on public roads that are reported to the police and subsequently recorded using the STATS19 accident reporting form. Damage-only accidents with no human casualties, and accidents on private roads or in car parks, are not included in this data.

Figures for deaths refer to persons killed immediately or who died within 30 days of the accident. This is the usual international definition, adopted by the Vienna Convention in 1968.


  • Feature engineering: Create new variables, including time of day (morning, afternoon, evening, night); commute vs non-commute; weekend vs weekday; high vs low severity; and fatal vs non-fatal severity.
  • Find the most significant variables using forward subset selection and the lasso.
  • Use trees and random forests to identify series of variables on which to base recommendations.
  • Use GIS mapping to identify geographic areas of higher concern.
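
The feature engineering step listed above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's own code (the project appears to use R). The hour boundaries, the commute windows, the `dd/mm/yyyy` date format, and the grouping of "high" severity as fatal or serious are all assumptions made for this example. The severity codes follow the STATS19 convention of 1 = fatal, 2 = serious, 3 = slight. The function names are hypothetical.

```python
from datetime import datetime


def time_of_day(hour):
    """Map an hour (0-23) to a coarse period. Cut-offs are assumed."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 22:
        return "evening"
    return "night"


def is_commute(ts):
    """Weekday 07:00-09:59 or 16:00-18:59. Windows are assumed."""
    return ts.weekday() < 5 and (7 <= ts.hour < 10 or 16 <= ts.hour < 19)


def engineer_features(date_str, time_str, severity):
    """Derive the new variables for one accident record.

    date_str: 'dd/mm/yyyy' (assumed format)
    time_str: 'HH:MM' (assumed format)
    severity: 1 = fatal, 2 = serious, 3 = slight
    """
    ts = datetime.strptime(f"{date_str} {time_str}", "%d/%m/%Y %H:%M")
    return {
        "time_of_day": time_of_day(ts.hour),
        "commute": is_commute(ts),
        "weekend": ts.weekday() >= 5,          # Saturday or Sunday
        "high_severity": severity in (1, 2),   # assumed grouping
        "fatal": severity == 1,
    }
```

For example, `engineer_features("16/07/2012", "08:30", 3)` describes a slight accident on a Monday morning that falls inside the assumed commute window.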

Preliminary Visualization of the Data

We can see that day of the week, and/or weekend versus weekday, may be significant.

Similarly, we see that time of day and commute versus non-commute periods differ between fatal and non-fatal crashes.

names(road)
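
One simple way to check comparisons like the ones described above is to compute the share of fatal accidents within each group. The sketch below is a Python illustration under stated assumptions, not the project's own code: records are assumed to be dictionaries with a boolean `fatal` field plus a grouping field such as `weekend` or `time_of_day`, and `fatal_share_by` is a hypothetical helper. The rows are made up purely to show the call shape and are not project data.

```python
from collections import defaultdict


def fatal_share_by(records, key):
    """Return {group: fraction of that group's records that are fatal}."""
    totals = defaultdict(int)
    fatals = defaultdict(int)
    for rec in records:
        group = rec[key]
        totals[group] += 1
        fatals[group] += int(rec["fatal"])
    return {group: fatals[group] / totals[group] for group in totals}


# Made-up rows for illustration only; not taken from the dataset.
toy_records = [
    {"weekend": True, "fatal": True},
    {"weekend": True, "fatal": False},
    {"weekend": False, "fatal": False},
    {"weekend": False, "fatal": False},
]

shares = fatal_share_by(toy_records, "weekend")
```

A visible gap between groups in such a table is only a prompt for further modelling, since it does not account for other variables.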