Work Zone Safety Information Clearinghouse

Library of Resources to Improve Roadway Work Zone Safety for All Roadway Users

Publication

To Balance or Not to Balance? Applying a Machine Learning Technique to Oversample Severe Injury Crashes in Work Zones

Author/Presenter: Adeel, Muhammad; Khattak, Asad J.; Mishra, Sabyasachee; Thapa, Diwas
Abstract:

Road work zones (WZs) are increasingly common due to aging infrastructure and the need for capacity enhancement, presenting significant safety risks characterized by narrow lanes, uneven traffic flow, lower speeds, and reduced visibility. This study focuses on understanding the role of human behavioral factors in WZ crash injury severity and addressing the data imbalance caused by the lower incidence of high-cost fatal and serious injuries. A unique dataset comprising 7,855 WZ crashes in Tennessee from 2018 to 2022 was examined. The study applies the Synthetic Minority Over-sampling Technique (SMOTE) combined with a Random Forest (RF) model (a machine learning technique) to balance the dataset and improve prediction accuracy. Results indicate that aggressive driving, overspeeding, and drunk driving significantly escalate injury severity. Additionally, balancing the minority categories of crash injury severity levels (fatal and serious injuries) shifts the importance of contributing factors, emphasizing those more closely associated with higher injury categories. The application of SMOTE proved effective, significantly enhancing the prediction performance across various categories. The accuracy of the RF model improved from 71.88% to 74.36%, while the balanced accuracy increased substantially from 51.58% to 80.97%. These findings offer valuable insights for traffic safety engineers, transportation agencies, and policymakers to enhance WZ design and management. The study provides a framework for analyzing imbalanced crash data, highlighting critical behavioral factors, and recommending additional signage, speed limit reductions, and increased enforcement against unsafe driving behaviors. This approach aims to mitigate injury severity and improve road user safety in work zones.
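The two ideas the abstract relies on, SMOTE oversampling and balanced accuracy, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which software was used, the function names (`smote_oversample`, `accuracy`, `balanced_accuracy`) are hypothetical, the sampling loop is a simplified variant of SMOTE, and the toy labels and points are invented for illustration and are not taken from the Tennessee dataset.

```python
import math
import random
from collections import Counter


def smote_oversample(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: build n_new synthetic points, each placed on the
    segment between a minority sample and one of its k nearest
    minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def accuracy(y_true, y_pred):
    """Share of all cases predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Invented toy labels: 90 minor crashes, 10 severe crashes, and a model
# that always predicts "minor".
y_true = ["minor"] * 90 + ["severe"] * 10
y_pred = ["minor"] * 100
print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5

# Invented two-feature points standing in for the minority (severe) class.
severe = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_oversample(severe, n_new=6, k=2)
print(len(severe) + len(new_points))      # 10
```

The toy model misses every severe crash, which plain accuracy hides (0.9) and balanced accuracy exposes (0.5). This is the kind of gap the abstract reports when it contrasts the two metrics before and after balancing. In practice, maintained implementations such as the `SMOTE` class in imbalanced-learn and `balanced_accuracy_score` in scikit-learn would normally be used instead of hand-written versions.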

Publisher: The University of Tennessee
Publication Date: 2025
Full Text URL: Link to URL
Publication Types: Books, Reports, Papers, and Research Articles
Topics: Crash Analysis; Crash Causes; Crash Data; Injury Severity; Machine Learning; Work Zones

Copyright © 2026 American Road & Transportation Builders Association (ARTBA). The National Work Zone Safety Information Clearinghouse is a project of the ARTBA Transportation Development Foundation. It is operated in cooperation with the U.S. Federal Highway Administration and Texas A&M Transportation Institute. | Copyright Statement · Privacy Policy · Disclaimer