Developing Models for Predictive Analytics in Track and Field
Track and field is a heavily statistics-based sport: the objective nature of each event allows results to be compared directly. Although basic performance statistics have long been used to compare the caliber of athletes, advanced analytics has seen little uptake as a means of improving ranking systems and race strategies. Where advanced analytics has been applied to track and field, the models often rely on data that is not readily accessible, making them impractical for use at an average track meet. The objective of this paper is to develop and apply statistical and machine learning models to problems in track and field, particularly the outdoor running events, in order to understand the sport from an advanced analytics perspective. To keep these applications practical for track and field meets, the analyses use only readily available information, such as athletes' prior result times and weather data. Using this data, models are developed to compare athletes before, during, and after their races. The first model normalizes competition results for weather and race competitiveness so that performances can be better compared across race environments. The second predicts an athlete's result time, allowing officials to distribute runners competitively across the heats of an event when seeding races. The third uses athletes' positions on each lap of a distance race to predict their final placements before the race finishes. Finally, the 4x100 meter relay is modeled mathematically to help a coach determine which individuals from a pool of athletes, and in what order, make up the fastest team.
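
To make the relay selection problem concrete, the following is a minimal sketch and not the formulation developed in this paper. It assumes that each athlete in the pool has an estimated split time for each of the four legs, and it searches every ordered choice of four athletes by brute force for the lowest summed time. The athlete labels, the split values, and the function name `best_relay` are hypothetical and chosen only for illustration. Baton exchanges are ignored.

```python
from itertools import permutations

# Hypothetical per-leg split estimates in seconds (illustrative values only).
# leg_times[athlete][k] is the assumed time for that athlete on leg k (0 to 3).
leg_times = {
    "A": [10.9, 9.8, 10.1, 9.9],
    "B": [10.7, 10.0, 10.0, 10.1],
    "C": [11.0, 9.9, 10.3, 9.7],
    "D": [10.8, 10.1, 9.9, 10.0],
    "E": [11.1, 10.2, 10.2, 10.2],
}


def best_relay(leg_times, legs=4):
    """Return (order, total) minimizing the summed leg times.

    Assumption: the team time is the plain sum of the individual leg
    estimates; exchange-zone effects are not modeled.
    """
    best_order, best_total = None, float("inf")
    # Each permutation is an ordered selection of `legs` distinct athletes.
    for order in permutations(leg_times, legs):
        total = sum(leg_times[athlete][k] for k, athlete in enumerate(order))
        if total < best_total:
            best_order, best_total = order, total
    return best_order, best_total


if __name__ == "__main__":
    order, total = best_relay(leg_times)
    print("Running order:", order)
    print("Estimated time: %.2f s" % total)
```

Exhaustive search is workable here because a pool of n athletes yields only n(n-1)(n-2)(n-3) ordered teams, for example 120 for a pool of five. A larger pool or a richer model of the exchanges would call for a different optimization method.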