Predicting baseball pitcher efficacy using physical pitch characteristics

(1) Los Altos High School, (2) Brown University
Image credit: Antoine Schibler

The efficacy of baseball pitchers can be predicted from prior pitching data using machine learning (ML) models. Previous ML studies of baseball have primarily focused on predicting the outcomes of games and of individual thrown pitches. This paper is the first work to use 16 game-independent features, which describe a pitcher’s set of thrown pitches, to predict pitcher efficacy metrics such as walks/hits allowed per inning (WHIP), batting average against (BAA), and fielding independent pitching (FIP). We hypothesized that these 16 “physical features,” measured by sensors, can explain more than 50% of the variance in the pitcher efficacy metrics. We applied neural network (NN) models to predict the efficacy metrics from all 16 features, and we used linear regression (LR) models to analyze the individual impact of each feature on those predictions. Both the NN and LR models showed that the “ballFrequency” feature was the most impactful in predicting a pitcher’s WHIP. For the BAA and FIP metrics, the LR models showed that none of the features, including pitch velocity and the types of pitches thrown, was statistically significant; however, our NN model did improve the prediction of both metrics. Based on our evaluations, the ML models did not support our hypothesis, as they accounted for less than 50% of the variance in the pitcher efficacy metrics. Professional scouts can still use the results of our feature analysis to select better pitchers who have never played a game at the professional level.
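As a rough illustration of the quantities named in the abstract, the sketch below computes the three efficacy metrics and the fraction of variance explained (R²) that the 50% hypothesis refers to. It assumes the commonly used sabermetric definitions of WHIP, BAA, and FIP; the paper may compute them differently. The function names, the sample numbers, and the default FIP constant are hypothetical and are not taken from the study.

```python
def whip(walks, hits, innings):
    """Walks plus hits allowed per inning pitched."""
    return (walks + hits) / innings


def baa(hits, at_bats):
    """Batting average against: hits allowed per opposing at-bat."""
    return hits / at_bats


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding independent pitching (common definition).

    The additive constant varies by league and season; 3.10 is only an
    assumed placeholder, not a value from the paper.
    """
    core = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return core / innings + constant


def r_squared(actual, predicted):
    """Fraction of variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up season line for a single pitcher.
    print(whip(walks=50, hits=150, innings=200))
    print(baa(hits=150, at_bats=600))
    print(fip(home_runs=20, walks=50, hit_by_pitch=5,
              strikeouts=200, innings=200))
```

Under this definition, the hypothesis in the abstract corresponds to `r_squared(actual, predicted) > 0.5` for a model's predictions of a given metric on held-out pitchers.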
