I have developed several tools related to baseball data and analysis that I have made available and maintain for public use. Updates and tips for these tools are shared through the BaseballTools Twitter account.
baseballr package for R: baseballr is a package written for the R programming language focused on baseball analysis. It includes functions for scraping various data from websites, such as FanGraphs.com, Baseball-Reference.com, and baseballsavant.com. Sample functions include the ability to scrape MLB and NCAA pitcher and hitter data over custom ranges, calculate pitch tendencies to different parts of the strike zone, and calculate advanced metrics for user data sets.
Interactive Spray Chart Tool: An interactive spray chart tool built with R Shiny that uses batted ball data from MLBAM’s Gameday and Statcast systems. The data and tool are updated daily during the season. The entire pipeline is written in R.
Edge% app: An interactive web app built in R with Shiny. Data includes the frequency of pitches to different areas of the strike zone for batters, pitchers, and teams since 2010, as well as umpire called-strike tendencies by area of the strike zone. During the season, the app updates daily through an R job that adds data from the previous night's games.
Baseball Tools YouTube Channel: Includes code-through videos featuring various functions from baseballr and other tools.
Hitter Volatility Leaderboards: Leaderboards for Hitter Volatility (VOL) and Corrected Volatility (corrVOL) for individual seasons from 1974-present.
Cumulative Home Run Record Holders: An interactive web app built in R with Shiny. Displays hitters that reached different home run totals the fastest based on number of plate appearances from the start of their careers.
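The "fastest to a home run total" idea can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the app's actual code. It assumes each career is available as an ordered list of plate appearances flagged 1 (home run) or 0 (anything else); the player names, the data, and the function names `pa_to_reach` and `fastest_to` are all hypothetical.

```python
# Hypothetical toy careers: one flag per plate appearance, in career order.
# 1 = home run, 0 = any other outcome. Names and data are made up.
careers = {
    "Hitter A": [0, 1, 0, 0, 1, 1],
    "Hitter B": [1, 0, 1, 0, 0, 0],
    "Hitter C": [0, 0, 0, 1, 0, 0],
}


def pa_to_reach(hr_flags, milestone):
    """Return the career PA number at which the milestone HR was hit, or None."""
    total = 0
    for pa_number, is_hr in enumerate(hr_flags, start=1):
        total += is_hr
        if total >= milestone:
            return pa_number
    return None  # the hitter never reached this total


def fastest_to(careers, milestone):
    """Rank hitters who reached the milestone by the fewest plate appearances."""
    reached = {}
    for name, flags in careers.items():
        pa = pa_to_reach(flags, milestone)
        if pa is not None:
            reached[name] = pa
    return sorted(reached.items(), key=lambda item: item[1])


print(fastest_to(careers, 2))
```

With game-level rather than PA-level data, the same loop would work on cumulative sums per game, at the cost of only knowing the milestone PA to within one game.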
MLB Player Season Similarity Tool: An interactive web app built in R with Shiny. Allows users to compare individual player seasons to others. The smaller the similarity score, the more similar the player seasons are. Data are available for hitters (>= 300 PAs) and starting pitchers. Graphs can be downloaded.
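The tool's scoring method is not described here, so the following is only a sketch of one common way to build a "smaller is more similar" score: standardize each statistic across seasons, then take the Euclidean distance between seasons. The choice of statistics, the season labels, the numbers, and the function names are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

# Hypothetical player seasons with made-up rate stats.
seasons = {
    "Hitter A 2015": {"avg": 0.300, "obp": 0.380, "slg": 0.520},
    "Hitter B 2016": {"avg": 0.295, "obp": 0.375, "slg": 0.510},
    "Hitter C 2017": {"avg": 0.220, "obp": 0.290, "slg": 0.350},
}


def standardize(seasons):
    """Convert every stat to a z-score so no single stat dominates the distance."""
    stat_names = list(next(iter(seasons.values())))
    scaled = {name: {} for name in seasons}
    for stat in stat_names:
        values = [line[stat] for line in seasons.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, line in seasons.items():
            scaled[name][stat] = (line[stat] - mu) / sigma if sigma else 0.0
    return scaled


def most_similar(seasons, target):
    """Return (season, score) pairs sorted so the most similar season is first."""
    z = standardize(seasons)
    target_vec = list(z[target].values())
    scores = {
        name: math.dist(target_vec, list(z[name].values()))
        for name in seasons
        if name != target
    }
    return sorted(scores.items(), key=lambda item: item[1])


for name, score in most_similar(seasons, "Hitter A 2015"):
    print(f"{name}: {score:.2f}")
```

Standardizing first matters because raw stats sit on different scales; without it, the stat with the widest spread would drive most of the score.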
Umpire IDs for Games since 2008: A file with umpire IDs, names, game dates, and game_pk variables. Users can use this file to merge umpire information into BaseballSavant download files. The source is MLBAM Gameday XML files. Direct download link.
Game Day Supplemental Data File: A file with supplemental information for MLB games (2008-present) from Gameday XML files, including elements such as start times, elapsed time, and weather, that can easily be merged with Statcast Search download files from BaseballSavant. The file updates daily. Direct download link.
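Merging either of these game-level files into a pitch-level Statcast download amounts to a left join on game_pk. The sketch below uses only the Python standard library and small inline CSV strings as stand-ins for the real files. Apart from game_pk, the game-level column names (`umpire_name`, `weather`) and all values are assumptions for illustration, so check the actual file headers before adapting it.

```python
import csv
import io

# Hypothetical stand-ins for the two downloads; real files have many more columns.
statcast_csv = """game_pk,pitch_type,release_speed
529406,FF,95.1
529406,SL,86.4
529410,CH,84.0
"""
supplemental_csv = """game_pk,umpire_name,weather
529406,Example Umpire A,"72 degrees, clear"
529410,Example Umpire B,"65 degrees, cloudy"
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join_on_game_pk(pitch_rows, game_rows):
    """Attach game-level fields to every pitch row that shares a game_pk."""
    by_game = {row["game_pk"]: row for row in game_rows}
    game_fields = [f for f in game_rows[0] if f != "game_pk"] if game_rows else []
    merged = []
    for pitch in pitch_rows:
        match = by_game.get(pitch["game_pk"])
        row = dict(pitch)
        for field in game_fields:
            # Pitches from games missing in the game-level file get None.
            row[field] = match[field] if match else None
        merged.append(row)
    return merged


pitches = read_rows(statcast_csv)
games = read_rows(supplemental_csv)
merged = left_join_on_game_pk(pitches, games)
print(merged[0])
```

A left join keeps every pitch even when a game is missing from the supplemental file, which makes gaps visible instead of silently dropping rows.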
Pitch Tracking Era Player Info: A file with player biographical information, including height, weight, dates of draft and debut, etc., that updates daily during the season. Direct download link.
MiLB Park Factors: Data can be downloaded as a CSV file.
MiLB Game Supplemental Data File: A file with supplemental information for MiLB games across various levels (2010-present) from the MLB Stats API. Direct download link.
MLB Pitch-level Linear Weights 2008-2019: A file containing pitch-level linear weights for events, calculated by year for 2008-2019. These linear weights are based on Statcast data and were calculated with the baseballr package. Direct download link.
Statcast-PITCHf/x Working Glossary: A working glossary of terms and variables associated with public PITCHf/x and Statcast data. Definitions come from multiple sources over time, and the variables generally align with those available through downloadable files on BaseballSavant.
MLB Game Feed Viewer: An interactive app that allows users to view pitch-by-pitch data live for a specified MLB game. Data is parsed and delivered from the MLB Stats API. Selected pitches and batted balls can be visualized side-by-side. Data and plots can also be downloaded.