
Based on a series of heuristics, this function attempts to label Statcast data for which the launch angle and speed have been imputed.

Usage

label_statcast_imputed_data(
  statcast_df,
  impute_file = NULL,
  inverse_precision = 10000
)

Arguments

statcast_df

A dataframe containing Statcast batted ball data

impute_file

A CSV file giving the launch angle, launch speed, bb_type, and events fields to label as imputed. If NULL, the file is read from the extdata folder of the package.

inverse_precision

Inverse of the precision to which the launch angle and launch speed are truncated for comparison. Default is 10000, i.e. keep 4 digits of precision.
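
To make the role of inverse_precision concrete, here is a minimal sketch of truncating a measurement to an integer key. The helper name truncate_to_key is hypothetical and not part of the package, and the package's internals may differ. The integer ila and ils columns in the return value suggest keys of this kind, but that is an assumption.

```r
# Hypothetical helper (not a package function): scale by inverse_precision,
# then drop the fractional part. as.integer() truncates toward zero.
truncate_to_key <- function(x, inverse_precision = 10000) {
  as.integer(x * inverse_precision)
}

# Made-up launch angles. The first two differ only past the 4th decimal place,
# so they collapse to the same key when inverse_precision = 10000.
launch_angle <- c(27.123456, 27.123499, -12.5)
keys <- truncate_to_key(launch_angle)
keys
```

Comparing integer keys rather than raw doubles avoids relying on exact floating-point equality when matching against a list of known imputed values.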

Value

A copy of the input dataframe with a new column, imputed, appended. The imputed column is 1 if the launch angle and launch speed are likely imputed, and 0 otherwise. The returned tibble has the following columns:

| col_name | types |
|---|---|
| pitch_type | character |
| game_date | Date |
| release_speed | numeric |
| release_pos_x | numeric |
| release_pos_z | numeric |
| player_name | character |
| batter | numeric |
| pitcher | numeric |
| events | character |
| description | character |
| spin_dir | logical |
| spin_rate_deprecated | logical |
| break_angle_deprecated | logical |
| break_length_deprecated | logical |
| zone | numeric |
| des | character |
| game_type | character |
| stand | character |
| p_throws | character |
| home_team | character |
| away_team | character |
| type | character |
| hit_location | integer |
| bb_type | character |
| balls | integer |
| strikes | integer |
| game_year | integer |
| pfx_x | numeric |
| pfx_z | numeric |
| plate_x | numeric |
| plate_z | numeric |
| on_3b | numeric |
| on_2b | numeric |
| on_1b | numeric |
| outs_when_up | integer |
| inning | numeric |
| inning_topbot | character |
| hc_x | numeric |
| hc_y | numeric |
| tfs_deprecated | logical |
| tfs_zulu_deprecated | logical |
| fielder_2 | numeric |
| umpire | logical |
| sv_id | logical |
| vx0 | numeric |
| vy0 | numeric |
| vz0 | numeric |
| ax | numeric |
| ay | numeric |
| az | numeric |
| sz_top | numeric |
| sz_bot | numeric |
| hit_distance_sc | numeric |
| launch_speed | numeric |
| launch_angle | numeric |
| effective_speed | numeric |
| release_spin_rate | numeric |
| release_extension | numeric |
| game_pk | numeric |
| pitcher_1 | numeric |
| fielder_2_1 | numeric |
| fielder_3 | numeric |
| fielder_4 | numeric |
| fielder_5 | numeric |
| fielder_6 | numeric |
| fielder_7 | numeric |
| fielder_8 | numeric |
| fielder_9 | numeric |
| release_pos_y | numeric |
| estimated_ba_using_speedangle | numeric |
| estimated_woba_using_speedangle | numeric |
| woba_value | numeric |
| woba_denom | integer |
| babip_value | integer |
| iso_value | integer |
| launch_speed_angle | integer |
| at_bat_number | numeric |
| pitch_number | numeric |
| pitch_name | character |
| home_score | numeric |
| away_score | numeric |
| bat_score | numeric |
| fld_score | numeric |
| post_away_score | numeric |
| post_home_score | numeric |
| post_bat_score | numeric |
| post_fld_score | numeric |
| if_fielding_alignment | character |
| of_fielding_alignment | character |
| spin_axis | numeric |
| delta_home_win_exp | numeric |
| delta_run_exp | numeric |
| ila | integer |
| ils | integer |
| imputed | numeric |
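
The labeling described above can be illustrated with a small base R sketch: truncate launch angle and speed to integer keys, then match them, together with bb_type and events, against a lookup table of known imputed combinations. This is not the package's actual code. The lookup table layout, its column names, and all values below are assumptions made up for illustration.

```r
inverse_precision <- 10000

# Toy batted-ball data (made-up values, chosen to be exactly representable
# as doubles so that the truncation is unambiguous).
batted <- data.frame(
  launch_angle = c(39.25, 12.5, -20.75),
  launch_speed = c(89.25, 101.5, 82.875),
  bb_type = c("fly_ball", "line_drive", "ground_ball"),
  events = c("field_out", "single", "field_out"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup of imputed combinations, keyed on truncated integers.
lookup <- data.frame(
  ila = c(392500L, -207500L),
  ils = c(892500L, 828750L),
  bb_type = c("fly_ball", "ground_ball"),
  events = c("field_out", "field_out"),
  imputed = 1,
  stringsAsFactors = FALSE
)

# Build integer keys; as.integer() truncates toward zero.
batted$ila <- as.integer(batted$launch_angle * inverse_precision)
batted$ils <- as.integer(batted$launch_speed * inverse_precision)

# Left join on all four fields; rows without a match get imputed = 0.
labeled <- merge(batted, lookup,
                 by = c("ila", "ils", "bb_type", "events"),
                 all.x = TRUE)
labeled$imputed[is.na(labeled$imputed)] <- 0

# Share of rows flagged as imputed in the toy data.
mean(labeled$imputed)
```

Matching on bb_type and events in addition to the two keys narrows the label to rows whose outcome also agrees with the lookup entry, which matches the four fields the impute_file argument is described as providing.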

Examples

# \donttest{
  library(baseballr)  # provides statcast_search() and label_statcast_imputed_data()
  try({
    statcast_df <- statcast_search("2017-05-01", "2017-05-02")
    sc_df <- label_statcast_imputed_data(statcast_df)
    mean(sc_df$imputed)
  })
#> [1] 0.003561888
# }