-
-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathsports-cognitive-psychology.qmd
149 lines (126 loc) · 5.59 KB
/
sports-cognitive-psychology.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
# Sports and Cognitive Psychology {#sec-sportsCognitivePsychology}
## Getting Started {#sec-sportsCognitivePsychologyGettingStarted}
### Load Packages {#sec-sportsCognitivePsychologyLoadPackages}
```{r}
library("nflreadr")
library("tidyverse")
library("ggtext")
```
### Download Football Data {#sec-sportsCognitivePsychologyDownloadFootballData}
```{r}
#| eval: false
#| include: false
load(file = file.path(path, "/OneDrive - University of Iowa/Teaching/Courses/Fantasy Football/Data/nfl_pbp.RData", fsep = ""))
load(file = file.path(path, "/OneDrive - University of Iowa/Teaching/Courses/Fantasy Football/Data/nfl_4thdown.RData", fsep = ""))
```
```{r}
load(file = "./data/nfl_pbp.RData")
load(file = "./data/nfl_4thdown.RData")
```
## Overview {#sec-sportsCognitivePsychologyOverview}
<https://stacker.com/sports/15-ways-analytics-has-changed-sports> (archived at <https://perma.cc/PQ5R-TWFA>)
- Bill James, "Moneyball", Billy Beane, Theo Epstein, and role of Sabermetrics in helping MLB teams with limited budgets (e.g., Oakland Athletics) compete with larger budgets: using statistics to identify player value more accurately, especially for identifying undervalued players
- development of advanced metrics (e.g., value over replacement player, wins above replacement, fielding independent pitching)
- focus on on-base percentage and slugging percentage (and on-base plus slugging percentage) instead of batting average; defensive shifting; fewer attempts to steal bases; fewer bunts; batters taking more pitches; more frequent pitching changes (for pitcher–batter matchups, e.g., righty–lefty); focus on velocity and spin rate
- other sports, including basketball and football are making more use of analytics
- basketball: types of shots (focusing on three pointers and layups); star players resting more games
- football: going for it on fourth down; greater emphasis on the passing game; drafting Running Backs later in the draft (and, overall, valuing Running Backs less); trading down in the draft for more low picks than fewer high picks [because top picks are frequently overvalued; @Massey2013]
## Coaching and Risk Aversion {#sec-coachingRiskAversion}
```{r}
nfl_pbp4thDown <- nfl_pbp %>%
filter(down == 4) %>%
filter(!(play_type %in% c("no_play","qb_kneel")))
nfl_pbp4thDown$goForIt <- NA
nfl_pbp4thDown$goForIt[which(nfl_pbp4thDown$play_type %in% c("field_goal","punt"))] <- 0
nfl_pbp4thDown$goForIt[which(nfl_pbp4thDown$play_type %in% c("pass","run"))] <- 1
nfl_pbp4thDownPlotData <- nfl_pbp4thDown %>%
filter(!is.na(goForIt)) %>%
group_by(season) %>%
summarise(
goForItPct = mean(goForIt, na.rm = TRUE),
n = n(),
sd = sd(goForIt),
se = sd / n
)
ggplot2::ggplot(
data = nfl_pbp4thDownPlotData,
ggplot2::aes(
x = season,
y = goForItPct)) +
geom_point() +
geom_line() +
geom_ribbon(
aes(
y = goForItPct,
ymin = goForItPct - qnorm(0.975)*se,
ymax = goForItPct + qnorm(0.975)*se),
alpha = 0.2) +
scale_y_continuous(
limits = c(0, NA)
) +
ggplot2::labs(
x = "Season",
y = "Proportion of 4th Down Plays that are Attempts (to Get the First Down)",
title = "4th Down Attempts (Proportion) by Season",
) +
ggplot2::theme_classic()
```
Adapted from Ben Baldwin: https://www.nfl4th.com/articles/4th-down-research.html
```{r}
# labels on the plot
text_df <- tibble(
label = c("NFL coaches<br>in <span style='color:#00BFC4'>**2024**</span>", "NFL coaches<br>in <span style='color:#F8766D'>**2014**</span>"),
x = c(6, 8.2),
y = c(80, 37),
angle = c(10, 10),
color = c("black", "black")
)
nfl_4thdown %>%
filter(vegas_wp > .2, between(go_boost, -10, 10), season %in% c(2014, 2024)) %>%
ggplot(aes(go_boost, go, color = as.factor(season))) +
ggtext::geom_richtext(
data = text_df,
aes(x, y, label = label, angle = angle),
color = "black",
fill = NA,
label.color = NA,
size = 5) +
geom_vline(xintercept = 0) +
stat_smooth(method = "gam", method.args = list(gamma = 1), formula = y ~ s(x, bs = "cs", k = 10), show.legend = F, se = F, linewidth = 4) +
# this is just to get the plot to draw the full 0 to 100 range
geom_hline(yintercept = 100, alpha = 0) +
geom_hline(yintercept = 0, alpha = 0) +
ggplot2::theme_classic() +
labs(x = "Gain in win probability by going for it",
y = "Go-for-it percentage",
subtitle = "4th down decisions in 2024 versus 2014, win prob. > 20%",
title = glue::glue("How <span style='color:red'>math</span> is changing football")) +
theme(
legend.position = "none",
plot.title = element_markdown(size = 22, hjust = 0.5),
plot.subtitle = element_markdown(size = 14, hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold")
) +
scale_y_continuous(breaks = scales::pretty_breaks(n = 4), limits = c(0, 100), expand = c(0,0)) +
scale_x_continuous(breaks = scales::pretty_breaks(n = 10), limits = c(-10, 10), expand = c(0,0)) +
annotate("text", x= -1.2, y= 70, label = "Should\nkick", color="black", size = 5) +
annotate("text", x= 1.2, y= 70, label = "Should\ngo for it", color="black", size = 5) +
geom_segment(
aes(x = -.1, y = 80, xend = -2, yend = 80),
arrow = arrow(length = unit(0.05, "npc")),
color = "black", linewidth = 2
) +
geom_segment(
aes(x = .1, y = 80, xend = 2, yend = 80),
arrow = arrow(length = unit(0.05, "npc")),
color = "black", linewidth = 2
)
```
## Conclusion {#sec-sportsCognitivePsychologyConclusion}
::: {.content-visible when-format="html"}
## Session Info {#sec-sportsCognitivePsychologySessionInfo}
```{r}
sessionInfo()
```
:::