This is the dataset you will be working with:
library(tidyverse) # provides %>%, filter(), mutate(), and case_when()

olympics <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-07-27/olympics.csv')

triathlon <- olympics %>%
  filter(!is.na(height)) %>% # only keep athletes with known height
  filter(sport == "Triathlon") %>% # keep only triathletes
  mutate(
    medalist = case_when( # add column to track medalist vs not
      is.na(medal) ~ "non-medalist",
      !is.na(medal) ~ "medalist" # any medals (Gold, Silver, Bronze) count
    )
  )
triathlon is a subset of olympics and contains only the data for triathletes. More information about the original olympics dataset can be found at https://github.com/rfordatascience/tidytuesday/tree/master/data/2021/2021-07-27/readme.md and https://www.sports-reference.com/olympics.html.
For this project, use triathlon to answer the following questions about athletes competing in this sport:
You should make one plot per question.
Hints:
You may need to convert year into a factor.
Use fig.width and fig.height in the chunk headers to customize figure sizing and figure aspect ratios.
You can delete these instructions from your project. Please also delete text such as Your approach here or # Q1: Your R code here.
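The hints about year and figure sizing can be illustrated with a minimal sketch. It is not a solution to any of the questions. The data frame toy and its values are made up purely for illustration (they are not taken from triathlon), the boxplot is just one possible plot type, and the figure dimensions in the comment are arbitrary.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `triathlon`; these rows are invented for illustration.
toy <- data.frame(
  year   = c(2000, 2000, 2004, 2004, 2008, 2008),
  height = c(170, 182, 168, 185, 172, 180)
)

# Convert the numeric year into a factor so each year is treated as a discrete group.
toy_factor <- toy %>%
  mutate(year = factor(year))

# With year as a factor, geom_boxplot() draws one box per year.
p <- ggplot(toy_factor, aes(x = year, y = height)) +
  geom_boxplot()

# In an R Markdown document, figure size is set in the chunk header, for example:
#   {r fig.width = 6, fig.height = 4}
```

Here fig.width and fig.height are given in inches, so changing their ratio changes the aspect ratio of the rendered figure.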
Introduction: Your introduction here.
Approach: Your approach here.
Analysis:
# Q1: Your R code here
# Q2: Your R code here
# Q3: Your R code here
Discussion: Your discussion of results here.