Please use the project template R Markdown document to complete your project. Both the knitted R Markdown document (as a PDF) and the raw R Markdown file (as .Rmd) must be submitted to Canvas by 11:00pm on Mon., May 3, 2021. If you are using a data file (such as a .csv file) that you downloaded to your computer or to edupod, you must also submit this file to Canvas. All documents will be graded jointly, so they must be consistent (as in, don’t change the R Markdown file without also updating the knitted document!).

All results presented must have corresponding code. Any answers/results given without the corresponding R code that generated the result will be considered absent. To be clear: if you do calculations by hand instead of using R and then report the results from the calculations, you will not receive credit for those calculations. All code reported in your final project document should work properly. Please do not include any extraneous code or code which produces error messages. (Code which produces warnings is acceptable, as long as you understand what the warnings mean.)

For this project, you will choose your own dataset, subject to the following constraint: pick one of the datasets published by the Tidy Tuesday project between January 7, 2020 and June 30, 2020. All of these datasets are available here: https://github.com/rfordatascience/tidytuesday/tree/master/data/2020
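As a rough sketch of how a dataset from that repository might be read into R, the code below builds the raw-file URL for one data file and wraps `readr::read_csv()` around it. The helper names `tidytuesday_url()` and `load_tidytuesday()` are hypothetical (they are not part of any package), and the week folder and file name in the usage comment are assumptions that should be checked against the repository listing for the dataset you choose.

```r
# Hypothetical helper: builds the raw-file URL for one Tidy Tuesday data file.
# Assumes the raw.githubusercontent.com layout mirrors the repository folder
# structure linked above (data/2020/<week folder>/<file name>).
tidytuesday_url <- function(week, file) {
  paste0(
    "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/",
    week, "/", file
  )
}

# Hypothetical helper: downloads and parses the file.
# Requires the readr package and an internet connection when it is called.
load_tidytuesday <- function(week, file) {
  readr::read_csv(tidytuesday_url(week, file))
}

# Example usage (assumed folder and file name; verify them in the repository):
# songs <- load_tidytuesday("2020-01-21", "spotify_songs.csv")
# head(songs)
```

If you download the file manually instead, reading the local copy with `readr::read_csv()` works the same way, and that file must then be submitted along with your documents.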

This project consists of two parts. Each part should include the sections described in the Part 1 and 2 instructions below: Introduction, Approach, Analysis, and Discussion.

We encourage you to be concise. A paragraph should typically not be longer than 5 sentences.


Specific Part 1 and 2 Instructions

In the Introduction section, write a brief introduction to the dataset, the question, and what parts of the dataset are necessary to answer the question. Imagine that your project is a standalone document and the grader has no prior knowledge of the dataset.

In the Approach section, describe what type of data wrangling you will perform and what kind of plot you will generate to address your question. Provide a clear explanation as to why this plot (e.g. boxplot, barplot, histogram, etc.) is best for providing the information you are asking about. (You can draw on the materials provided here for guidance.)
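To make "data wrangling" concrete, here is a minimal sketch of the kind of steps an Approach section might describe: selecting the relevant columns, dropping missing values, grouping, and summarizing into a computed table. It uses the built-in `mtcars` data purely as a stand-in (an assumption for illustration; your project must use a Tidy Tuesday dataset), and the name `cyl_summary` is made up for this example.

```r
library(dplyr)

# Stand-in data: mtcars ships with R. Replace it with your Tidy Tuesday dataset.
cyl_summary <- mtcars %>%
  select(cyl, mpg) %>%              # keep only the columns the question needs
  filter(!is.na(mpg)) %>%           # drop rows with missing fuel economy
  group_by(cyl) %>%                 # one group per cylinder count
  summarize(
    n = n(),                        # number of cars in each group
    median_mpg = median(mpg),       # typical fuel economy in each group
    .groups = "drop"
  ) %>%
  arrange(desc(median_mpg))         # most fuel-efficient group first

cyl_summary
```

A table like this compares a numeric variable across a few categories, which is also the situation where a boxplot or similar distribution plot is a natural choice.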

In the Analysis section, provide the code that generates your computed table and/or your plots. In your plots, use scale functions to provide clear axis labels and guides, and use theme functions to customize the appearance of your plots. All plots must be made with ggplot2; do not use base R plotting functions.
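The following sketch illustrates one way scale and theme functions can be combined in a ggplot2 plot. It again uses the built-in `mtcars` data as a stand-in (an assumption for illustration only), and the specific labels, palette, and theme settings are just one possible choice, not a required style.

```r
library(ggplot2)

# Stand-in data: mtcars ships with R. Replace it with your Tidy Tuesday dataset.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +
  # Scale functions set the axis titles and the legend (guide) title.
  scale_x_continuous(name = "Weight (1000 lbs)") +
  scale_y_continuous(name = "Fuel economy (miles per gallon)") +
  scale_color_brewer(name = "Cylinders", palette = "Dark2") +
  # Theme functions control the non-data appearance of the plot.
  theme_bw(base_size = 12) +
  theme(
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )

p
```

Setting titles through the `name` argument of each scale keeps the axis and legend labels next to the other scale settings, such as limits or palettes, so they are easy to adjust together.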

In the Discussion section, interpret the results of your analysis. Identify any trends revealed (or not revealed). Speculate about why the data looks the way it does.

Important: Please do not write all your project text in italics! Italics is for emphasis only.