```{r global_options, include=FALSE}
library(knitr)
opts_chunk$set(fig.align="center", fig.height=3, fig.width=4)
```
## In-class worksheet 8
**Feb 14, 2019**

In this worksheet, we will use the tidyverse package:
```{r message=FALSE}
library(tidyverse)
```

## 1. Making wide tables longer
Consider the following data set, which contains information about income and religious affiliation in the US:
```{r}
pew <- read_csv("http://wilkelab.org/classes/SDS348/data_sets/pew.csv")
head(pew)
```
This table is not tidy, because income levels are used as column headers rather than as levels of an `income` variable.

Use `gather()` to turn this table into a table with three columns, one for religion, one for income (called `income`), and one for the count of people with the respective combination of income and religion (called `count`).
```{r}
# R code goes here.
```
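
As a warm-up for the exercise above, here is a minimal sketch of how `gather()` reshapes a wide table. The table `scores_wide` and its columns are made up for illustration and have nothing to do with `pew.csv`.

```{r}
library(tidyverse)  # already loaded above; repeated so this chunk runs on its own

# Hypothetical toy data: quiz scores stored in wide format,
# with one column per quiz
scores_wide <- tibble(
  student = c("A", "B"),
  quiz1 = c(7, 9),
  quiz2 = c(8, 6)
)

# Gather every column except `student` into key-value pairs:
# the old column names go into `quiz`, the cell values into `score`
scores_long <- scores_wide %>%
  gather(quiz, score, -student)

scores_long
```

The second and third arguments of `gather()` name the new key and value columns, so changing them is all it takes to rename the output columns. In tidyr 1.0.0 and later, `pivot_longer()` is the recommended replacement for `gather()`, although `gather()` is still available.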

Now call the income column `income_level` and the count column `number_of_people`.
```{r}
# R code goes here.
```

Now, instead of gathering data from all columns, gather only the data from columns `below10k`, `from20to30k`, and `from50to75k`, such that your final data frame contains only these three income levels. Sort your final data frame according to `religion` and then `income_level`.
```{r}
# R code goes here.
```
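
One detail is worth checking on a small example: columns that are not gathered stay in the result as identifier columns. The sketch below uses a made-up table `temps_wide` (the numbers are invented, not measurements) and shows one possible way to drop the unwanted column first and then sort the result.

```{r}
library(tidyverse)  # already loaded above; repeated so this chunk runs on its own

# Hypothetical toy data with invented values, one column per month
temps_wide <- tibble(
  city = c("Oslo", "Cairo"),
  jan = c(-4, 14),
  apr = c(5, 21),
  jul = c(17, 28)
)

# Gathering only `jan` and `jul` leaves `apr` behind as an extra column
temps_wide %>%
  gather(month, temp, jan, jul)

# Selecting the wanted columns first removes `apr` from the result;
# arrange() then sorts by city and, within each city, by month
temps_long <- temps_wide %>%
  select(city, jan, jul) %>%
  gather(month, temp, jan, jul) %>%
  arrange(city, month)

temps_long
```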

## 2. Making long tables wider

Consider the following data set, which contains information about the sex, weight, and height of 200 individuals:
```{r}
persons <- read_csv("http://wilkelab.org/classes/SDS348/data_sets/persons.csv")
head(persons)
```
Is this data set tidy? And can you rearrange it so that you have one column for subject, one for sex, one for weight, and one for height?

```{r}
# R code goes here.
```

For the data set `diamonds` from the ggplot2 package, create a table displaying the mean price for each combination of cut and clarity. Then use `spread()` to rearrange this table into a wide format, such that there is a column of mean prices for each cut level (Fair, Good, Very Good, etc.).
```{r}
# R code goes here.
```
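
The summarize-then-spread pattern asked for above can be sketched on a small made-up table. The data frame `sales` and its columns are hypothetical and unrelated to `diamonds`.

```{r}
library(tidyverse)  # already loaded above; repeated so this chunk runs on its own

# Hypothetical toy data: individual sale prices by store and item
sales <- tibble(
  store = c("north", "north", "north", "south", "south", "south"),
  item = c("tea", "tea", "coffee", "tea", "coffee", "coffee"),
  price = c(2, 4, 7, 3, 4, 6)
)

# Step 1: compute the mean price for each store/item combination (long format)
# Step 2: spread the `item` levels into separate columns (wide format)
mean_prices <- sales %>%
  group_by(store, item) %>%
  summarize(mean_price = mean(price)) %>%
  ungroup() %>%
  spread(item, mean_price)

mean_prices
```

After `summarize()`, each combination of `store` and `item` occurs exactly once, so `spread()` can place every mean into a unique cell of the wide table.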


## 3. If this was easy

Take the sepal lengths from the `iris` dataset and put them into a wide table so that there is one data column per species. You might be tempted to do this with the following code, but it doesn't work. Can you explain why?
```{r}
# If you remove the # sign in the line below you will get an error; this code doesn't work
# iris %>% select(Sepal.Length, Species) %>% spread(Species, Sepal.Length)
```

*Explanation goes here.*

```{r}
# R code goes here.
```