This homework is due on Feb. 1, 2024 at 11:00 pm. Please submit it as a PDF file on Canvas.

Problem 1: (6 pts) For this problem you will be working with the iris dataset built into R. This dataset contains measurements of flowers (sepal length, sepal width, petal length, petal width) for three different Iris species (I. setosa, I. versicolor, I. virginica).

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Use ggplot to make a histogram of the Sepal.Length column. Manually choose appropriate values for binwidth and center. Explain your choice of values in 2-3 sentences.
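As a rough illustration of how binwidth and center interact, here is a minimal sketch on a different built-in dataset (faithful, which is not part of this assignment). The values 5 and 52.5 are arbitrary choices for the illustration, not recommended values for the iris data.

```r
library(ggplot2)

# Illustration only: uses the built-in `faithful` dataset, not `iris`.
# With binwidth = 5 and center = 52.5, one bin spans 50 to 55, so all
# bin boundaries fall on multiples of 5.
p_hist <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(binwidth = 5, center = 52.5)

p_hist
```

Choosing center so that bin edges land on round numbers tends to make the axis easier to read; an appropriate binwidth depends on the range and resolution of the data.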

# Your code goes here.

Problem 2: (6 pts) For this problem you will work with the dataset txhouse, which has been derived from the txhousing dataset provided by ggplot2. For details of the original dataset, see https://ggplot2.tidyverse.org/reference/txhousing.html. txhouse contains three columns: city (listing four Texas cities), year (containing four years between 2000 and 2015), and total_sales (the total number of sales for the specified year and city).

txhouse
## # A tibble: 16 × 3
## # Groups:   city [4]
##    city         year total_sales
##    <chr>       <int>       <dbl>
##  1 Austin       2000       18621
##  2 Austin       2005       26905
##  3 Austin       2010       19872
##  4 Austin       2015       18878
##  5 Dallas       2000       45446
##  6 Dallas       2005       59980
##  7 Dallas       2010       42383
##  8 Dallas       2015       36735
##  9 Houston      2000       52459
## 10 Houston      2005       72800
## 11 Houston      2010       56807
## 12 Houston      2015       48109
## 13 San Antonio  2000       15590
## 14 San Antonio  2005       24034
## 15 San Antonio  2010       18449
## 16 San Antonio  2015       16455
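The assignment does not show how txhouse was created. A table with this shape could plausibly be derived from txhousing along the following lines; this is a sketch under the assumption that total_sales is the sum of the monthly sales column, and it is not necessarily the code used to produce the data above.

```r
library(dplyr)
library(ggplot2)  # provides the txhousing dataset

# Assumed derivation: keep four cities and four years, then sum the
# monthly `sales` values within each city/year combination.
txhouse <- txhousing |>
  filter(city %in% c("Austin", "Dallas", "Houston", "San Antonio")) |>
  filter(year %in% c(2000, 2005, 2010, 2015)) |>
  group_by(city, year) |>
  summarize(total_sales = sum(sales))

txhouse
```

Because summarize() drops only the last grouping variable by default, the result would remain grouped by city, which is consistent with the "Groups: city [4]" line in the printout.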

Use ggplot to make a bar plot of the total housing sales (column total_sales) for each year. Color the bar borders "gray34" and fill the bars by city.

# Your code goes here.

Problem 3: (8 pts) Modify the plot from Problem 2 by placing the city bars side by side, rather than stacked. See Slide 35 from the lecture on visualizing amounts. Next, reorder the bars for each year by total_sales in descending order. See Slide 25 from the lecture on visualizing amounts.
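For orientation, the general idea of dodged bars with a reordered fill variable can be sketched on a small made-up data frame. The data frame fruit_sales and its columns are hypothetical, and the lecture slides may use a different approach (for example, a function from the forcats package) than the base R reorder() shown here.

```r
library(ggplot2)

# Hypothetical toy data, unrelated to txhouse.
fruit_sales <- data.frame(
  quarter = rep(c("Q1", "Q2"), each = 3),
  fruit   = rep(c("apple", "banana", "cherry"), times = 2),
  units   = c(30, 50, 20, 35, 60, 25)
)

# reorder() sorts factor levels by the mean of the second argument;
# negating `units` puts the fruit with the most units first.
fruit_sales$fruit <- reorder(fruit_sales$fruit, -fruit_sales$units)

# position = "dodge" places the bars within each quarter side by side
# instead of stacking them.
p_dodge <- ggplot(fruit_sales, aes(x = quarter, y = units, fill = fruit)) +
  geom_col(position = "dodge", color = "black")

p_dodge
```

Within each group on the x axis, the bars are drawn in the order of the factor levels of the fill variable, so changing the level order changes the bar order.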

# Your code goes here.