Enter your name and EID here

This homework is due on April 26, 2021 at 11:00pm. Please submit it as a PDF file on Canvas.

Problem 1: (2 pts)

Use the color picker app from the colorspace package (`colorspace::choose_color()`) to create a qualitative color scale containing four colors. One of the four colors should be `#5626B4`, so you need to find three additional colors that go with this one.

``````
library(colorspace)  # provides swatchplot()

colors <- c("#5626B4", "#A12B37", "#3E7732", "#C38C29")

swatchplot(colors)
``````
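One way to sanity-check a qualitative scale is to look at its colors in HCL space: the hues should be clearly distinct, while luminance and chroma should not differ so much that one color dominates. The following is a minimal sketch of such a check, assuming the colorspace package is installed. The variable name `hcl_coords` is introduced here for illustration and is not part of the assignment.

```r
library(colorspace)

colors <- c("#5626B4", "#A12B37", "#3E7732", "#C38C29")

# convert the hex codes to polar LUV (HCL) coordinates
hcl_coords <- coords(as(hex2RGB(colors), "polarLUV"))
rownames(hcl_coords) <- colors

# one row per color, with luminance, chroma, and hue
print(round(hcl_coords))
```

The printed table complements `swatchplot()`: the swatches show how the colors look together, and the coordinates show how far apart they are in hue.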

Problem 2: (4 pts) Take the following scatter plot of the penguins dataset and make three modifications:

1. Use the colors you chose in Problem 1.
2. Improve the visual appearance by choosing a theme and cleaning up axis labels.
3. Remove the need for a legend by direct-labeling the points.

``````
library(tidyverse)
library(palmerpenguins)  # provides the penguins dataset

# hand-picked label positions, one per species
labels_data <- tibble(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  bill_length_mm = c(35, 53, 45),
  body_mass_g = c(4000, 3300, 5500),
  hjust = c(1, 0, 1)
)

ggplot(penguins, aes(bill_length_mm, body_mass_g, color = species)) +
  geom_point(size = 2, na.rm = TRUE) +
  geom_text(
    data = labels_data,
    aes(label = species, hjust = hjust),
    size = 14/.pt  # 14 pt, matching the theme's base font size
  ) +
  scale_x_continuous(
    name = "bill length [mm]",
    limits = c(30, 60)
  ) +
  scale_y_continuous(
    name = "body mass [g]"
  ) +
  # colors from Problem 1; the direct labels replace the legend
  scale_color_manual(values = colors[c(1, 3, 4)], guide = "none") +
  theme_minimal(14)
``````
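The label positions above are typed in by hand. As a starting point, they could also be derived from the data, for example by taking the per-species medians. The following is a rough sketch under that assumption, with `labels_auto` as a hypothetical name. Labels placed at the medians sit on top of the point clouds, so they would usually still need to be nudged manually.

```r
library(dplyr)
library(palmerpenguins)

# one candidate label position per species: the median of each variable
labels_auto <- penguins %>%
  group_by(species) %>%
  summarize(
    bill_length_mm = median(bill_length_mm, na.rm = TRUE),
    body_mass_g = median(body_mass_g, na.rm = TRUE)
  )

print(labels_auto)
```

Because `labels_auto` uses the same column names as `penguins`, it can be passed to the `data` argument of `geom_text()` in the same way as a hand-written table of positions.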

Problem 3: (4 pts) The following scatter plot shows per-capita income versus number of inhabitants in all Texas counties in 2010. Use `geom_text_repel()` to label a subset of the counties by name. You can choose the counties to subset as you wish. Also, choose a theme and clean up the axis labeling, and make any other improvements to the plot design you consider appropriate.

Hint: If you're not sure how to select a subset of counties to label, check out the examples on the ggrepel website for some inspiration: https://ggrepel.slowkow.com/articles/examples.html#examples-1

``````
library(tidyverse)
library(ggrepel)  # provides geom_text_repel()

tx_census <- read_csv("https://wilkelab.org/SDS375/datasets/US_census.csv") %>%
  filter(state == "Texas") %>%
  select(county = name, pop2010, per_capita_income)

# make the random label placement of geom_text_repel() reproducible
set.seed(1234)

tx_census %>%
  mutate(
    # label only the most extreme cases:
    # high per-capita income or more than one million inhabitants
    label = ifelse(
      per_capita_income > 35000 |
        pop2010 > 1e6,
      county, ""
    )
  ) %>%
  ggplot(aes(pop2010, per_capita_income)) +
  geom_point(size = 1.5, color = "#0072B2B0") +
  geom_text_repel(
    aes(label = label),
    max.overlaps = Inf,
    force = 5,
    size = 10/.pt
  ) +
  scale_x_log10(
    name = "number of inhabitants, 2010",
    limits = c(1e1, 1e9)
  ) +
  scale_y_continuous(
    name = "per-capita income",
    labels = scales::dollar_format(),
    limits = c(8000, 45000)
  ) +
  theme_bw(12)
``````
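If a random sample of counties should be labeled in addition to the extreme ones, the labeling condition can be extended with a call to `runif()`. The sketch below illustrates this on a small made-up data frame. `toy_census` and its values are hypothetical stand-ins for `tx_census`, not real census figures, and the 20% sampling rate is an arbitrary choice.

```r
library(dplyr)

# hypothetical toy data with the same columns as tx_census; values are made up
toy_census <- data.frame(
  county = c("A", "B", "C", "D", "E"),
  pop2010 = c(5e2, 2e4, 3e5, 1.5e6, 4e6),
  per_capita_income = c(18000, 36000, 24000, 27000, 29000)
)

# fix the seed so the random selection is reproducible
set.seed(1234)

labeled <- toy_census %>%
  mutate(
    # label a random 20% of rows plus all rows that exceed either threshold
    label = ifelse(
      runif(n()) < 0.2 | per_capita_income > 35000 | pop2010 > 1e6,
      county, ""
    )
  )

print(labeled)
```

Rows that exceed a threshold are always labeled, while the remaining rows are labeled only if their random draw falls below 0.2. Rows with an empty label still take part in the repulsion, so `geom_text_repel()` keeps the visible labels away from unlabeled points as well.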