```{r global_options, include=FALSE} library(knitr) opts_chunk$set(fig.align = "center", fig.height = 3, fig.width = 4) library(tidyverse) theme_set(theme_bw(base_size = 12)) library(grid) # for `arrow()` ``` ## Lab Worksheet 5 In 1898, Hermon Bumpus, an American biologist working at Brown University, collected data on one of the first examples of natural selection directly observed in nature. Immediately following a bad winter storm, he collected 136 English house sparrows, *Passer domesticus*, and brought them indoors. Of these birds, 64 had died during the storm, but 72 recovered and survived. By comparing measurements of physical traits, Bumpus demonstrated physical differences between the dead and living birds. He interpreted this finding as evidence for natural selection as a result of this storm: ```{r} bumpus <- read_csv("http://wilkelab.org/classes/SDS348/data_sets/bumpus_full.csv") head(bumpus) ``` The data set has three categorical variables (`Sex`, with levels `Male` and `Female`, `Age`, with levels `Adult` and `Young`, and `Survival`, with levels `Alive` and `Dead`) and nine numerical variables that hold various aspects of the birds' anatomy, such as wingspread, weight, etc. **Question 1: Perform a PCA on the numerical columns of this data set. Then make three plots potting the data as PC2 vs. PC1, colored by (i) sex, (ii) age, (iii) survival.** ```{r} # Your R code here ``` **Question 2: Now visualize the rotation matrix of the PCA obtained under Question 1.** **Hint:** Look at yesterday's in class worksheet to find the code for visualizing the rotation matrix with arrows. ```{r, fig.height=10, fig.width=10} # Your R code here ``` **Question 3: Given the four plots from Questions 1 and 2, how do you interpret PC1 and PC2? What does PC1 tell you about a data point? What does PC2 tell you about a data point?** *Your answer here.* **Question 4: What percentage of the variation in the data does PC1 explain?** **Hint:** Look at yesterday's in class worksheet to find the calculation for relative variance of each principal component. ```{r} # Your R code here ``` *Your answer here.* **Question 5: Does the PCA suggest any specific physical characteristics for birds that survived? Consider only PC1 and PC2 for your answer.** *Your answer here.*