This worksheet uses the `iris` data set available in R. This data set gives the measurements, in centimeters, of sepal length, sepal width, petal length, and petal width for 50 flowers from each of 3 species of iris. The species are *Iris setosa*, *versicolor*, and *virginica*:

`head(iris)`

```
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
```

**Problem 1:** Is there a difference in sepal length between species setosa and species virginica? Perform a t test and discuss your results (1-2 sentences).

`t.test(iris$Sepal.Length[iris$Species == "setosa"], iris$Sepal.Length[iris$Species == "virginica"])`

```
##
## Welch Two Sample t-test
##
## data: iris$Sepal.Length[iris$Species == "setosa"] and iris$Sepal.Length[iris$Species == "virginica"]
## t = -15.386, df = 76.516, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.78676 -1.37724
## sample estimates:
## mean of x mean of y
##     5.006     6.588
```

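There is a significant difference in sepal length between the two species (Welch t test, p < 2.2e-16). Sepals of species *virginica* are on average about 1.6 cm longer than sepals of species *setosa* (95% confidence interval: 1.38 to 1.79 cm).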
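The 1.6 cm figure quoted above can also be read directly off the test object rather than from the printed output. The following is a minimal sketch that uses the formula interface of `t.test` instead of the two-vector call shown earlier; the variable names `two_species`, `res`, `mean_diff`, and `ci` are illustrative and not part of the original worksheet.

```r
# Keep only the two species being compared; droplevels() removes the unused
# "versicolor" level so the grouping factor has exactly two levels.
two_species <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# Welch two-sample t test via the formula interface (response ~ group).
res <- t.test(Sepal.Length ~ Species, data = two_species)

# Difference in group means (virginica minus setosa), in cm.
mean_diff <- unname(res$estimate[2] - res$estimate[1])

# 95% confidence interval for the difference (setosa minus virginica).
ci <- res$conf.int

print(mean_diff)
print(ci)
```

The formula interface should give the same test as the two-vector call, since the groups are compared in factor-level order (*setosa* first, then *virginica*).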

**Problem 2:** Make side-by-side box plots of sepal length for the three species. Discuss what patterns you observe (1-2 sentences).

`boxplot(iris$Sepal.Length ~ iris$Species, ylab = "Sepal Length (cm)")`

Sepal length seems to increase from *setosa* to *versicolor* to *virginica.*
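To back up the visual impression from the box plots with numbers, one option is to compute the median sepal length per species, since the median is the center line drawn in each box. This is a small sketch using `tapply`; the name `sepal_medians` is illustrative.

```r
# Median sepal length (cm) for each species; the result is a named vector
# ordered by factor level: setosa, versicolor, virginica.
sepal_medians <- tapply(iris$Sepal.Length, iris$Species, median)
print(sepal_medians)

# TRUE if the medians strictly increase in factor-level order.
medians_increase <- all(diff(sepal_medians) > 0)
print(medians_increase)
```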

**Problem 3:** Make a scatter plot of sepal length vs. petal length for the three species. Make a single plot that shows the data for all three species at once, in different colors. Hint: To see all data in one plot, you will have to manually set the plot limits, using the `xlim` and `ylim` parameters of the `plot` function. Discuss your results (1-2 sentences).

```
setosa <- iris[iris$Species == "setosa", ]
versicolor <- iris[iris$Species == "versicolor", ]
virginica <- iris[iris$Species == "virginica", ]
plot(setosa$Sepal.Length, setosa$Petal.Length, pch = 19, col = "blue", xlim = c(3, 8), ylim = c(1, 8), xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)")
points(versicolor$Sepal.Length, versicolor$Petal.Length, pch = 19, col = "red")
points(virginica$Sepal.Length, virginica$Petal.Length, pch = 19, col = "green")
```

*Setosa* is plotted in blue, *versicolor* in red, and *virginica* in green. Both *versicolor* and *virginica* have much longer petals than *setosa*, but only somewhat longer sepals.
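As an alternative to the three-step approach above, the same figure can be sketched with a single `plot` call by mapping each row's species to a color. Because all points are passed at once, R picks axis limits that cover every species, so `xlim` and `ylim` are not set by hand here. The names `species_cols` and `point_cols` are illustrative, and the legend is an addition not required by the problem.

```r
# Named lookup vector: one color per species (same colors as above).
species_cols <- c(setosa = "blue", versicolor = "red", virginica = "green")

# One color per row of iris, looked up by species name.
point_cols <- unname(species_cols[as.character(iris$Species)])

# All three species in one call; axis limits are chosen from the full data.
plot(iris$Sepal.Length, iris$Petal.Length, pch = 19, col = point_cols,
     xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)")

# Legend so the color coding does not need to be explained in the text.
legend("topleft", legend = names(species_cols), col = species_cols, pch = 19)
```

The explicit `xlim`/`ylim` version remains useful when the species are drawn one at a time with `plot` followed by `points`, since the first call alone determines the axes.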