Jan 24, 2019
We will try the t test on the built-in data set PlantGrowth. However, first we need to reformat the data set, which we do with the function unstack(). We store the reformatted data set in a variable plants:
head(PlantGrowth)
## weight group
## 1 4.17 ctrl
## 2 5.58 ctrl
## 3 5.18 ctrl
## 4 6.11 ctrl
## 5 4.50 ctrl
## 6 4.61 ctrl
plants <- unstack(PlantGrowth)
head(plants)
## ctrl trt1 trt2
## 1 4.17 4.81 6.31
## 2 5.58 4.17 5.12
## 3 5.18 4.41 5.54
## 4 6.11 3.59 5.50
## 5 4.50 5.87 5.37
## 6 4.61 3.83 5.29
The data set contains plant growth yield (dry weight) under one control and two treatment conditions:
boxplot(plants)
Question: Is the mean control weight significantly different from the mean weight under treatment 1? Is the mean weight under treatment 1 significantly different from the mean weight under treatment 2? Use the function t.test() to find out.
# R code goes here.
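One possible way to approach this question is sketched here. It assumes the unstacked plants data frame from above and relies on the default behavior of t.test(), which is a two-sided Welch test that does not assume equal variances. The result names t_ctrl_trt1 and t_trt1_trt2 are made up for this example.

```r
# Reshape the data so that each condition is its own column (as shown earlier).
plants <- unstack(PlantGrowth)

# Control vs. treatment 1 (Welch two-sample t test, the default of t.test()).
t_ctrl_trt1 <- t.test(plants$ctrl, plants$trt1)
t_ctrl_trt1

# Treatment 1 vs. treatment 2.
t_trt1_trt2 <- t.test(plants$trt1, plants$trt2)
t_trt1_trt2

# The p-values can also be extracted directly from the result objects.
t_ctrl_trt1$p.value
t_trt1_trt2$p.value
```

Comparing each p-value to the chosen significance level (commonly 0.05) answers the two questions.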
We will try the correlation test on the built-in data set cars. The data set contains the speed of cars and the distances taken to stop, measured in the 1920s:
head(cars)
## speed dist
## 1 4 2
## 2 4 10
## 3 7 4
## 4 7 22
## 5 8 16
## 6 9 10
Is there a relationship between speed and stopping distance? Use the function cor.test() to find out. Then make a scatterplot of speed vs. stopping distance, using the function plot().
# R code goes here.
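A minimal sketch of one possible approach follows. By default, cor.test() computes the Pearson correlation. The variable name ct is made up for this example, and the axis labels use the units given in the R documentation for cars (mph and ft).

```r
# Test for a correlation between speed and stopping distance (Pearson by default).
ct <- cor.test(cars$speed, cars$dist)
ct

# Scatterplot of speed vs. stopping distance.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)")
```

The estimated correlation coefficient is stored in ct$estimate and the p-value in ct$p.value.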
We will do a regression analysis on the data set cabbages from the R package MASS. The data set contains the weight (HeadWt), vitamin C content (VitC), the cultivar (Cult), and the planting date (Date) for 60 cabbage heads:
library(MASS) # load the MASS library to make the data set available
head(cabbages)
## Cult Date HeadWt VitC
## 1 c39 d16 2.5 51
## 2 c39 d16 2.2 55
## 3 c39 d16 3.1 45
## 4 c39 d16 4.3 42
## 5 c39 d16 2.5 53
## 6 c39 d16 4.3 50
Use a multiple regression (one response variable, several predictors) to find out whether weight and cultivar have an effect on the vitamin C content. You will need to use the functions lm() and summary().
# R code goes here.
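One way to set up such a model is sketched below, with VitC as the response and HeadWt and Cult as additive predictors. The name fit is made up for this example, and a model with an interaction term would be an equally reasonable choice.

```r
library(MASS)  # provides the cabbages data set

# Linear model: vitamin C content as a function of head weight and cultivar.
fit <- lm(VitC ~ HeadWt + Cult, data = cabbages)

# Coefficient estimates, standard errors, and p-values for each predictor.
summary(fit)
```

In the summary table, the row for HeadWt describes the effect of weight, and the row for Cultc52 describes the difference between cultivar c52 and the reference cultivar c39.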
Look into the function predict(). Can you use it to estimate the vitamin C content of a c52 cultivar with a weight of 4? Can you use it to calculate the residuals of the regression model?
# R code goes here.
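The following sketch shows one possible use of predict(). It assumes an additive model of VitC on HeadWt and Cult, which is refit here so the example runs on its own. The names fit, new_head, and res_manual are made up for this example.

```r
library(MASS)  # provides the cabbages data set

# Assumed model: vitamin C content as a function of head weight and cultivar.
fit <- lm(VitC ~ HeadWt + Cult, data = cabbages)

# New observation: a head of cultivar c52 with a weight of 4.
new_head <- data.frame(
  HeadWt = 4,
  Cult   = factor("c52", levels = levels(cabbages$Cult))
)
predict(fit, newdata = new_head)

# Without newdata, predict() returns the fitted values for the original data,
# so the residuals are observed minus predicted values.
res_manual <- cabbages$VitC - predict(fit)

# Check against the residuals stored in the model object.
all.equal(unname(res_manual), unname(residuals(fit)))
```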