2025-02-23
Scale transformations are applied before statistical transformations
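A minimal sketch of why the order matters, in plain Python with hypothetical values: averaging after a log transformation does not give the same result as log-transforming the average.

```python
import math
from statistics import mean

# Hypothetical values spanning several orders of magnitude
values = [1, 10, 100, 1000]

# Scale transformation first, then the statistic: mean of the log10 values
mean_of_logs = mean(math.log10(v) for v in values)

# Statistic first, then the scale transformation: log10 of the mean
log_of_mean = math.log10(mean(values))

print(f"mean of logs: {mean_of_logs:.2f}")
print(f"log of mean:  {log_of_mean:.2f}")
```

The first number corresponds to the geometric mean of the data, the second to the arithmetic mean, so a summary drawn on a transformed axis depends on which step runs first.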
Housing prices follow long-term exponential growth, overlaid with boom/bust cycles
House Price Index (HPI) for California. Source: Freddie Mac
Did housing prices in California decline substantially from 1990 to 1998?
Did housing prices in West Virginia recover by 2020?
US States House Price Index (HPI). Source: Freddie Mac
Did housing prices in California decline substantially from 1990 to 1998? — yes
Did housing prices in West Virginia recover by 2020? — no
US States House Price Index (HPI). Source: Freddie Mac
Two choices:
It is critical to make the correct choice for the dataset at hand
Any growth or decay process in which the change is proportional to the present value is exponential and must be analyzed in log space, where it becomes a straight line
US States House Price Index (HPI). Source: Freddie Mac
CO2 abundance in the atmosphere over time. Source: NOAA Global Monitoring Laboratory
We can use STL (Seasonal and Trend decomposition using Loess) to decompose a time series into:
long-term trend
seasonal effect
remainder (noise)
Magnitude of remainder should be small compared to magnitude of seasonal fluctuations
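The sketch below illustrates the three components and the magnitude check. It is not STL itself: as a simpler stand-in it uses a classical additive decomposition (centered moving average for the trend, per-month averages for the seasonal effect). The monthly series is hypothetical, and the function name `centered_moving_average` is made up for this example.

```python
import math
import random
from statistics import mean

random.seed(0)
period = 12
n = 10 * period

# Hypothetical monthly series: linear trend + seasonal cycle + small noise
series = [
    350 + 0.15 * t + 3 * math.sin(2 * math.pi * t / period) + random.gauss(0, 0.3)
    for t in range(n)
]

def centered_moving_average(x, period):
    """Centered moving average over one full period; None where the window does not fit."""
    half = period // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half : t + half + 1]
        # The two end points get half weight so the window covers exactly one period
        total = sum(window) - 0.5 * (window[0] + window[-1])
        out[t] = total / period
    return out

# Long-term trend
trend = centered_moving_average(series, period)

# Seasonal effect: average detrended value per month, centered to mean zero
detrended = [
    (t % period, x - tr)
    for t, (x, tr) in enumerate(zip(series, trend))
    if tr is not None
]
monthly = [mean(d for m, d in detrended if m == k) for k in range(period)]
offset = mean(monthly)
seasonal = [m - offset for m in monthly]

# Remainder: what is left after removing trend and seasonal effect
remainder = [
    x - tr - seasonal[t % period]
    for t, (x, tr) in enumerate(zip(series, trend))
    if tr is not None
]

seasonal_range = max(seasonal) - min(seasonal)
remainder_range = max(remainder) - min(remainder)
print(f"seasonal range: {seasonal_range:.2f}, remainder range: {remainder_range:.2f}")
```

If the remainder range were comparable to the seasonal range, the decomposition would not be capturing the structure of the series well.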
Simpler approaches:
More complex approaches:
All of these are beyond the scope of this class
First dataset: blue_jays
# A tibble: 123 × 8
   bird_id    sex   bill_depth_mm bill_width_mm bill_length_mm head_length_mm
   <chr>      <chr>         <dbl>         <dbl>          <dbl>          <dbl>
 1 0000-00000 M              8.26          9.21           25.9           56.6
 2 1142-05901 M              8.54          8.76           25.0           56.4
 3 1142-05905 M              8.39          8.78           26.1           57.3
 4 1142-05907 F              7.78          9.3            23.5           53.8
 5 1142-05909 M              8.71          9.84           25.5           57.3
 6 1142-05911 F              7.28          9.3            22.2           52.2
 7 1142-05912 M              8.74          9.28           25.4           57.1
 8 1142-05914 M              8.72          9.94           30             60.7
 9 1142-05917 F              8.2           9.01           22.8           52.8
10 1142-05920 F              7.67          9.31           24.6           54.9
# ℹ 113 more rows
# ℹ 2 more variables: body_mass_g <dbl>, skull_size_mm <dbl>
Second dataset: cars93
# A tibble: 93 × 27
   Manufacturer Model      Type   Min.Price Price Max.Price MPG.city MPG.highway
   <chr>        <chr>      <chr>      <dbl> <dbl>     <dbl>    <dbl>       <dbl>
 1 Acura        Integra    Small       12.9  15.9      18.8       25          31
 2 Acura        Legend     Midsi…      29.2  33.9      38.7       18          25
 3 Audi         90         Compa…      25.9  29.1      32.3       20          26
 4 Audi         100        Midsi…      30.8  37.7      44.6       19          26
 5 BMW          535i       Midsi…      23.7  30        36.2       22          30
 6 Buick        Century    Midsi…      14.2  15.7      17.3       22          31
 7 Buick        LeSabre    Large       19.9  20.8      21.7       19          28
 8 Buick        Roadmaster Large       22.6  23.7      24.9       16          25
 9 Buick        Riviera    Midsi…      26.3  26.3      26.3       19          27
10 Cadillac     DeVille    Large       33    34.7      36.3       16          25
# ℹ 83 more rows
# ℹ 19 more variables: AirBags <chr>, DriveTrain <chr>, Cylinders <chr>,
# EngineSize <dbl>, Horsepower <dbl>, RPM <dbl>, Rev.per.mile <dbl>,
# Man.trans.avail <chr>, Fuel.tank.capacity <dbl>, Passengers <dbl>,
# Length <dbl>, Wheelbase <dbl>, Width <dbl>, Turn.circle <dbl>,
# Rear.seat.room <dbl>, Luggage.room <dbl>, Weight <dbl>, Origin <chr>,
# Make <chr>
geom_smooth()
Scatter plot only
Scatter plot with loess smooth
Scatter plot with linear regression
Scatter plot with linear regression, no confidence band
Scatter plot with linear regression by sex
Do more expensive cars have a larger fuel tank?
Caution: Exact shape of smoothing line depends on method details
Smoothing lines are particularly unreliable near their endpoints
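Both cautions can be illustrated with a small sketch. It uses a Gaussian kernel smoother as a simple stand-in for loess (not the same algorithm), with hypothetical data and hypothetical helper names `kernel_smooth` and `points_within`. Changing the bandwidth changes the curve, and fewer observations support the estimate at the ends of the data range.

```python
import math

def kernel_smooth(xs, ys, x0, bandwidth):
    """Gaussian-weighted local average of ys, evaluated at x0."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def points_within(xs, x0, bandwidth):
    """Number of observations within one bandwidth of x0."""
    return sum(1 for x in xs if abs(x - x0) <= bandwidth)

# Hypothetical data: a curved relationship on evenly spaced x values
xs = [i / 2 for i in range(41)]  # 0, 0.5, ..., 20
ys = [math.sqrt(x) + 0.3 * math.sin(5 * x) for x in xs]

# Same data, same method, two different bandwidths
narrow = [kernel_smooth(xs, ys, x, 0.5) for x in xs]
wide = [kernel_smooth(xs, ys, x, 3.0) for x in xs]
max_gap = max(abs(a - b) for a, b in zip(narrow, wide))
print(f"largest difference between the two smooths: {max_gap:.2f}")

# Near an endpoint only one side of the window contains data
print(points_within(xs, 10.0, 3.0))  # middle of the range: 13 points
print(points_within(xs, 0.0, 3.0))   # left endpoint: 7 points
```

The estimate at the endpoint rests on roughly half as many observations, all lying on one side, which is one reason smoothing lines tend to be least trustworthy there.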