class: center, middle, title-slide

.title[
# Coordinate systems and axes
]
.author[
### Claus O. Wilke
]
.date[
### last updated: 2023-01-19
]

---

## Most data visualizations use Cartesian coordinates

.center[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Changing units does not change the plot

.pull-left[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

.pull-right[
<!-- -->
]

---

## If scale units are unrelated, aspect ratio is arbitrary

.center[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---
class: center middle

## Non-linear scales and coordinate systems

---

## Logarithmic scales (log scales)

.small-text[
Visualize these five values: 1, 3.16, 10, 31.6, 100
]

--

.center.nogap[
<!-- -->
]

--

.center.nogap[
<!-- -->
]

--

.center.nogap[
<!-- -->
]

---

## Example: Population number of Texas counties

A linear scale emphasizes large counties

.center.nogap[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Example: Population number of Texas counties

A log scale shows symmetry around the median

.center.nogap[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Nonlinear coordinate systems: Polar coordinates

.pull-left[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

.pull-right[
<!-- -->
]

---

## Cartesian vs polar example

.pull-left[
<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

.pull-right[
<!-- -->
]

[//]: # "segment ends here"

---
class: center middle

## Scales and coordinate systems in **ggplot2**

---

## Getting the data

The `boxoffice` dataset:

.tiny-font[
```r
library(tidyverse) # provides tibble(), read_csv(), %>%, fct_reorder(), ggplot()

boxoffice <- tibble(
  rank = 1:5,
  title = c("Star Wars", "Jumanji", "Pitch Perfect 3", "Greatest Showman", "Ferdinand"),
  amount = c(71.57, 36.17, 19.93, 8.81, 7.32) # million USD
)
```
]

--

The `tx_counties` dataset:

.tiny-font[
```r
tx_counties <- read_csv("https://wilkelab.org/DSC385/datasets/US_census.csv") %>%
  filter(state == "Texas") %>%
  mutate(popratio = pop2010/median(pop2010)) %>%
  arrange(desc(popratio)) %>%
  mutate(index = 1:n())
```
]

---

## Getting the data

The `temperatures` dataset:

.tiny-font[
```r
temperatures <- read_csv("https://wilkelab.org/DSC385/datasets/tempnormals.csv") %>%
  mutate(
    location = factor(
      location, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
    )
  ) %>%
  select(location, station_id, day_of_year, month, temperature)
```
]

---

## Scale functions customize the x and y axes

Recall the box-office example from a prior lecture

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col()
```
]

--

.pull-right[
<!-- -->
]

---

## Scale functions customize the x and y axes

Add scale functions (no change in figure so far)

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
* scale_x_continuous() +
* scale_y_discrete()
```
]

.pull-right[
<!-- -->
]

---

## Scale functions customize the x and y axes

The parameter `name` sets the axis title

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
*   name = "weekend gross (million USD)"
  ) +
  scale_y_discrete(
*   name = NULL  # no axis title
  )
```
]

.pull-right[
<!-- -->
]

Note: We could do the same with `xlab()` and `ylab()`

---

## Scale functions customize the x and y axes

The parameter `limits` sets
the scale limits

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
*   limits = c(0, 80)
  ) +
  scale_y_discrete(
    name = NULL
  )
```
]

.pull-right[
<!-- -->
]

Note: We could do the same with `xlim()` and `ylim()`

---

## Scale functions customize the x and y axes

The parameter `breaks` sets the axis tick positions

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
*   breaks = c(0, 25, 50, 75)
  ) +
  scale_y_discrete(
    name = NULL
  )
```
]

.pull-right[
<!-- -->
]

---

## Scale functions customize the x and y axes

The parameter `labels` sets the axis tick labels

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
*   labels = c("0", "$25M", "$50M", "$75M")
  ) +
  scale_y_discrete(
    name = NULL
  )
```
]

.pull-right[
<!-- -->
]

---

## Scale functions customize the x and y axes

The parameter `expand` sets the axis expansion

.pull-left.tiny-font[
```r
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
    labels = c("0", "$25M", "$50M", "$75M"),
*   expand = expansion(mult = c(0, 0.06))
  ) +
  scale_y_discrete(
    name = NULL
  )
```
]

.pull-right[
<!-- -->
]

---

## Scale functions define transformations

.pull-left.nogap[

Linear y scale:

.tiny-font[
```r
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
* scale_y_continuous()
```

<!-- -->
]]

--

.pull-right.nogap[

Log y scale:

.tiny-font[
```r
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
* scale_y_log10()
```

<!-- -->
]]

---

## Parameters work the same for all scale functions

.pull-left.nogap.tiny-font[
```r
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_continuous(
    name = "population number / median",
    breaks = c(0, 100, 200),
    labels = c("0", "100", "200")
  )
```

<!-- -->
]

.pull-right.nogap.tiny-font[
```r
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_log10(
    name = "population number / median",
    breaks = c(0.01, 1, 100),
    labels = c("0.01", "1", "100")
  )
```

<!-- -->
]

---

## Coords define the coordinate system

.nogap.tiny-font[
```r
ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
* coord_cartesian()  # cartesian coords are the default
```

.center[
<!-- -->
]]

---

## Coords define the coordinate system

.nogap.tiny-font[
```r
ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
* coord_polar()   # polar coords
```

.center[
<!-- -->
]]

---

## Coords define the coordinate system

.nogap.tiny-font[
```r
ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_polar() +
* scale_y_continuous(limits = c(0, 105))  # fix up temperature limits
```

.center[
<!-- -->
]]

[//]: # "segment ends here"

---

## Further reading

- Fundamentals of Data Visualization: [Chapter 3: Coordinate systems and axes](https://clauswilke.com/dataviz/coordinate-systems-axes.html)
- **ggplot2** reference documentation: [Scales](https://ggplot2.tidyverse.org/reference/index.html#section-scales)
- **ggplot2** reference documentation: [Coordinate systems](https://ggplot2.tidyverse.org/reference/index.html#section-coordinate-systems)
- **ggplot2** book: [Position scales](https://ggplot2-book.org/scale-position.html)
- **ggplot2** book: [Coordinate systems](https://ggplot2-book.org/coord.html)
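
---

## Worked example: the five log-scale values

The five values from the log-scale slide (1, 3.16, 10, 31.6, 100) can be plotted with the same scale functions used for the box-office and county examples. This is a minimal sketch, assuming only that **ggplot2** is installed; the names `five`, `log_steps`, `p_linear`, and `p_log` are hypothetical and not part of the original deck. Each value is roughly √10 times the previous one, so the points should appear evenly spaced on the log scale and increasingly spread out on the linear scale.

.tiny-font[
```r
library(ggplot2)

# The five values from the log-scale slide (hypothetical data frame name)
five <- data.frame(value = c(1, 3.16, 10, 31.6, 100))

# Spacing between consecutive values after a log10 transformation;
# each step is close to 0.5 because each value is about sqrt(10) times the last
log_steps <- diff(log10(five$value))

# Linear x scale: small values crowd together near the origin
p_linear <- ggplot(five, aes(x = value, y = 0)) +
  geom_point() +
  scale_x_continuous(name = "linear scale") +
  scale_y_continuous(name = NULL, breaks = NULL)

# Log x scale: same data, same parameters, evenly spaced points
p_log <- ggplot(five, aes(x = value, y = 0)) +
  geom_point() +
  scale_x_log10(name = "log scale", breaks = five$value) +
  scale_y_continuous(name = NULL, breaks = NULL)
```
]

Printing `p_linear` and `p_log` (for example with `print(p_log)`) draws the two versions; only the scale function differs between them.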