Coordinate systems and axes

Claus O. Wilke


Most data visualizations use Cartesian coordinates


Changing units does not change the plot
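A minimal sketch of why this holds, assuming the unit change is linear (here Fahrenheit to Celsius). The temperature values and the helper rescale01() are made up for illustration. Once each version is rescaled to the extent of the axis, the point positions coincide, so only the tick labels differ.

```r
# Hypothetical temperatures in Fahrenheit (illustrative values only)
temp_f <- c(32, 50, 68, 86, 104)

# Convert to Celsius: a linear change of units
temp_c <- (temp_f - 32) * 5 / 9

# Hypothetical helper: map values onto [0, 1], as an axis does when placing points
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

pos_f <- rescale01(temp_f)
pos_c <- rescale01(temp_c)

# Relative positions along the axis agree in both units
all.equal(pos_f, pos_c)
```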



If scale units are unrelated, aspect ratio is arbitrary


Nonlinear scales and coordinate systems

Logarithmic scales (log scales)

Visualize these five values: 1,   3.16,   10,   31.6,   100
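As a quick check in base R, the five values are unevenly spaced on a linear scale but (approximately) evenly spaced after a log10 transformation, because each value is roughly the previous one times the square root of 10.

```r
values <- c(1, 3.16, 10, 31.6, 100)

# On a linear scale the gaps between neighbors grow
diff(values)

# On a log10 scale the values sit about 0.5 apart
log_values <- log10(values)
round(log_values, 2)        # 0.0 0.5 1.0 1.5 2.0
round(diff(log_values), 2)  # 0.5 0.5 0.5 0.5
```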




Example: Population number of Texas counties

A linear scale emphasizes large counties


Example: Population number of Texas counties

A log scale shows symmetry around the median


Nonlinear coordinate systems: Polar coordinates
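A small sketch of the underlying mapping, using the standard mathematical convention (angle measured counterclockwise from the positive x axis). The points are hypothetical. Note that ggplot2's coord_polar() by default measures the angle clockwise from the 12 o'clock position, so the orientation on screen differs from this convention.

```r
# Hypothetical points given as radius r and angle theta (in radians)
r <- c(1, 1, 2)
theta <- c(0, pi / 2, pi)

# Convert polar coordinates to Cartesian coordinates
x <- r * cos(theta)
y <- r * sin(theta)

# Round away floating-point noise for display
round(cbind(x, y), 10)
```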



Cartesian vs polar example



Scales and coordinate systems in ggplot2

Getting the data

The boxoffice dataset:

boxoffice <- tibble(
  rank = 1:5,
  title = c("Star Wars", "Jumanji", "Pitch Perfect 3", "Greatest Showman", "Ferdinand"),
  amount = c(71.57, 36.17, 19.93, 8.81, 7.32) # million USD
)

The tx_counties dataset:

tx_counties <- read_csv("") |> 
  filter(state == "Texas") |>
  mutate(popratio = pop2010/median(pop2010)) |>
  arrange(desc(popratio)) |>
  mutate(index = 1:n())

Getting the data

The temperatures and temps_wide datasets (long and wide format of the same data):

# long format
temperatures <- read_csv("") |>
  mutate(
    location = factor(
      location, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
    )
  ) |>
  select(location, station_id, day_of_year, month, temperature)

# wide format
temps_wide <- temperatures |>
  pivot_wider(
    id_cols = c("month", "day_of_year"),
    names_from = "location", values_from = "temperature"
  )

Scale functions customize the x and y axes

Recall the box-office example from a prior lecture:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col()

Scale functions customize the x and y axes

Add scale functions (no change in figure so far):

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous() +
  scale_y_discrete()

Scale functions customize the x and y axes

The parameter name sets the axis title:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)"
  ) +
  scale_y_discrete(
    name = NULL  # no axis title
  )

Note: We could do the same with xlab() and ylab().

Scale functions customize the x and y axes

The parameter limits sets the scale limits:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80)
  ) +
  scale_y_discrete(
    name = NULL
  )

Note: We could do the same with xlim() and ylim(), but I advise against it, as these functions can have unexpected side effects.
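One behavior worth knowing when setting limits, which may or may not be what the note above has in mind: limits set on a scale (via limits, xlim(), or ylim()) turn out-of-range values into missing values before any statistics are computed, whereas limits set on the coordinate system only zoom the view. The sketch below uses a made-up data frame, toy, and compares the mean computed by stat_summary() under both approaches. With these values, the two means should differ.

```r
library(ggplot2)

# Toy data (hypothetical): one group containing a single large value
toy <- data.frame(group = "a", value = c(1, 2, 3, 100))

base <- ggplot(toy, aes(group, value)) +
  stat_summary(fun = mean, geom = "point")

# Scale limits: the value 100 falls outside the range and is dropped
# before the mean is computed
p_scale <- base + scale_y_continuous(limits = c(0, 10))

# Coord limits: all data are kept, only the visible region changes
p_coord <- base + coord_cartesian(ylim = c(0, 10))

# Extract the computed means (the scale version warns about removed rows)
mean_scale <- suppressWarnings(layer_data(p_scale)$y)
mean_coord <- layer_data(p_coord)$y
```

Here mean_scale is computed from the three in-range values only, while mean_coord uses all four.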

Scale functions customize the x and y axes

The parameter breaks sets the axis tick positions:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75)
  ) +
  scale_y_discrete(
    name = NULL
  )

Scale functions customize the x and y axes

The parameter labels sets the axis tick labels:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
    labels = c("0", "$25M", "$50M", "$75M")
  ) +
  scale_y_discrete(
    name = NULL
  )
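The labels do not have to be typed by hand. As a sketch in base R, the same label vector can be derived from the breaks, which keeps the two in sync if the breaks change later. The variable names here are illustrative only.

```r
breaks <- c(0, 25, 50, 75)

# Label zero plainly and format all other breaks as dollar amounts in millions
labels <- ifelse(breaks == 0, "0", paste0("$", breaks, "M"))
labels  # "0" "$25M" "$50M" "$75M"
```

The resulting vectors could then be passed as breaks = breaks and labels = labels in the scale function.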

Scale functions customize the x and y axes

The parameter expand sets the axis expansion:

ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
    labels = c("0", "$25M", "$50M", "$75M"),
    expand = expansion(mult = c(0, 0.06))
  ) +
  scale_y_discrete(
    name = NULL
  )
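To make the effect of the expand setting concrete, here is the arithmetic behind a multiplicative expansion, sketched in base R for the limits used above. The two mult values apply to the lower and upper end of the scale, respectively, and are multiplied by the width of the limits.

```r
limits <- c(0, 80)
mult <- c(0, 0.06)  # lower and upper multiplicative expansion

# Width of the scale limits
span <- diff(limits)

# No padding on the left, 6% of the span added on the right
expanded <- c(limits[1] - mult[1] * span, limits[2] + mult[2] * span)
expanded  # 0.0 84.8
```

So the bars start exactly at the axis on the left, and the axis extends slightly past 80 on the right.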

Scale functions define transformations

Linear y scale:

ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_continuous()


Log y scale:

ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_log10()


Parameters work the same for all scale functions

Linear y scale:

ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_continuous(
    name = "population number / median",
    breaks = c(0, 100, 200),
    labels = c("0", "100", "200")
  )


Log y scale:

ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_log10(
    name = "population number / median",
    breaks = c(0.01, 1, 100),
    labels = c("0.01", "1", "100")
  )


Coords define the coordinate system

ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_cartesian()  # cartesian coords are the default


Coords define the coordinate system

ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_polar()   # polar coords


Coords define the coordinate system

ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_polar() +
  scale_y_continuous(limits = c(0, 105))  # fix up temperature limits


Use coord_fixed() for fixed aspect ratio

ggplot(temps_wide, aes(`San Diego`, Houston)) +
  geom_path()


(Bad: the x and y axes show the same kind of values but are scaled differently)

Use coord_fixed() for fixed aspect ratio

ggplot(temps_wide, aes(`San Diego`, Houston)) +
  geom_path() +
  coord_fixed()


(Better: the x and y axes are now scaled the same)

Use coord_fixed() for fixed aspect ratio

ggplot(temps_wide, aes(`San Diego`, Houston)) +
  geom_path() +
  coord_fixed() +
  scale_x_continuous(breaks = c(50, 60, 70), limits = c(50, 75))


(Even better: similar tick spacing along both axes)
