ggplot(penguins) +
  aes(y = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie")) +
  geom_bar() +
  ylab(NULL)
2025-02-17
penguins |>
  mutate(species = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie")) |>
  slice(1:30) |> # get first 30 rows
  pull(species) # pull out just the `species` column
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Chinstrap Gentoo Adelie
species is a factor
The order of factor levels is independent of the order of values in the table:
penguins |>
  mutate(species = fct_relevel(species, "Gentoo", "Adelie", "Chinstrap")) |>
  slice(1:30) |>
  pull(species)
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Gentoo Adelie Chinstrap
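The same independence can be checked on a small scale. The following is a minimal sketch on a made-up toy vector (not the `penguins` data): releveling changes the level order, while the values themselves stay exactly as they were.

```r
library(forcats)

# Toy factor for illustration only (not the penguins data)
species <- factor(c("Adelie", "Adelie", "Gentoo", "Chinstrap"))
levels(species) # default: alphabetic order

# Put "Gentoo" first, then "Adelie", then "Chinstrap"
releveled <- fct_relevel(species, "Gentoo", "Adelie", "Chinstrap")
levels(releveled)

# The values are unchanged; only the level order differs
identical(as.character(species), as.character(releveled))
```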
fct_relevel()
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Adelie Chinstrap Gentoo
Default: alphabetic order
fct_relevel()
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Gentoo Adelie Chinstrap
Move "Gentoo" in front, rest alphabetic
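The code behind the two outputs above is not shown here. Calls along the following lines would give the same level orders; this is a sketch on a toy factor, and the exact code used for the outputs is an assumption.

```r
library(forcats)

# Toy factor standing in for the `species` column (assumption)
species <- factor(c("Adelie", "Chinstrap", "Gentoo", "Adelie"))

# No levels listed: the existing (alphabetic) order is kept
default_order <- fct_relevel(species)
levels(default_order)

# One level listed: it moves to the front, the rest keep their order
gentoo_first <- fct_relevel(species, "Gentoo")
levels(gentoo_first)
```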
fct_relevel()
penguins |>
  mutate(species = fct_relevel(species, "Chinstrap", "Gentoo")) |>
  slice(1:30) |>
  pull(species)
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Chinstrap Gentoo Adelie
Move "Chinstrap" in front, then "Gentoo", rest alphabetic
fct_relevel()
penguins |>
  mutate(species = fct_relevel(species, "Chinstrap", "Adelie", "Gentoo")) |>
  slice(1:30) |>
  pull(species)
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Chinstrap Adelie Gentoo
Use order "Chinstrap", "Adelie", "Gentoo"
fct_infreq()
[1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
[21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
Levels: Adelie Gentoo Chinstrap
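In the output above, the levels are sorted by how often each species occurs, most frequent first. A minimal sketch of the same behavior on a made-up toy factor (the counts are chosen for illustration only):

```r
library(forcats)

# Toy factor: "Gentoo" occurs 3 times, "Adelie" twice, "Chinstrap" once
species <- factor(
  c("Gentoo", "Adelie", "Gentoo", "Chinstrap", "Adelie", "Gentoo")
)

levels(species) # alphabetic by default

by_freq <- fct_infreq(species)
levels(by_freq) # most frequent level first
```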
fct_infreq()
fct_rev()
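`fct_rev()` is often combined with `fct_infreq()`. On a discrete y axis the first level is drawn at the bottom, so reversing a frequency ordering should place the most frequent level at the top of a horizontal bar chart. A small sketch on a made-up toy factor:

```r
library(forcats)

# Toy factor (made up): Gentoo x3, Adelie x2, Chinstrap x1
species <- factor(
  c("Gentoo", "Adelie", "Gentoo", "Chinstrap", "Adelie", "Gentoo")
)

by_freq <- fct_infreq(species) # most frequent level first
reversed <- fct_rev(by_freq)   # least frequent level first

levels(by_freq)
levels(reversed)
```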
fct_reorder()
The order is ascending, from smallest to largest value
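A minimal sketch of this ascending order, using made-up values (the names `country` and `life_exp` are hypothetical and only echo the kind of data shown in these slides). The `.desc = TRUE` argument flips the direction.

```r
library(forcats)

# Toy data: one numeric value per group (made up for illustration)
country <- factor(c("A", "B", "C"))
life_exp <- c(70, 50, 60)

ascending <- fct_reorder(country, life_exp)
descending <- fct_reorder(country, life_exp, .desc = TRUE)

levels(ascending)  # smallest value first
levels(descending) # largest value first
```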
fct_reorder()
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
<fct> <fct> <dbl> <dbl> <int> <int>
1 Adelie Torgersen 39.1 18.7 181 3750
2 Adelie Torgersen 39.5 17.4 186 3800
3 Adelie Torgersen 40.3 18 195 3250
4 Adelie Torgersen NA NA NA NA
5 Adelie Torgersen 36.7 19.3 193 3450
6 Adelie Torgersen 39.3 20.6 190 3650
7 Adelie Torgersen 38.9 17.8 181 3625
8 Adelie Torgersen 39.2 19.6 195 4675
9 Adelie Torgersen 34.1 18.1 193 3475
10 Adelie Torgersen 42 20.2 190 4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>
# A tibble: 1,704 × 6
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
Reminder: Default order is alphabetic, from bottom to top
Order is ascending from bottom to top
Use fct_reorder() and see what happens
fct_reorder() applies a summary function (default: median())
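When a group has several values, the summary function decides the order. The sketch below uses made-up numbers to contrast the default (`median()`) with `max()`, passed via the `.fun` argument.

```r
library(forcats)

# Toy data with several values per group (made up for illustration)
group <- factor(c("a", "a", "a", "b", "b", "b"))
value <- c(1, 2, 10, 4, 5, 6)

# Default summary is the median: a = 2, b = 5
by_median <- fct_reorder(group, value)
levels(by_median)

# Ordering by the maximum instead: a = 10, b = 6
by_max <- fct_reorder(group, value, .fun = max)
levels(by_max)
```

The two orderings differ here because group `a` has one large outlier, which affects the maximum but not the median.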
Dataset: Flights out of New York City in 2013
library(nycflights13)
flight_data <- flights |> # take data on individual flights
  left_join(airlines) |> # add in full-length airline names
  select(name, carrier, flight, year, month, day, origin, dest) # pick columns of interest

flight_data
# A tibble: 336,776 × 8
name carrier flight year month day origin dest
<chr> <chr> <int> <int> <int> <int> <chr> <chr>
1 United Air Lines Inc. UA 1545 2013 1 1 EWR IAH
2 United Air Lines Inc. UA 1714 2013 1 1 LGA IAH
3 American Airlines Inc. AA 1141 2013 1 1 JFK MIA
4 JetBlue Airways B6 725 2013 1 1 JFK BQN
5 Delta Air Lines Inc. DL 461 2013 1 1 LGA ATL
6 United Air Lines Inc. UA 1696 2013 1 1 EWR ORD
7 JetBlue Airways B6 507 2013 1 1 EWR FLL
8 ExpressJet Airlines Inc. EV 5708 2013 1 1 LGA IAD
9 JetBlue Airways B6 79 2013 1 1 JFK MCO
10 American Airlines Inc. AA 301 2013 1 1 LGA ORD
# ℹ 336,766 more rows
As (almost) always, the default alphabetic ordering is terrible
Ordering by frequency is better, but do we need to show all airlines?
Now the ordering is again alphabetic…
In most cases, you will want to order before lumping
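Why the order of operations matters can be seen on a toy factor (made up for illustration). Lumping alone keeps the surviving levels in their original order; ordering by frequency first keeps them sorted by count, with "Other" last in both cases.

```r
library(forcats)

# Toy factor (made up): counts are d = 5, b = 4, a = 3, c = 1, e = 1
x <- factor(c(rep("d", 5), rep("b", 4), rep("a", 3), "c", "e"))

# Lump only: the kept levels stay in their original (alphabetic) order
lump_only <- fct_lump_n(x, 2)
levels(lump_only)

# Order by frequency first, then lump: kept levels are sorted by count
order_then_lump <- fct_lump_n(fct_infreq(x), 2)
levels(order_then_lump)
```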
Can we visually separate the “Other” category?
One annoying issue: The legend is in the wrong order
flight_data |>
  mutate(
    name = fct_lump_n(fct_infreq(name), 7),
    # Use `fct_other()` to manually lump all
    # levels not called "Other" into "Named"
    highlight = fct_other(
      name,
      keep = "Other", other_level = "Named"
    )
  ) |>
  ggplot() +
  aes(
    y = fct_rev(name),
    # reverse fill aesthetic
    fill = fct_rev(highlight)
  ) +
  geom_bar()
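The reason the legend needs reversing can be traced to how `fct_other()` orders its result: the level given as `other_level` is moved to the end. A minimal sketch on a made-up toy factor (the values `"x"` and `"y"` are hypothetical stand-ins for airline names):

```r
library(forcats)

# Toy factor with an "Other" level, as fct_lump_n() would produce (made up)
name <- factor(c("x", "y", "Other", "x"), levels = c("x", "y", "Other"))

highlight <- fct_other(name, keep = "Other", other_level = "Named")
levels(highlight) # "Other" comes first, "Named" is moved to the end

highlight_rev <- fct_rev(highlight)
levels(highlight_rev) # reversed, so "Named" comes first
```

Since discrete legends follow the level order, reversing the factor should list "Named" before "Other".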
flight_data |>
  mutate(
    name = fct_lump_n(fct_infreq(name), 7),
    highlight = fct_other(
      name, keep = "Other", other_level = "Named"
    )
  ) |>
  ggplot() +
  aes(y = fct_rev(name), fill = highlight) +
  geom_bar() +
  scale_x_continuous(
    name = "Number of flights",
    expand = expansion(mult = c(0, 0.07))
  ) +
  scale_y_discrete(name = NULL) +
  scale_fill_manual(
    # named values, so the colors do not depend on level order
    values = c(
      Named = "gray50", Other = "#98545F"
    ),
    guide = "none"
  ) +
  theme_minimal_vgrid() # from the cowplot package
Function | Use case | Documentation |
---|---|---|
fct_relevel() | Change order of factor levels manually | click here |
fct_infreq() | Put levels in descending order of how frequently each level occurs in the data | click here |
fct_rev() | Reverse the order of factor levels | click here |
fct_reorder() | Put levels in ascending order determined by a numeric variable or summary function | click here |
fct_lump_n() | Retain the n most frequent levels and lump all others into "Other" | click here |
fct_other() | Manually group some factor levels into "Other" | click here |
For more options, check out the reference documentation of the forcats package