Developing a custom theme in `ggplot2`

While it’s hard to beat the ease and expressiveness of ggplot2 for most of my visualization needs, I haven’t always been terribly happy with how its default settings look. The process of creating a presentation-quality plot usually involves toying around with the 80+ arguments of theme() and trying to remember what they control and what kind of element_ they are. It’s easy enough to remember how to change an axis.title, but how often do I need to change the legend.box.margin? How about the strip.switch.pad.grid?
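One way to jog that memory is to inspect a complete built-in theme and see what each setting holds. The snippet below is only a sketch: it assumes theme_grey() as the reference theme, and base_theme is a name made up for the example.

```r
library(ggplot2)

# A complete built-in theme to use as a reference (assumed for illustration)
base_theme <- theme_grey()

# List every setting the complete theme defines
names(base_theme)

# Inspect the class of a few settings to see what kind of value each expects
class(base_theme$axis.title)            # a text element
class(base_theme$legend.box.margin)     # a margin object
class(base_theme$strip.switch.pad.grid) # a unit
```

The exact class names printed may vary between ggplot2 versions, so treat the comments as a rough guide.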

After giving it some thought, I realized it would be a good idea to write up a quick list of default settings to reference whenever I wanted to quickly improve the look of a plot. I went for form over function in order to counterbalance the effective yet quite ugly ggplot2 defaults.

Here’s a demonstration of what I came up with:

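library(tidyverse) # ggplot2, dplyr, tidyr, tibble
library(lubridate) # hour(), minute(), ymd_hms()

iris %>% 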
  ggplot(aes(Sepal.Length, Sepal.Width, color = Species)) + 
  geom_point() + 
  labs(
    title = "Before"
  )

iris %>% 
  ggplot(aes(Sepal.Length, Sepal.Width, color = Species)) + 
  geom_point() + 
  labs(
    title = "After"
  ) + 
  custom_theme + 
  scale_color_manual(values = custom_palette)

As you can see, I went for a darker design — I find the deep navy to be easier on the eye than white and grey, and I like how it makes the data pop. I changed the default font to a blue-grey Franklin Gothic face, inspired by the text that typically annotates plots on The Upshot. I created the custom color palette by hand, with some inspiration from coloors.com.

You’ll also notice that, true to my form-over-function philosophy, I did away with gridlines. This was a purely aesthetic, non-scientific decision, and in any case I can always bring them back with a quick call to theme().
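Something along these lines should restore the gridlines. It is a sketch rather than part of the theme itself: no_grid_theme is a stand-in that carries only the gridline setting, with_grid_theme is a name made up for the example, and the line color simply reuses the blue-grey from the text.

```r
library(ggplot2)

# Stand-in for custom_theme: only the setting that matters here
no_grid_theme <- theme(panel.grid = element_blank())

# Settings added later replace earlier ones, so this turns the gridlines back on
with_grid_theme <- no_grid_theme +
  theme(panel.grid = element_line(color = "#8B9BA8"))
```

Adding the same theme() call to the end of a plot has the same effect for that one plot.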

Here’s the exact specification in case you’re interested:

text_color <- "#8B9BA8"       # bluish grey
background_color <- "#0A081C" # deep navy

# Note: ggplot2 3.4.0 and later prefer `linewidth` over `size` in element_rect()
custom_theme <- 
  theme(
    axis.title = element_text(color = text_color),
    axis.ticks = element_line(color = text_color),
    axis.text = element_text(color = text_color),
    
    legend.background = element_rect(fill = background_color, color = text_color, size = .1),
    legend.text = element_text(color = text_color),
    legend.title = element_text(color = text_color),
    legend.key = element_rect(fill = background_color, size = 0),
    
    panel.grid = element_blank(),
    panel.background = element_rect(fill = background_color, size = 0),
    panel.border = element_blank(),
    
    plot.background = element_rect(fill = background_color, size = 0),
    plot.title = element_text(color = text_color, face = "bold"),
    plot.subtitle = element_text(color = text_color),
    plot.caption = element_text(color = text_color, face = "italic"),
    
    strip.background = element_rect(fill = background_color),
    strip.text = element_text(color = text_color, face = "bold"),
    
    text = element_text(family = "FranklinGothic-Book")
  )

custom_palette <- 
  c(
    "#F9C80E", # gold
    "#F86624", # orange
    "#52A4D3", # sky blue
    "#21D19F", # mint
    "#EA3546", # red
    "#FF938C", # pink
    "#58267F", # purple
    "#F3FFBD"  # light yellow
  )

Since setting this all up, I’ve been much happier with the way my plots are looking. While even this custom theme is not always exactly what I’m looking for, it gives me a starting point that’s often much closer to what I want than the original ggplot2 default. After prototyping a plot with the default aesthetics, I just add custom_theme and voilà!
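If adding the theme to every plot gets tedious, it can also be made the session default. The following is a rough sketch, not something the plots in this post rely on: custom_theme_sketch is a stand-in for the full custom_theme, and theme_grey() is assumed as the base to layer it on.

```r
library(ggplot2)

# Stand-in for custom_theme, which is a partial theme
custom_theme_sketch <- theme(panel.grid = element_blank())

# theme_set() works best with a complete theme, so layer the overrides
# on top of a built-in one; the previous default is returned invisibly
old_theme <- theme_set(theme_grey() + custom_theme_sketch)

# ... plots created here pick up the new defaults ...

# Restore the previous default when done
theme_set(old_theme)
```

Discrete color scales are configured separately, so the scale_color_manual() call would still be needed.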

To close things out, here are some pretty plots I made for a personal project where I’ve been tracking my sleep habits:

set.seed(94305)
health %>% 
  filter(!is.na(bedtime)) %>% 
  mutate(wake_time = lead(wake_time)) %>% 
  filter(!is.na(wake_time)) %>% 
  mutate(
    # jitter with an sd of one minute; n() always matches the filtered row count
    noise = rnorm(n(), 0, 1 / 60),
    bedtime_hour = hour(bedtime) + (minute(bedtime)) / 60 + noise,
    wake_time_hour = hour(wake_time) + (minute(wake_time)) / 60 + noise,
    bedtime_hour = if_else(bedtime_hour > 12, bedtime_hour - 24, bedtime_hour - 0),
    sleep_length = wake_time_hour - bedtime_hour
  ) %>% 
  ggplot(aes(date, wake_time_hour)) + 
  geom_segment(
    aes(yend = bedtime_hour, xend = date, color = sleep_length),
    show.legend = FALSE
  ) + 
  geom_point(color = custom_palette[1], size = .7) + 
  geom_point(aes(y = bedtime_hour), color = custom_palette[8], size = .7) +
  scale_y_continuous(
    breaks = c(-2.5, 0, 2.5, 5, 7.5, 10), 
    labels = c("9:30 PM", "12:00 AM", "2:30 AM", "5:00 AM", "7:30 AM", "10:00 AM")
  ) +
  scale_color_gradient(high = "#8B9BA8", low = "#0A081C") +
  coord_flip() + 
  labs(
    x = NULL,
    y = NULL,
    title = paste0("Sleep from ", min(health$date), " to ", max(health$date))
  ) +
  custom_theme

avg_time_labels <- 
  tribble(
    ~ date, ~ avg, ~ label,
    max(health$date) - 6, 8, "Average wake time",
    max(health$date) - 6, .8, "Average bedtime"
  )

health %>% 
  filter(!is.na(bedtime)) %>% 
  mutate(wake_time = lead(wake_time)) %>% 
  filter(!is.na(wake_time)) %>% 
  mutate(
    # jitter with an sd of one minute; n() always matches the filtered row count
    noise = rnorm(n(), 0, 1 / 60),
    bedtime_hour = hour(bedtime) + (minute(bedtime)) / 60 + noise,
    wake_time_hour = hour(wake_time) + (minute(wake_time)) / 60 + noise,
    bedtime_hour = if_else(bedtime_hour > 12, bedtime_hour - 24, bedtime_hour - 0),
    sleep_length = wake_time_hour - bedtime_hour,
    avg_bedtime = cummean(bedtime_hour),
    avg_wake_time = cummean(wake_time_hour)
  ) %>% 
  gather(key = wake_bed, value = avg, avg_bedtime, avg_wake_time) %>% 
  ggplot(aes(date, avg, color = wake_bed)) + 
  geom_line(show.legend = FALSE) +
  geom_text(data = avg_time_labels, aes(label = label), color = "#8B9BA8") + 
  scale_y_continuous(
    breaks = c(-2.5, 0, 2.5, 5, 7.5, 10), 
    labels = c("9:30 PM", "12:00 AM", "2:30 AM", "5:00 AM", "7:30 AM", "10:00 AM")
  ) + 
  scale_color_manual(values = custom_palette[2:3]) + 
  custom_theme + 
  labs(
    x = NULL, 
    y = NULL,
    title = paste0("Sleep from ", min(health$date), " to ", max(health$date))
  )

health %>% 
  filter(!is.na(bedtime)) %>% 
  mutate(
    sleep_date = if_else(
      as.character(bedtime) <= "06:00",
      date + 1,
      date
    ),
    wake_datetime = paste(date, wake_time) %>% ymd_hms(),
    sleep_datetime = paste(sleep_date, bedtime) %>% ymd_hms(),
    sleep_time = wake_datetime - lag(sleep_datetime)
  ) %>% 
  filter(!is.na(sleep_time)) %>% 
  ggplot(aes(date, sleep_time)) +
  geom_col(fill = custom_palette[1]) +
  geom_hline(
    aes(yintercept = mean(sleep_time)), 
    color = custom_palette[5], 
    lty = 2
  ) +
  coord_flip() + 
  custom_theme +
  labs(
    x = NULL,
    y = "Hours of sleep",
    title = paste0("Sleep from ", min(health$date), " to ", max(health$date))
  ) 
## Don't know how to automatically pick scale for object of type difftime. Defaulting to continuous.
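
That message shows up because subtracting two date-times yields a difftime, which ggplot2 has no dedicated scale for. One possible way to avoid it, assuming hours are the intended unit, is to convert to a plain number before plotting. The timestamps below are hypothetical and only stand in for a single night.

```r
# Hypothetical timestamps standing in for one night's sleep
sleep_datetime <- as.POSIXct("2018-01-01 23:30:00", tz = "UTC")
wake_datetime  <- as.POSIXct("2018-01-02 07:00:00", tz = "UTC")

# Subtracting date-times gives a difftime whose units R picks automatically
sleep_time <- wake_datetime - sleep_datetime

# Converting with explicit units gives a plain numeric that ggplot2 can scale
sleep_hours <- as.numeric(sleep_time, units = "hours")
```

Making the unit explicit also keeps the "Hours of sleep" axis label accurate regardless of which unit R would have chosen.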