President's Day (As In: What Does President Trump Do With His Day?)

R Markdown Source

On February 3rd, Axios released President Trump’s daily schedule. As in many other areas of his political career, Trump has broken with tradition by hiding his schedule from public view.

In addition to a set of re-typed PDF files, Axios also created a Google Spreadsheet containing the president’s schedule and notes about the activities. If you’re interesting in reading about how that task could be accomplished, I highly recommend Maëlle Salmon’s post on rectangling the tables in the PDF files.

The leak and subsequent release by Axios provide unique insight into Trump’s daily activities, which are dominated by a large block of time referred to as Executive Time. Reportedly, Trump hated following a strict daily schedule, so former chief-of-staff John Kelly introduced the concept of Executive Time: unstructured time when the president reads watches news, makes phone calls, and writes emails tweets.

I won’t comment extensively on what these schedules mean—for more on that angle, see reporting from Axios, Vox, Politico and others.

Instead, I’ll use this post to visualize the president’s work day and tweeting habits and a demonstrate how to use R, plot.ly and the tools of the tidyverse to create interactive and static visualizations to try to make sense of what the president does on a daily basis.

The President’s Daily 8am to 5pm Schedule

Axios’ article on Trump’s private schedule includes an interactive view of the president’s workday schedule from 8 a.m. to 5 p.m. Here, I recreate the same visualization, with each activity colored according to the activity’s category. (Note these plots look best on desktop devices.)

Hover over any time slot to view more details about the activity at that time. You can also toggle activity categories — try removing everything except Meetings, Lunches, and Events, it’s unbelievable.

View static image of the plot. Expand the section below for a behind-the-scenes look at this visualization.

How This Was Made…

The pipeline for building this visualization is a fairly standard loading and transformation of the source data with readr and dplyr, followed by building the visualization in ggplot2 and passing off to plotly for the interactive parts. On the other hand, I created a number of helper functions and constants that I reused throughout this post, so there’s quite a bit of code and preamble to get to the actual plot making.

library(tidyverse)
library(plotly)
library(lubridate)
library(glue)
library(hrbrthemes)
library(showtext)
library(sysfonts)

Load the Axios Data

# Convert datetime to decimal hour of day
in_hours <- function(x) {
  hour(x) + minute(x)/60 + second(x)/60^2
}

executive_categories <- c(
  "executive_time" = "Executive Time",
  "event"   = "Event",
  "lunch"   = "Lunch",
  "meeting" = "Meeting",
  "no_data" = "Unknown",
  "travel"  = "Travel"
)
  
# Exec Time Downloaded from http://bit.ly/2Sk9Vj7
exec_time <-
  read_csv(
    here::here(
      "_data", "trump-exec-time",
      "axios_trump_schedule_2018-11-07--2019-02-02.csv"
    ),
    col_types = cols(.default = col_character())
  ) %>%
  mutate(
    # Convert time start/end to datetime
    event_id   = row_number(),
    time_start = paste(date, time_start),
    time_end   = paste(date, time_end),
    time_start = ymd_hm(time_start, tz = "America/New_York"),
    time_end   = ymd_hm(time_end, tz = "America/New_York"),
    time_end   = if_else(time_start > time_end, 
                         time_end + hours(24), time_end),
    # Recode the activity category with nicer labels
    top_category = factor(top_category, 
                          levels = names(executive_categories),
                          labels = executive_categories)
  ) %>% 
  mutate(
    # Create label pieces for plotly hover text
    label_title    = glue("<b>{top_category}</b>"),
    has_uniq_title = tolower(top_category) != tolower(listed_title),
    has_subtitle   = has_uniq_title & !is.na(listed_title),
    label_subtitle = if_else(has_subtitle, 
                             glue("<br><em>{listed_title}</em>"), ""),
    has_location   = !is.na(listed_location),
    label_location = if_else(has_location, 
                             glue("<br><em>{listed_location}</em>"), ""),
    label_time     = glue(
      "<br><br>{strftime(time_start, '%A, %B %e %H:%M')} ", 
      "to {strftime(time_end, '%H:%M')}"),
    has_notes      = !is.na(notes),
    label_notes    = if_else(has_notes, paste0("<br><br>", notes), ""),
    # Compose final tooltip text
    label = paste0(label_title, label_subtitle, label_location, 
                   label_time, label_notes)
  ) %>% 
  mutate(
    # truncate any activities that span 8am or 5pm
    time_start = if_else(
      in_hours(time_start) < 8 & in_hours(time_end) > 8,
      floor_date(time_start, "day") + hours(8),
      time_start
    ),
    time_end = if_else(
      in_hours(time_start) < 17 & in_hours(time_end) > 17,
      floor_date(time_end, "day") + hours(17),
      time_end
    ),
    # create 5 minute increments "inside" each activity
    time_inc   = map2(time_start, time_end, seq, by = "5 mins")
  ) %>%
  select(event_id, time_start, time_end, time_inc, 
         listed_title, top_category, label)
  
glimpse(exec_time)
## Observations: 577
## Variables: 7
## $ event_id     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …
## $ time_start   <dttm> 2018-11-07 08:00:00, 2018-11-07 11:00:00, 2018-11-…
## $ time_end     <dttm> 2018-11-07 11:00:00, 2018-11-07 11:30:00, 2018-11-…
## $ time_inc     <list> [<2018-11-07 08:00:00, 2018-11-07 08:05:00, 2018-1…
## $ listed_title <chr> "Executive time", "Meeting with the chief of staff"…
## $ top_category <fct> Executive Time, Meeting, Executive Time, Lunch, Exe…
## $ label        <chr> "<b>Executive Time</b><br><em>Oval office</em><br><…

Prepare to Make a Plot

To build the plot, I first created several helper functions for the plot labels, scales, and data filtering. I also set the global plot theme, and created a few constants that I used across several plots in this post.

Helper Functions

The first helper adds "am" or "pm" to an integer hour for easy-to-read labels on the x-axis time of day.

am_pm <- function(x) {
  x <- floor(x)
  y <- paste(x)
  y[x < 1] <- "12 am"
  y[x > 1 & x < 12] <- paste(y[x > 1 & x < 12], "am")
  y[x == 12] <- "12 pm"
  y[x > 12] <- paste(x[x > 12] - 12, "pm")
  y
}
  
am_pm(seq(8, 17, 3))
## [1] "8 am"  "11 am" "2 pm"  "5 pm"

The second helper function is copied directly from StackOverflow: Reverse datetime (POSIXct data) axis in ggplot, in one of the rare-but-beautiful moments when directly copying and pasting from SO works out perfectly. This gives me a rev_date() transformer function that can be passed to scale_y_continuous(trans = rev_date) to show the y-axis in chronological order with the earliest date starting at the top.

A third helper function abstracts the code required to filter out the portion of the data set that belongs to the workday, which I use throughout this post. This function is a nice example of how tidyeval can be used for flexible dplyr wrapper functions.

filter_workday <- function(
  df,  
  start = hms::hms(0, 0, 8), 
  end = hms::hms(0, 0, 17),
  start_var = time_start,
  end_var = time_end
) {
  start_var <- rlang::enquo(start_var)
  end_var <- rlang::enquo(end_var)
  
  df %>% 
    filter(
      date(!!start_var) == date(!!end_var),
      hms::as.hms(!!start_var) >= start, 
      hms::as.hms(!!end_var) <= end
    ) 
}

Finally, another helper function creates the plot caption credits.

credit_caption <- function(rtweet = FALSE) {
  paste0(
    "\nData:   Based on White House schedules released by Axios. http://bit.ly/2UGM0fw",
    if (rtweet) "\n           Tweets collected with {rtweet}. https://rtweet.info",
    "\nChart: @grrrck"
  )
}
Plot Themes

I used showtext and sysfonts to match the fonts in the plot to the font used on this blog (PT Mono).

sysfonts::font_add_google("PT Sans")
sysfonts::font_add_google("PT Sans Narrow")
sysfonts::font_add_google("PT Mono")
showtext::showtext_auto()

And I used hrbrthemes’ excellent theme_ipsum() as my theme’s starting point.

theme_set(
  hrbrthemes::theme_ipsum(
    base_family       = "PT Sans",
    base_size         = 20,
    axis_title_family = "PT Sans Narrow",
    axis_title_size   = 14,
    axis_text_size    = 12,
    axis_title_just   = "c"
  ) +
  theme(
    plot.title       = element_text(hjust = 0.5),
    plot.subtitle    = element_text(hjust = 0.5),
    plot.margin      = margin(15, 15, 15, 15),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    legend.position  = "bottom",
    legend.key       = element_rect(
      fill = "white", color = "white", size = 4),
    legend.key.width = unit(2, "cm"),
    legend.text      = element_text(size = 9),
    plot.caption     = element_text(
      color = "#85919b", hjust = 0, size = 12, face = "plain")
  )
)
Plot Constants

I created a few constants for later reference: one stores date breaks for the time period covered by the Axios schedule, with labels on each Monday; and the other two store the color palette and labels for the executive activity categories. The colors were hand-selected from a picture of the Donald (I was expecting there to be more orange).

plot_date_breaks <- seq(
  from = ymd_h("2018-11-12 0", tz = "America/New_York"),
  to   = max(exec_time$time_start),
  by   = "7 day")

event_type_colors <- c(
  "Executive Time" = "#445566",
  "Travel"  = "#997788",
  "Lunch"   = "#7c9393",
  "Meeting" = "#b7c6d6",
  "Event"   = "#ddbbaa",
  "Unknown" = "#d9dde0")

event_type_labels <- sub(" ", "\n", names(event_type_colors))

Build the Actual Plot Already!

Finally, I pulled the schedule data and all of the above pieces together to build the interactive plot.

g <-
  exec_time %>%
  mutate(date = floor_date(time_start, "day")) %>%
  filter_workday() %>% 
  mutate_at(vars(time_start, time_end), in_hours) %>%
  ggplot() +
  geom_rect(
    aes(xmin = time_start,
        xmax = time_end,
        ymin = date - 3600 * 12,
        ymax = date + 3600 * 12,
        fill = top_category)
  ) +
  scale_x_continuous(
    breaks   = seq(8, 17, 3),
    limits   = c(8, 17),
    position = "top",
    labels   = am_pm(seq(8, 17, 3)),
    expand   = expand_scale(c(0.025, 0), 0)
  ) +
  scale_y_continuous(
    trans  = rev_date,
    breaks = plot_date_breaks,
    labels = strftime(plot_date_breaks, "%b %e"),
    expand = expand_scale(c(0.025, 0), 0)
  ) +
  scale_fill_manual(
    values = event_type_colors, 
    labels = event_type_labels
  ) +
  labs(x = "Hour of the Day", y = NULL, fill = NULL) +
  ggtitle(
    "President Trump's Daily Schedule",
    "November 7, 2018 through February 2, 2019"
  ) +
  labs(caption = credit_caption(FALSE)) +
  guides(fill = guide_legend(nrow = 1, label.position = "bottom")) +
  theme(
    axis.text.y = element_text(family = "PT Mono", size = 10),
    plot.margin = margin(3, 0, 0, 0, unit = "line")
  )

plotly::ggplotly(g + aes(text = label), tooltip = "label") %>% 
  plotly::layout(xaxis = list(side = "top", title = ""))

And that’s it! Jump back up to see the final product.

The President’s Daily Tweeting Schedule

Donald Trump tweets about 5.1 tweets per working day within working hours. Why does this number feel so low? Then again, this represents 290 tweets published over 57 workdays and 128 tweets over the 24 weekend days in the same time period. Outside of work hours, the average rises to 8.7 tweets per 24-hour workday, or a total of 546 tweets on workdays and 203 tweets on weekends.

Below, each tweet sent by the President is shown as a dot over his private schedule. Hover over a tweet’s dot to read the text of the tweet.

View static image of the plot. Expand the section below to learn more about how I gathered tweets from @realDonaldTrump, merged his tweets with the Axios schedules, and added them to the first plot.

Also, if you’re interested in exploring the timeline of tweets rendered as they appear on Twitter, Jonathan Sidi created an awesome Shiny app for exploring Trump’s tweets by category.

How This Was Made…

This visualization required the President’s tweets, which I downloaded using the excellent rtweet package. After matching the tweets with their corresponding activity, I modified the previous plot to soften the coloring of the presidential activities and added the tweets to the chart.

Download the President’s Tweets

Using the rtweet package makes downloading the tweets relatively straightforward. I only needed to add a loop to gather tweets beyond Twitter’s timeline API limits and ensure that I have all the tweets from the time period described in the leaked schedules.

# Change to `if (TRUE)` to run
if (FALSE) {
  djt <- NULL
  keep_going <- TRUE
  while (keep_going) {
    max_id <- if (!is.null(djt$status_id)) max(as.numeric(djt$status_id))
    djt.this <- rtweet::get_timeline("realdonaldtrump", n = 3200, max_id = max_id)
    djt <- bind_rows(djt, djt.this)
    keep_going <- min(djt$created_at) > lubridate::ymd_h("2018-11-07 0", tz = "America/New_York")
    cat("\nWe have", nrow(djt), "tweets...")
    Sys.sleep(15)
  }
  saveRDS(djt, here::here("_data", "djt-tweets-2018-09--2019-02.rds"))
}

Join Tweets with the President’s Schedule

To align the tweets with the President’s schedule, I rounded (actually, floored) the time stamp of each tweet down to the nearest 5 minute interval. Then, I joined the tweets to the schedule using the 5 minute intervals I created while importing the schedule (see above). This gives me the category and scheduled activity of each tweet.

djt_simple <- 
  djt %>% 
  filter(!is_retweet) %>% 
  mutate(
    created_at = with_tz(created_at, tzone = "America/New_York"),
    time_inc = floor_date(created_at, "5 min")
  ) %>% 
  select(created_at, time_inc, text, status_id)

djt_joined_all <- 
  exec_time %>% 
  unnest() %>% 
  filter(time_end != time_inc) %>% 
  full_join(djt_simple, by = "time_inc") %>% 
  select(
    event_id, starts_with("time_"), created_at, 
    listed_title, top_category, text
  ) %>% 
  filter(
    !is.na(text)
  ) %>% 
  distinct(text, .keep_all = TRUE) %>% 
  mutate(
    hour = in_hours(created_at), 
    date = floor_date(created_at, "day"),
    top_category = factor(top_category, executive_categories),
    label_sched = if_else(
      wday(created_at, abbr = FALSE, week_start = 1) >= 6,
      glue("<br>During: <b>Weekend</b>"),
      glue(
        "<br>During: <b>{top_category}</b>",
        "<br>Sched: {strftime(time_start, '%I:%M')} ",
        "to {strftime(time_end, '%I:%M %p')}")
    ),
    label = glue("{str_wrap(text, 50)}",
                 "<br><i>@realDonaldTrump ", 
                 "{strftime(created_at, '%a, %b %e, %I:%M %p')}</i>",
                 "{label_sched}")
  ) %>%
  select(-time_start, -time_end, -label_sched) %>% 
  arrange(created_at)

djt_joined <-
  djt_joined_all %>% 
  filter(!is.na(top_category))

Finally, I still need to do a little bit of processing to get these tweets to fit into the previous plot.

djt_workday <- 
  djt_joined_all %>%
  filter_workday(
    start_var = created_at, end_var = created_at
  ) %>% 
  filter(date >= "2018-11-07", date <= "2019-02-02") %>% 
  select(-text)

At this point, there are a few versions of the tweets data set that I can use in different places. The djt_simple is a basic, bare-bones tibble of tweets.

## # A tibble: 758 x 4
##    created_at          time_inc            text                 status_id  
##    <dttm>              <dttm>              <chr>                <chr>      
##  1 2018-11-07 01:27:01 2018-11-07 01:25:00 “There’s only been … 1060056007…
##  2 2018-11-07 01:37:48 2018-11-07 01:35:00 ....unbelievably lu… 1060058718…
##  3 2018-11-07 01:49:40 2018-11-07 01:45:00 .@DavidAsmanfox  “H… 1060061704…
##  4 2018-11-07 06:21:51 2018-11-07 06:20:00 Received so many Co… 1060130202…
##  5 2018-11-07 06:55:35 2018-11-07 06:55:00 Ron DeSantis showed… 1060138691…
##  6 2018-11-07 07:07:51 2018-11-07 07:05:00 Those that worked w… 1060141780…
##  7 2018-11-07 07:36:28 2018-11-07 07:35:00 I will be doing a n… 1060148982…
##  8 2018-11-07 07:52:39 2018-11-07 07:50:00 To any of the pundi… 1060153052…
##  9 2018-11-07 08:04:02 2018-11-07 08:00:00 If the Democrats th… 1060155917…
## 10 2018-11-07 08:31:24 2018-11-07 08:30:00 In all fairness, Na… 1060162807…
## # … with 748 more rows

The djt_joined_all variable holds the complete set of all tweets joined with the full schedule, meaning that there will be missing values where no tweets occurred in a 5 minute window or where the schedule data doesn’t cover a tweet.

## # A tibble: 1,182 x 9
##    time_inc            top_category text  event_id created_at         
##    <dttm>              <fct>        <chr>    <int> <dttm>             
##  1 2018-10-18 17:20:00 <NA>         Look…       NA 2018-10-18 17:22:53
##  2 2018-10-25 14:40:00 <NA>         Spok…       NA 2018-10-25 14:42:05
##  3 2018-11-05 16:25:00 <NA>         Than…       NA 2018-11-05 16:26:24
##  4 2018-12-20 14:15:00 Executive T… Cong…      323 2018-12-20 14:16:07
##  5 2018-12-21 07:20:00 <NA>         The …       NA 2018-12-21 07:24:18
##  6 2018-12-24 09:30:00 Unknown      Virt…      336 2018-12-24 09:31:50
##  7 2019-01-10 21:40:00 <NA>         We l…       NA 2019-01-10 21:42:46
##  8 2019-01-18 21:10:00 <NA>         http…       NA 2019-01-18 21:14:44
##  9 2019-01-24 08:20:00 Executive T… The …      500 2019-01-24 08:21:59
## 10 2019-01-31 23:25:00 <NA>         This…       NA 2019-01-31 23:26:43
## # … with 1,172 more rows, and 4 more variables: listed_title <chr>,
## #   hour <dbl>, date <dttm>, label <S3: glue>

And djt_workday contains the tweets within the period covered by the Axios schedules and during workday hours.

## # A tibble: 418 x 9
##    time_inc            top_category label event_id created_at         
##    <dttm>              <fct>        <S3:>    <int> <dttm>             
##  1 2018-11-07 08:00:00 Executive T… "If …        1 2018-11-07 08:04:02
##  2 2018-11-07 08:30:00 Executive T… "In …        1 2018-11-07 08:31:24
##  3 2018-11-07 10:35:00 Executive T… "Acc…        1 2018-11-07 10:39:11
##  4 2018-11-07 14:40:00 Executive T… "We …        5 2018-11-07 14:44:11
##  5 2018-11-07 14:40:00 Executive T… "...…        5 2018-11-07 14:44:12
##  6 2018-11-09 09:50:00 Travel       "“Pr…       22 2018-11-09 09:54:15
##  7 2018-11-09 10:55:00 Travel       ".@B…       22 2018-11-09 10:55:25
##  8 2018-11-09 10:55:00 Travel       "You…       22 2018-11-09 10:58:56
##  9 2018-11-09 11:50:00 Travel       "As …       22 2018-11-09 11:52:19
## 10 2018-11-09 12:10:00 Travel       "Jef…       22 2018-11-09 12:10:02
## # … with 408 more rows, and 4 more variables: listed_title <chr>,
## #   hour <dbl>, date <dttm>, weekend <chr>

Add Tweets to the Schedule Plot

To build the second plot, I had to manually tweak the first plot to adjust the transparency of the geom_rect layer and then overlay the tweets as points.

# Modify geom_rect (only layer) to reduce transparency
g_subdued <- g
g_subdued$layers[[1]]$aes_params$alpha <- 0.6

# Add tweets to the plot
g_subdued <- 
  g_subdued + 
  ggtitle("President Trump's Daily Tweeting") +
  geom_point(
    data = djt_workday,
    aes(x = hour, y = date + 3600, text = label),
    color = "#2c3741",
    size = 0.8
  )

At this point, the plot is almost ready to go, except for the fact that the tooltip text will appear for the underlying activity layers. Fortunately, once again StackOverflow comes to the rescue. To disable the tooltip, I have to save the plotly object and change the $hoverinfo value to "none" for each of the data layers of the activity categories.

gpltly <- 
  plotly::ggplotly(g_subdued, tooltip = "label") %>% 
  plotly::layout(xaxis = list(side = "top", title = ""))

# remove hover labels for time category layers (6 categories)
# thanks: https://stackoverflow.com/a/45802923/2022615
for (i in 1:6) {
  gpltly$x$data[[i]]$hoverinfo <- "none"
}

Head back to the visualization to view the final product and explore Trump’s delusional ranting tweets.

How much time is spent in Executive Time?

Looking at the above plots, it’s really striking how much time is unstructured Executive Time in Trump’s schedule. But how much of the day is spent in each activity group?

R code…

The first step is to calculate the total time as a percentage of the 8am to 5pm workday spent in each time category. This data frame will be used for several plots.

One tricky point is that there are piece of the schedule that are explicitly marked as “Unknown” (or “no data”) in the Axios data, so I calculate the total percent of time spent in other categories and subtract this value from 1 to recover the complete unaccounted-for time.

exec_time_total <-
  exec_time %>%
  filter(between(hour(time_start), 8, 17), hour(time_start) < 17) %>%
  mutate(
    date = floor_date(time_start, "day"),
    n = difftime(time_end, time_start, units = "mins"),
    n = as.numeric(n)
  ) %>%
  group_by(date, top_category) %>%
  summarize(pct = sum(n) / (60 * 9)) %>%
  ungroup() %>%
  # Get unaccounted time for each date (unknown or unlabelled)
  nest(-date) %>%
  mutate(
    total = map_dbl(data, ~ {
      filter(., top_category != "Unknown") %>% 
        pull(pct) %>% 
        sum()
    }),
    unaccounted = 1 - total,
  ) %>%
  unnest() %>%
  spread(top_category, pct, fill = 0) %>%
  mutate(`Unknown` = unaccounted) %>%
  select(-total, -unaccounted) %>%
  gather("top_category", "pct", -date) %>%
  mutate(top_category = factor(top_category, rev(names(event_type_colors))))

exec_time_total %>% 
  arrange(date, desc(pct))
## # A tibble: 378 x 3
##    date                top_category      pct
##    <dttm>              <fct>           <dbl>
##  1 2018-11-07 00:00:00 Executive Time 0.833 
##  2 2018-11-07 00:00:00 Lunch          0.111 
##  3 2018-11-07 00:00:00 Meeting        0.0556
##  4 2018-11-07 00:00:00 Event          0     
##  5 2018-11-07 00:00:00 Unknown        0     
##  6 2018-11-07 00:00:00 Travel         0     
##  7 2018-11-08 00:00:00 Executive Time 0.5   
##  8 2018-11-08 00:00:00 Meeting        0.222 
##  9 2018-11-08 00:00:00 Lunch          0.111 
## 10 2018-11-08 00:00:00 Unknown        0.0833
## # … with 368 more rows

The above tibble contains a summary of time use by day, but the first plot requires a full summary of all days in the schedule. The following code chunk caculates total hours spent in each group and creates a text label that is used to label the regions of the stacked bar in the plot.

exec_time_hours <- 
  exec_time_total %>% 
  # Only the days covered by the schedule
  filter(!(pct == 1 & top_category == "Unknown")) %>% 
  group_by(top_category) %>% 
  summarize(hours = sum(pct * 60 * 90)) %>% 
  arrange(desc(top_category)) %>% 
  mutate(
    pct = hours / sum(hours),
    pct_upto = cumsum(pct),
    label = glue("{top_category}\n",
                 "{scales::percent(pct, accuracy = 1)}"),
    label = if_else(top_category == "Unknown", "", paste(label))
  )

exec_time_hours
## # A tibble: 6 x 5
##   top_category     hours    pct pct_upto label                
##   <fct>            <dbl>  <dbl>    <dbl> <chr>                
## 1 Executive Time 162650  0.591     0.591 "Executive Time\n59%"
## 2 Travel          23350  0.0848    0.675 "Travel\n8%"         
## 3 Lunch           23400  0.0850    0.760 "Lunch\n8%"          
## 4 Meeting         45350  0.165     0.925 "Meeting\n16%"       
## 5 Event           13850  0.0503    0.975 "Event\n5%"          
## 6 Unknown          6800. 0.0247    1     ""

Finally, I create the plot using geom_col() to create a stacked bar chart, that I then rotate to be horizontal with coord_flip(). The bar labels are added as a text annotation, and I do a little adjustment to make sure the annotations fit in the plot and to hide the axis that aren’t relevant.

ggplot(exec_time_hours) +
  aes(x = 1, 
      y = pct,
      fill = top_category) +
  geom_col() +
  geom_text(
    aes(x = 0.35, y = pct_upto - pct/2, label = label),
    color = "grey30",
    family = "PT Sans"
  ) +
  scale_fill_manual(
    values = event_type_colors,
    labels = rev(event_type_labels),
    guide = FALSE
  ) +
  scale_x_continuous(
    expand = expand_scale(0, c(0.2, 0))
  ) +
  scale_y_continuous(
    labels = scales::percent_format(accuracy = 10),
    expand = expand_scale(0, 0),
    position = "bottom"
  ) +
  coord_flip() +
  labs(
    x = NULL, 
    y = "Percent of Workday Between 8am and 5pm", 
    fill = NULL
  ) +
  ggtitle(
    "What Does President Trump Do With His Time?"
  ) +
  labs(caption = credit_caption(FALSE)) +
  theme(
    axis.text.y = element_blank(),
    panel.grid.major = element_blank(),
    axis.ticks.x.top = element_line(color = "grey20")
  )

For 43 of the 51 workdays (that’s 84.3%) covered by the Axios schedules and for which there is schedule information, Trump spent 50% or more of his day in executive time.

In other words, there were only 8 days in about 10 work weeks where executive time was not the dominant activity.

When the above time-use summary is expanded into his daily schedule, it’s clear how unusual it is for Trump to spend a significant portion of his day in structured events.

R code…
ggplot(exec_time_total) +
  aes(date + 3600*12, pct, fill = top_category) +
  geom_col(width = 3600*24) +
  scale_fill_manual(
    values = event_type_colors, 
    labels = rev(event_type_labels)
  ) +
  scale_y_continuous(
    breaks   = seq(0, 1, .25),
    labels   = scales::percent_format(accuracy = 25),
    position = "bottom",
    expand   = expand_scale(c(0.025, 0), 0)
  ) +
  scale_x_continuous(
    trans  = rev_date,
    breaks = plot_date_breaks,
    labels = strftime(plot_date_breaks, "%b %e"),
    expand = expand_scale(c(0.025, 0), 0)
  ) +
  coord_flip() +
  labs(
    x = NULL, 
    y = "Percent of Workday Between 8am and 5pm", 
    fill = NULL
  ) +
  guides(
    fill = guide_legend(nrow = 1, 
                        reverse = TRUE, 
                        label.position = "bottom")
  ) +
  ggtitle(
    "What Does President Trump Do With His Time?"
  ) +
  labs(caption = credit_caption(FALSE))

In fact, the largest non-executive time block for the 8 days where executive time isn’t more than half of Trump’s workday are almost entirely travel related.

R code…
exec_time_summary %>%
  filter(pct < 0.5, weekday) %>%
  select(date) %>%
  left_join(exec_time %>% mutate(date = floor_date(time_start, "day"))) %>%
  mutate(duration = difftime(time_end, time_start, unit = "hours")) %>%
  select(date, duration, listed_title) %>%
  arrange(date, desc(duration)) %>%
  filter(listed_title != "Executive time") %>%
  group_by(date) %>%
  slice(1) %>% 
  mutate(duration = round(duration, 2)) %>% 
  knitr::kable(col.names = c(
    "Date", "Duration", "Longest Non-Executive-Time Activity"
  ), format = "html") %>% 
  kableExtra::column_spec(1:2, width = "6.5em") 
Date Duration Longest Non-Executive-Time Activity
2018-11-09 6.50 hours Depart Washington, DC en route Orly, France
2018-11-26 2.17 hours Depart Gulfport, MS en route Washington, DC
2018-11-29 6.67 hours Depart Washington, DC en route Buenos Aires, Argentina
2018-11-30 1.75 hours G20 Leaders’ dinner
2018-12-07 2.58 hours Depart Washington, DC en route Kansas City, MO
2018-12-21 1.00 hours Lunch
2019-01-10 4.17 hours Depart Washington, DC en route McAllen, TX
2019-01-14 2.58 hours Depart Washington, DC en route Kenner, LA

Travel seems to be the only activity capable of substantially affecting the amount of time the president spends on his executive time. My (completely speculative) guess is that this is in part due to travel being the only activity with a duration long enough to displace executive time, and also in part that travel probably most resembles executive time.

R code…
exec_time_total %>%
  # Drop "Unkown" time category, not that important
  filter(top_category != "Unknown") %>%
  # Spread top_category... 
  spread(top_category, pct) %>%
  # ...and gather to leave Executive Time in own column
  gather(other_activity, pct, -date, -`Executive Time`) %>%
  # Ignore points where both groups are 0% (not informative)
  filter(pct + `Executive Time` > 0) %>% 
  # Pipe into ggplot
  ggplot() + 
  aes(`Executive Time`, pct, color = other_activity) + 
  geom_point() + 
  facet_wrap(~ other_activity, nrow = 1) +
  scale_x_continuous(
    breaks = c(0, 0.5, 1),
    labels = scales::percent_format(accuracy = 25),
    limits = c(0, 1)
  ) +
  scale_y_continuous(
    breaks = c(0, 0.5, 1),
    labels = scales::percent_format(accuracy = 25),
    limits = c(0, 1)
  ) +
  scale_color_manual(
    values = event_type_colors, 
    labels = rev(event_type_labels),
    guide  = FALSE
  ) +
  coord_flip() +
  labs(
    x = "Percent of Workday\nIn Executive Time",
    y = "Percent of Workday Spent in Activity",
    caption = credit_caption()
  ) +
  theme(
    axis.title.x = element_text(margin = margin(10)),
    axis.title.y = element_text(margin = margin(r = 20)),
  )

Tweeter In Chief

At the point, I was very interested in exploring how Trump’s tweeting relates to his work schedule. The first question to answer is When does he send most of his tweets? And the answer is: primarily on the weekends, during executive time, or before or after work hours.

R code…
# Start and end dates of Axios-pubslished schedules
# which I called `exec_time` for some reason and am sticking with
exec_time_boundaries <- 
  exec_time %>% 
  summarize(min = min(time_start), max = max(time_end))

exec_time %>%
  # mutate(event_id = row_number()) %>%
  unnest() %>%
  filter(time_end != time_inc) %>%
  full_join(djt_simple, by = "time_inc") %>%
  select(event_id, time_inc, created_at, listed_title, top_category, text) %>%
  filter(
    !is.na(text),
    between(time_inc, exec_time_boundaries$min, exec_time_boundaries$max)
  ) %>%
  mutate(
    wday = wday(created_at, abbr = FALSE, week_start = 1), 
    top_category = case_when(
      !is.na(top_category) ~ paste(top_category),
      wday > 5 ~ "Weekend", 
      between(wday, 1, 5) & hour(created_at) <  6 ~ "Early Morning (before 6am)",
      between(wday, 1, 5) & hour(created_at) <  8 ~ "Morning (6-8 am)",
      between(wday, 1, 5) & hour(created_at) > 17 ~ "Evening (after 5pm)",
      is.na(top_category) ~ "Unknown",
      TRUE ~ paste(top_category))
  ) %>%
  count(top_category) %>% 
  arrange(n) %>% 
  mutate(top_category = fct_inorder(top_category)) %>% 
  ggplot() + 
  aes(top_category, n) + 
  geom_col(fill = "#445566") + 
  scale_y_continuous(expand = c(0, 0, 0, 5)) +
  coord_flip() +
  theme(panel.grid.major.y = element_blank()) +
  labs(x = "Activity or Time of Day", 
       y = paste(
         "Total Tweets Sent Between",
         strftime(exec_time_boundaries$min, "%F"),
         "to",
         strftime(exec_time_boundaries$max, "%F")
       ),
       title = "Trump Tweet Volume by Scheduled Activity",
       caption = credit_caption(rtweet = TRUE))

We can get a sense of the timing of Trump’s tweeting activities by looking at the time of day of each tweet and the scheduled activity that’s going on at the time. The following plot shows each tweet as a vertical line and considers only workday tweeting and only for days covered by the Axios schedules.

R code…
event_type_colors_extra <- c(event_type_colors, "Non-Work Hours" = "#828486")

djt_joined_work_non_work <- 
  djt_joined_all %>% 
  mutate_at(vars(top_category),  as.character) %>% 
  mutate(weekend = wday(created_at, abbr = FALSE, week_start = 1) > 5) %>%
  filter(
    !weekend,
    between(created_at, exec_time_boundaries$min, exec_time_boundaries$max)
  ) %>% 
  mutate(top_category = if_else(
    between(in_hours(created_at), 8, 17) & is.na(top_category),
    "Unknown",
    top_category
  )) %>% 
  replace_na(list(top_category = "Non-Work Hours")) %>% 
  mutate(
    top_category = fct_infreq(top_category)
  )
ggplot(djt_joined_work_non_work) + 
  aes(x = hour, y = 1, color = top_category) +
  geom_segment(aes(xend = hour, yend = 0), alpha = 0.6, size = 0.5) +
  facet_wrap(~ top_category, ncol = 1, strip.position = "left") +
  scale_x_continuous(
    position = "bottom",
    breaks   = seq(0, 24, 4),
    limits   = c(0, 24),
    labels   = am_pm(seq(0, 24, 4)),
    expand   = expand_scale(c(0.025, 0), 0)
  ) +
  scale_color_manual(
    values = event_type_colors_extra, 
    labels = names(event_type_colors_extra)
  ) +
  scale_fill_manual(
    values = event_type_colors_extra, 
    labels = names(event_type_colors_extra)
  ) +
  coord_cartesian(clip = "off") +
  labs(x = NULL, y = NULL, color = NULL) +
  guides(color = FALSE, fill = FALSE) +
  theme(
    panel.grid.major.y = element_blank(),
    axis.text.y = element_blank(),
    strip.text.y = element_text(angle = 180, 
                                margin = margin(r = 5, l = 25), 
                                hjust = 1),
    panel.spacing.y = unit(0, "pt")
  ) +
  ggtitle(
    "What's On His Schedule When He's Tweeting?",
    "Each line represents a tweet, colored by the activity on his White House Schedule"
  ) + 
  labs(caption = credit_caption(rtweet = TRUE))

Most of Trump’s tweeting happens betewen 7 and 9 am, but what’s striking is that it’s nearly impossible to tell the difference between early morning tweeting and the start of President Trump’s official workday at 8am.

R code…
djt_joined_work_non_work %>% 
  filter(top_category %in% c("Non-Work Hours", "Executive Time")) %>% 
  mutate(week = floor_date(created_at, "week"),
         week = strftime(week, "%F")) %>% 
  ggplot() + 
  aes(x = hour, y = 1, color = top_category) +
  geom_segment(aes(xend = hour, yend = 0)) +
  facet_wrap(~ week, ncol = 1, strip.position = "left") +
  scale_x_continuous(
    position = "bottom",
    breaks   = seq(0, 24, 2),
    limits   = c(4, 12),
    labels   = am_pm(seq(0, 24, 2)),
    expand   = expand_scale(c(0.025, 0), 0)
  ) +
  scale_color_manual(
    values = event_type_colors_extra, 
    labels = names(event_type_colors_extra)
  ) +
  scale_fill_manual(
    values = event_type_colors_extra, 
    labels = names(event_type_colors_extra)
  ) +
  labs(x = NULL, y = NULL, color = NULL, caption = credit_caption(TRUE)) +
  guides(color = FALSE, fill = FALSE) +
  theme(
    panel.grid.major.y = element_blank(),
    axis.text.y = element_blank(),
    strip.text.y = element_text(angle = 180, margin = margin(r = 25)),
    panel.spacing.y = unit(0, "pt")
  ) +
  ggtitle(
    "When Do Official Work Hours Start?",
    "Morning tweets published over one week periods.\nAccording to the White House Schedule, \"Executive Time\" starts at 8am in the Oval Office."
  )

As we learned above, Trump sends about 5 tweets per working day within working hours. Naturally, I wondered if he tends to tweet more or less during the day when he has more executive or travel time available. Similarly does he tweet less when he has more strucured time, i.e. metings, events, or lunches?

Somewhat unsurprisingly, the number of tweets sent during the workday in only very slightly correlated with the amount of unstructured time on Trump’s calendar. This makes sense: there is very little variation in the amount of the day spent in structured events – it’s never more than half the day.

R code…
djt_joined %>%
  group_by(date) %>%
  count() %>%
  rename(tweets = n) %>%
  left_join(exec_time_total, ., by = "date") %>%
  replace_na(list(tweets = 0)) %>% 
  filter(top_category %in% c("Executive Time", "Unknown", "Travel", "Lunch")) %>%
  spread(top_category, pct, fill = 0) %>% 
  filter(!Unknown == 1) %>% 
  mutate(pct = Unknown + `Executive Time` + Travel + Lunch) %>% 
  ggplot() +
  aes(pct, tweets) +
  geom_smooth(
    method = "lm", 
    color = event_type_colors["Executive Time"], 
    fill = event_type_colors["Unknown"]
  ) +
  geom_point(color = event_type_colors["Executive Time"]) + 
  scale_x_continuous(labels = scales::percent_format(10)) +
  labs(
    x = "Percent of Workday Dedicated to Downtime\n(Executive Time, Travel, Unknown)", 
    y = "Number of Tweets", 
    title = "Does Trump Tweet More When He Has More Downtime?",
    caption = credit_caption(rtweet = TRUE))

R code…
djt_simple %>%
  mutate(date = floor_date(created_at, "day")) %>%
  group_by(date) %>%
  count() %>%
  rename(tweets = n) %>%
  left_join(exec_time_total, ., by = "date") %>%
  filter(top_category %in% c("Meeting", "Event", "Lunch")) %>%
  group_by(date) %>% 
  summarize(pct = sum(pct), tweets = max(tweets)) %>% 
  filter(pct > 0) %>% 
  ggplot() +
  aes(pct, tweets) +
  geom_smooth(
    method = "lm", 
    color = event_type_colors["Executive Time"], 
    fill = event_type_colors["Unknown"]
  ) +
  geom_point(color = event_type_colors["Executive Time"]) +
  scale_x_continuous(labels = scales::percent_format(10)) +
  labs(x = "Percent of Workday Dedicated to Meetings, Lunches, or Events", 
       y = "Number of Tweets",
       title = "Does Trump Tweet Less When He Does More \"Work\"?",
       caption = credit_caption(rtweet = TRUE))

Finally, I wanted to explore the emotional valence of Trump’s day-time tweeting. Are his morning tweets angrier or more rant-driven? Are his event-related tweets more positive?

To this end, I ran Trump’s tweet text through the NRC sentiment dictionary using get_nrc_sentiment() from the syuzhet package. This function returns an integer score from 0 to 10 for a range of positive and negative emotions.

R code…
djt_sentiment <- 
  djt_joined %>% 
  select(top_category, text) %>% 
  mutate(sentiment = map(text, syuzhet::get_nrc_sentiment)) %>% 
  unnest() %>% 
  gather(emotion, value, -top_category, -text)

djt_sentiment
## # A tibble: 2,950 x 4
##    top_category   text                                        emotion value
##    <fct>          <chr>                                       <chr>   <dbl>
##  1 Executive Time If the Democrats think they are going to w… anger       1
##  2 Executive Time In all fairness, Nancy Pelosi deserves to … anger       0
##  3 Executive Time According to NBC News, Voters Nationwide D… anger       4
##  4 Executive Time We are pleased to announce that Matthew G.… anger       1
##  5 Executive Time ....We thank Attorney General Jeff Session… anger       1
##  6 Travel         “Presidential Proclamation Addressing Mass… anger       0
##  7 Travel         .@BrianKempGA ran a great race in Georgia … anger       0
##  8 Travel         You mean they are just now finding votes i… anger       2
##  9 Travel         As soon as Democrats sent their best Elect… anger       2
## 10 Travel         Jeff Flake(y) doesn’t want to protect the … anger       2
## # … with 2,940 more rows
djt_sentiment_mean <- 
  djt_sentiment %>% 
  group_by(top_category, emotion) %>% 
  summarize(value = mean(value)) %>% 
  mutate(emotion = fct_reorder(emotion, value)) %>% 
  filter(top_category != "Unknown")

djt_sentiment_mean
## # A tibble: 50 x 3
## # Groups:   top_category [5]
##    top_category   emotion      value
##    <fct>          <fct>        <dbl>
##  1 Executive Time anger        0.864
##  2 Executive Time anticipation 0.870
##  3 Executive Time disgust      0.525
##  4 Executive Time fear         0.772
##  5 Executive Time joy          0.654
##  6 Executive Time negative     1.36 
##  7 Executive Time positive     2.01 
##  8 Executive Time sadness      0.562
##  9 Executive Time surprise     0.537
## 10 Executive Time trust        1.52 
## # … with 40 more rows
emotions <- c(
  "positive", "joy", "trust", "surprise", "anticipation", 
  "sadness", "anger", "fear", "disgust", "negative"
)

djt_sentiment %>%
  mutate(emotion = factor(emotion, rev(emotions))) %>%
  ggplot() +
  aes(y = value, x = emotion, fill = top_category) +
  # ggridges::geom_density_ridges() +
  geom_boxplot(alpha = 0.7, color = "grey20", outlier.shape = NA) +
  scale_fill_manual(values = event_type_colors) +
  guides(fill = FALSE, color = FALSE) + 
  coord_flip() +
  facet_wrap(~top_category, scales = "free_y") + 
  labs(y = "Sentiment Score", x = NULL,
       title = "Emotions Expressed in Trump Tweets",
       caption = credit_caption(rtweet = TRUE))

The result provides something of a profile of Trump’s tweeting habits, but more analysis is needed to make sense of these sentiment values. I wanted to look further into how these tweets were categorized by the sentiment dictionary, but by this point this post is already far too long and has consumed too much of my evenings and weekends, so I’ll save it for another day.


Thanks for reading! I’d love to hear your thoughts or feedback. I’m @grrrck on Twitter.