tidyverse

Grouped Sequences in dplyr Part 2

I recently wrote a post about grouped sequences in dplyr and have since been made aware of a couple of other solutions to the problem (credit: John Mackintosh). The approach involves using the consecutive_id() function, available in dplyr since v1.1.0. The help page for this function mentions that it was inspired by the rleid() function from the data.table package. These functions work similarly to the rle() function I used last time (in what I called ‘the complicated solution’) but provide neater output.
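
As a minimal sketch of how consecutive_id() could be applied to this kind of problem: the matches data and the team, venue and run_id columns below are made up for illustration and are not the data from the original post. It assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical dummy data: one team with a home/away flag per match
matches <- tibble(
  team  = "A",
  venue = c("home", "away", "away", "home", "away", "away", "away")
)

result <- matches %>%
  group_by(team) %>%
  # consecutive_id() starts a new id every time venue changes
  mutate(run_id = consecutive_id(venue)) %>%
  group_by(team, run_id) %>%
  # count up within each away run; home matches get 0
  mutate(num_matches_away = if_else(venue == "away", row_number(), 0L)) %>%
  ungroup()

result
```

Grouping by the run id is what turns the problem into an ordinary grouped row_number(), which is why the output is tidier than working with rle() directly.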

Grouped Sequences in dplyr

For a piece of work, I had to calculate the number of consecutive matches that a team plays away from home, which we will call days_on_the_road. I was not sure how to do this with dplyr, but it is essentially a ‘grouped sequence’. For this post, I’ve created some dummy data to illustrate the idea. The num_matches_away variable is what we want to reproduce with some data manipulation.
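
One possible way to build such a grouped sequence, sketched here with base R's rle() and sequence() inside a grouped mutate(). This is not necessarily the approach taken in the post, and the matches data and is_away column are hypothetical.

```r
library(dplyr)

# Hypothetical dummy data: two teams, TRUE when the match is away from home
matches <- tibble(
  team    = rep(c("A", "B"), each = 5),
  is_away = c(TRUE, TRUE, FALSE, TRUE, TRUE,
              FALSE, TRUE, TRUE, TRUE, FALSE)
)

result <- matches %>%
  group_by(team) %>%
  mutate(
    # rle() gives the length of each run of identical values;
    # sequence() turns those lengths into 1..n counters per run;
    # multiplying by is_away zeroes out the home runs
    num_matches_away = sequence(rle(is_away)$lengths) * is_away
  ) %>%
  ungroup()

result
```

Because the counter is computed per team, a run of away matches never carries over from one team to the next.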

A couple of case_when() tricks

Combining case_when() and across()

If you want to use case_when() together with across() over several variables, here is an example that does this with the help of the get() and cur_column() functions.

    library(tidyverse)

    iris_df <- as_tibble(iris) %>%
      mutate(flag_Petal.Length = as.integer(Petal.Length > 1.5),
             flag_Petal.Width = as.integer(Petal.Width > 0.2))

    iris_df %>%
      mutate(across(c(Petal.Length, Petal.Width), ~case_when(
        get(glue::glue("flag_{cur_column()}")) == 1 ~ NA_real_,
        TRUE ~ .x
      ))) %>%
      select(contains("Petal"))
    ## # A tibble: 150 × 4
    ## Petal.

Summarising Dates with Missing Values

This blog post is just a note that when you do a grouped summary of a date variable and some groups contain only missing values, the summary returns Inf for those groups. The result does not show up as NA, which can cause issues in analysis if you are not careful.

    library(tidyverse)

    df <- tibble::tribble(
      ~id, ~dt,
      1L, "01/01/2001",
      1L, NA,
      2L, NA,
      2L, NA
    ) %>%
      mutate(dt = dmy(dt))

    z1 <- df %>%
      group_by(id) %>%
      summarise(dt_min = min(dt, na.
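
A minimal sketch of one way to guard against this, assuming the summary uses min() with na.rm = TRUE (the excerpt above is cut off, so that is an assumption). The idea is to return a missing Date explicitly when a group has no non-missing values. The data mirrors the example above but is built with base as.Date() rather than dmy().

```r
library(dplyr)

# Same shape as the example data: id 2 has only missing dates
df <- tibble(
  id = c(1L, 1L, 2L, 2L),
  dt = as.Date(c("2001-01-01", NA, NA, NA))
)

result <- df %>%
  group_by(id) %>%
  summarise(
    # Return NA for all-missing groups instead of letting min() fall back to Inf
    dt_min = if (all(is.na(dt))) as.Date(NA) else min(dt, na.rm = TRUE)
  )

result
```

With this guard in place, is.na(dt_min) identifies the groups with no dates, so they can be filtered or counted in the usual way.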

R Training Resources and Tips

A short list of resources and tips to help with learning some R basics, with particular focus on the tidyverse.