Simplifying List Filtering in R with purrr’s keep()

Categories: code, rtip, lists, purrr
Author

Steven P. Sanderson II, MPH

Published

January 25, 2023

Introduction

The {purrr} package in R is a powerful tool for working with lists and other data structures. One particularly useful function in the package is keep(), which allows you to filter a list by keeping only the elements that meet certain conditions.

The keep() function takes two main arguments: the list or vector to filter, and a predicate function that returns TRUE or FALSE to indicate whether each element should be kept. The predicate can be an anonymous function or a named function, and it should take a single argument (the current element).

For example, let’s say we have a vector of numbers and we want to keep only the even ones. We could use keep() with an anonymous function that checks whether the remainder of the current element divided by 2 is zero:

library(purrr)

numbers <- c(1, 2, 3, 4, 5, 6)
even_numbers <- keep(numbers, function(x) x %% 2 == 0)
even_numbers
[1] 2 4 6

We see that this keeps 2, 4, and 6.

The purrr package also provides a convenient shorthand for this operation: a formula that starts with ~, inside which .x refers to the current element.

even_numbers <- keep(numbers, ~ .x %% 2 == 0)
even_numbers
[1] 2 4 6
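To make the shorthand less mysterious, here is a minimal sketch of what happens to the formula. It assumes purrr's as_mapper(), which converts a formula into an ordinary function; the name is_even is just an illustrative choice.

```r
library(purrr)

# Convert the formula into a regular function of one argument
is_even <- as_mapper(~ .x %% 2 == 0)

is_even(4)  # TRUE
is_even(5)  # FALSE

# The converted formula and the explicit function select the same elements
numbers <- c(1, 2, 3, 4, 5, 6)
same_result <- identical(
  keep(numbers, is_even),
  keep(numbers, function(x) x %% 2 == 0)
)
same_result  # TRUE
```

So the formula is only a more compact way of writing the same predicate.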

You can also use the keep() function to filter a list of other types of objects, such as strings or lists. For example, you could use it to keep only the strings that are longer than a certain length:

words <- c("cat", "dog", "elephant", "bird")
long_words <- keep(words, function(x) nchar(x) > 4)
long_words
[1] "elephant"

We see that this keeps only “elephant”. Note that “bird” has exactly four characters, so it does not pass the nchar(x) > 4 test.

In summary, keep() from the {purrr} package filters a list or vector down to the elements that meet a user-given condition. The condition can be written as a named function, an anonymous function, or with the ~ .x formula shorthand, and it works with a variety of data types.
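The examples so far use atomic vectors, so here is a short sketch with actual lists. The objects mixed and groups are made up for illustration.

```r
library(purrr)

# A mixed list: character, numeric, and logical elements
mixed <- list("apple", 42, "kiwi", TRUE, 3.14)

# Named predicate: keep only the character elements
only_chars <- keep(mixed, is.character)
only_chars  # a list holding "apple" and "kiwi"

# A named list of vectors: keep those with more than two elements
groups <- list(a = 1:3, b = 1:2, c = 1:5)
big_groups <- keep(groups, function(x) length(x) > 2)
names(big_groups)  # "a" "c"
```

Names are carried through, so the result of filtering a named list is still a named list.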

Function

Here is the keep() function and its parameters.

keep(.x, .p, ...)

Here are the parameters and what each one expects.

  • .x - A list or vector.
  • .p - A predicate function (i.e. a function that returns either TRUE or FALSE) specified in one of the following ways:
    • A named function, e.g. is.character.
    • An anonymous function, e.g. \(x) all(x < 0) or function(x) all(x < 0).
    • A formula, e.g. ~ all(.x < 0). You must use .x to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
  • ... - Additional arguments passed on to .p.
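As a quick illustration of the options above, the sketch below writes the same predicate in all three forms and then forwards an extra argument through `...`. The helpers all_negative and all_below, and the values list, are hypothetical names introduced only for this example.

```r
library(purrr)

values <- list(c(-1, -2), c(3, 4), c(-5, 6))

# 1. Named function
all_negative <- function(x) all(x < 0)
by_name <- keep(values, all_negative)

# 2. Anonymous function; the \(x) shorthand needs R 4.1 or later
by_lambda <- keep(values, \(x) all(x < 0))

# 3. Formula, with .x standing for the current element
by_formula <- keep(values, ~ all(.x < 0))

# All three keep only c(-1, -2)
identical(by_name, by_lambda) && identical(by_lambda, by_formula)

# Arguments after .p are passed on to the predicate
all_below <- function(x, limit) all(x < limit)
below_five <- keep(values, all_below, limit = 5)
length(below_five)  # 2: c(-1, -2) and c(3, 4)
```

Which form to use is mostly a matter of taste; the anonymous function form keeps the argument name explicit, which can help readability.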

Examples

I recently needed to filter a list that is given as an argument to a parameter. My upcoming {tidyAML} package has a function called create_workflow_set() with a parameter .recipe_list, which is set to list(). The user must only place recipes in this list or else I want it to fail. So I was able to write a quick check using keep() like so:

# Checks ----
# only keep() recipes
rec_list <- purrr::keep(rec_list, ~ inherits(.x, "recipe"))
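To see the check in action without installing {recipes}, here is a rough sketch. It is not the {tidyAML} implementation: fake_recipe is a stand-in object whose class is set by hand, and check_recipe_list() is a hypothetical helper showing one way to turn the silent filtering into an error.

```r
library(purrr)

# Stand-in objects: real recipes come from recipes::recipe(); the class
# attribute is set manually here purely for illustration
fake_recipe <- structure(list(), class = "recipe")
rec_list <- list(fake_recipe, "not a recipe", fake_recipe)

# keep() silently drops anything that is not a recipe
kept <- keep(rec_list, ~ inherits(.x, "recipe"))
length(kept)  # 2

# Hypothetical strict check: raise an error if anything was dropped
check_recipe_list <- function(x) {
  kept <- keep(x, ~ inherits(.x, "recipe"))
  if (length(kept) != length(x)) {
    stop("All elements of the list must be recipe objects.", call. = FALSE)
  }
  kept
}

# A list of only recipes passes through unchanged
length(check_recipe_list(list(fake_recipe, fake_recipe)))  # 2
```

Note that keep() on its own only filters; comparing the lengths before and after is one simple way to detect that something invalid was supplied.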

Voila!