Naming Items in a List with {purrr}, {dplyr}, or {healthyR}

code
rtip
purrr
healthyr
Author

Steven P. Sanderson II, MPH

Published

December 5, 2022

Introduction

Many times when we are working with a data set, we will want to break it up into groups, place those groups into a list, and work with them in that fashion. When doing so, it can be useful to have the elements of the list named by the values of the column that the data was split upon. Let’s use the iris data set as an example where we split on Species.

There are two main functions that we will use in this scenario, namely purrr::map() and dplyr::group_split(). You could also use the split() function from base R for this.
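
As a quick illustration of the base R route mentioned above, here is a minimal sketch. It assumes only base R and the built-in iris data; the object name base_list is made up for this example. When split() is given a factor, it names the list elements by the factor levels, so no extra naming step is needed.

```r
# Split iris into one data.frame per Species using base R only
base_list <- split(iris, iris$Species)

# The list elements are already named by the levels of Species
names(base_list)

# Each element holds only the rows for that Species
nrow(base_list$setosa)
```

Note that the elements here are plain data.frames rather than tibbles, which is one reason you might still prefer the {dplyr} approach.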

We will also go over how simple this is using the {healthyR} package. Let’s look at the function from {healthyR}.

Function

Full function call.

named_item_list(.data, .group_col)

There are only two arguments to supply.

  • .data - The data.frame/tibble.
  • .group_col - The column that contains the groupings.

That’s it.
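
To make the two arguments concrete, here is a rough sketch of how a function with this interface could be written. This is not the {healthyR} source code; named_list_sketch() is a hypothetical name used only for illustration, and it assumes {dplyr} is installed.

```r
library(dplyr)

# Hypothetical stand-in with the same two arguments as named_item_list().
# This is an illustrative sketch, not the actual healthyR implementation.
named_list_sketch <- function(.data, .group_col) {
  # Split the data into one tibble per group
  data_list <- dplyr::group_split(.data, {{ .group_col }})

  # group_keys() returns one row per group, in the same order as group_split()
  keys <- dplyr::group_keys(dplyr::group_by(.data, {{ .group_col }}))

  # Use the group values as the list names
  names(data_list) <- as.character(keys[[1]])
  data_list
}

sketch_list <- named_list_sketch(iris, Species)
names(sketch_list)
```

The {{ }} operator lets the grouping column be passed unquoted, which matches the call style named_item_list(iris, Species) used later in this post.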

Examples

Let’s jump into it.

library(purrr)
library(dplyr)

data_tbl <- iris

data_tbl_list <- data_tbl %>%
  group_split(Species)

data_tbl_list
<list_of<
  tbl_df<
    Sepal.Length: double
    Sepal.Width : double
    Petal.Length: double
    Petal.Width : double
    Species     : factor<fb977>
  >
>[3]>
[[1]]
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# … with 40 more rows

[[2]]
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
          <dbl>       <dbl>        <dbl>       <dbl> <fct>     
 1          7           3.2          4.7         1.4 versicolor
 2          6.4         3.2          4.5         1.5 versicolor
 3          6.9         3.1          4.9         1.5 versicolor
 4          5.5         2.3          4           1.3 versicolor
 5          6.5         2.8          4.6         1.5 versicolor
 6          5.7         2.8          4.5         1.3 versicolor
 7          6.3         3.3          4.7         1.6 versicolor
 8          4.9         2.4          3.3         1   versicolor
 9          6.6         2.9          4.6         1.3 versicolor
10          5.2         2.7          3.9         1.4 versicolor
# … with 40 more rows

[[3]]
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species  
          <dbl>       <dbl>        <dbl>       <dbl> <fct>    
 1          6.3         3.3          6           2.5 virginica
 2          5.8         2.7          5.1         1.9 virginica
 3          7.1         3            5.9         2.1 virginica
 4          6.3         2.9          5.6         1.8 virginica
 5          6.5         3            5.8         2.2 virginica
 6          7.6         3            6.6         2.1 virginica
 7          4.9         2.5          4.5         1.7 virginica
 8          7.3         2.9          6.3         1.8 virginica
 9          6.7         2.5          5.8         1.8 virginica
10          7.2         3.6          6.1         2.5 virginica
# … with 40 more rows
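
The list elements are not named yet. We can recover a name for each element by pulling the unique Species value out of it.

data_tbl_list %>%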
   map( ~ pull(., Species)) %>%
   map( ~ as.character(.)) %>%
   map( ~ unique(.))
[[1]]
[1] "setosa"

[[2]]
[1] "versicolor"

[[3]]
[1] "virginica"

Now let’s go ahead and apply the names.

names(data_tbl_list) <- data_tbl_list %>%
   map( ~ pull(., Species)) %>%
   map( ~ as.character(.)) %>%
   map( ~ unique(.))

data_tbl_list
<list_of<
  tbl_df<
    Sepal.Length: double
    Sepal.Width : double
    Petal.Length: double
    Petal.Width : double
    Species     : factor<fb977>
  >
>[3]>
$setosa
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# … with 40 more rows

$versicolor
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
          <dbl>       <dbl>        <dbl>       <dbl> <fct>     
 1          7           3.2          4.7         1.4 versicolor
 2          6.4         3.2          4.5         1.5 versicolor
 3          6.9         3.1          4.9         1.5 versicolor
 4          5.5         2.3          4           1.3 versicolor
 5          6.5         2.8          4.6         1.5 versicolor
 6          5.7         2.8          4.5         1.3 versicolor
 7          6.3         3.3          4.7         1.6 versicolor
 8          4.9         2.4          3.3         1   versicolor
 9          6.6         2.9          4.6         1.3 versicolor
10          5.2         2.7          3.9         1.4 versicolor
# … with 40 more rows

$virginica
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species  
          <dbl>       <dbl>        <dbl>       <dbl> <fct>    
 1          6.3         3.3          6           2.5 virginica
 2          5.8         2.7          5.1         1.9 virginica
 3          7.1         3            5.9         2.1 virginica
 4          6.3         2.9          5.6         1.8 virginica
 5          6.5         3            5.8         2.2 virginica
 6          7.6         3            6.6         2.1 virginica
 7          4.9         2.5          4.5         1.7 virginica
 8          7.3         2.9          6.3         1.8 virginica
 9          6.7         2.5          5.8         1.8 virginica
10          7.2         3.6          6.1         2.5 virginica
# … with 40 more rows
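
The three map() calls above return a list of length-one character vectors. A slightly more compact variant, sketched here as one possible alternative, uses purrr::map_chr() so that the names come back as a plain character vector. The object name list_names is made up for this example.

```r
library(purrr)
library(dplyr)

data_tbl_list <- iris %>%
  group_split(Species)

# map_chr() returns a character vector, one name per list element
list_names <- data_tbl_list %>%
  map_chr(~ as.character(unique(.x$Species)))

names(data_tbl_list) <- list_names

names(data_tbl_list)
```

Using map_chr() also acts as a small safety check: it errors if any element yields more than one unique value, which would mean the list was not split on that column.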

Let’s now see how we do this in {healthyR}:

library(healthyR)

named_item_list(iris, Species)
<list_of<
  tbl_df<
    Sepal.Length: double
    Sepal.Width : double
    Petal.Length: double
    Petal.Width : double
    Species     : factor<fb977>
  >
>[3]>
$setosa
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# … with 40 more rows

$versicolor
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
          <dbl>       <dbl>        <dbl>       <dbl> <fct>     
 1          7           3.2          4.7         1.4 versicolor
 2          6.4         3.2          4.5         1.5 versicolor
 3          6.9         3.1          4.9         1.5 versicolor
 4          5.5         2.3          4           1.3 versicolor
 5          6.5         2.8          4.6         1.5 versicolor
 6          5.7         2.8          4.5         1.3 versicolor
 7          6.3         3.3          4.7         1.6 versicolor
 8          4.9         2.4          3.3         1   versicolor
 9          6.6         2.9          4.6         1.3 versicolor
10          5.2         2.7          3.9         1.4 versicolor
# … with 40 more rows

$virginica
# A tibble: 50 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species  
          <dbl>       <dbl>        <dbl>       <dbl> <fct>    
 1          6.3         3.3          6           2.5 virginica
 2          5.8         2.7          5.1         1.9 virginica
 3          7.1         3            5.9         2.1 virginica
 4          6.3         2.9          5.6         1.8 virginica
 5          6.5         3            5.8         2.2 virginica
 6          7.6         3            6.6         2.1 virginica
 7          4.9         2.5          4.5         1.7 virginica
 8          7.3         2.9          6.3         1.8 virginica
 9          6.7         2.5          5.8         1.8 virginica
10          7.2         3.6          6.1         2.5 virginica
# … with 40 more rows

If you use this in conjunction with the {healthyR} function save_to_excel(), it will write an Excel file with a tab for each named item in the list.
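
To show why the names matter for this kind of export, here is a sketch of the same idea using a different package, {writexl}, rather than save_to_excel() itself. This is a swapped-in technique and an assumption on my part that {writexl} is installed; writexl::write_xlsx() writes one sheet per element of a named list and uses the list names as the sheet names. The output path is a temporary file chosen only for this example.

```r
library(writexl)

# A named list of data.frames, one per Species
named_list <- split(iris, iris$Species)

# Temporary output file used only for this illustration
out_path <- tempfile(fileext = ".xlsx")

# Each named element becomes its own sheet in the workbook
write_xlsx(named_list, path = out_path)

file.exists(out_path)
```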

Voila!