Converting a {tidyAML} tibble to a {workflowsets}

code
rtip
tidyaml
workflowsets
tidymodels
Author

Steven P. Sanderson II, MPH

Published

February 17, 2023

Introduction

The {tidyAML} package is an R package that provides a set of tools for building regression/classification models on the fly with minimal input required. In this post we will discuss the create_workflow_set() function.

The create_workflow_set function is a function in the tidyAML package that is used to create a workflowset object from the workflowsets package. A workflow is a sequence of tasks that can be executed in a specific order, and is often used in data analysis and machine learning to automate data processing and model fitting. The create_workflow_set function takes as input a YAML specification of a set of workflows, and returns a list of workflow objects that can be executed using the tidymodels package and its associated packages.

The create_workflow_set function is particularly useful when working with the tidymodels package and the parsnip framework. The tidymodels package is a collection of packages for modeling and machine learning in R that provides a consistent interface for building, tuning, and evaluating machine learning models. The parsnip package is part of the tidymodels ecosystem and provides a way to specify a wide range of models in a consistent manner.

Using the create_workflow_set function with tidymodels and parsnip

To use the create_workflow_set function with tidymodels andparsnip, you will need to provide a recipe or recipes as a list to the .recipe_list parameter and a model_spec tibble that you would get from something like fast_regression_parsnip_spec_tbl(), other classes will be supported in the future.

The reason this was done was because I did not want to force users to remain inside of tidyAML perhaps and most likely there are other packages out there that are more suited to an end users specific problem at hand.

Function

Let’s take a look at the function and it’s arguments.

create_workflow_set(
  .model_tbl = NULL, 
  .recipe_list = list(), 
  .cross = TRUE
)
  • .model_tbl - The model table that is generated from a function like fast_regression_parsnip_spec_tbl(). The model spec column will be grabbed automatically as the class of the object must be tidyaml_base_tbl
  • .recipe_list - Provide a list of recipes here that will get added to the workflow set object.
  • .cross - The default is TRUE, can be set to FALSE. This is passed to the cross parameter as an argument to the workflow_set() function.

Example

Here is a simple example. Remember you really only want to use this if you have a model_spec tibble not a tibble with workflows that have already been fit.

library(tidyAML)
library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm","glm")
)

wfs_tbl <- create_workflow_set(
  spec_tbl,
  list(rec_obj)
)

Now let’s inspect.

library(dplyr)

wfs_tbl |>
  slice(1)
# A workflow set/tibble: 1 × 4
  wflow_id            info             option    result    
  <chr>               <list>           <list>    <list>    
1 recipe_linear_reg_1 <tibble [1 × 4]> <opts[0]> <list [0]>
class(wfs_tbl)
[1] "workflow_set" "tbl_df"       "tbl"          "data.frame"  
wfs_tbl$info
[[1]]
# A tibble: 1 × 4
  workflow   preproc model      comment
  <list>     <chr>   <chr>      <chr>  
1 <workflow> recipe  linear_reg ""     

[[2]]
# A tibble: 1 × 4
  workflow   preproc model      comment
  <list>     <chr>   <chr>      <chr>  
1 <workflow> recipe  linear_reg ""     
wfs_tbl$info[[1]]$workflow[[1]]
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm 

Voila!