Takes a numeric column from a data frame or tibble and returns a tibble with the winsorized values appended as a new column.

Usage

hai_winsorized_truncate_augment(.data, .value, .fraction, .names = "auto")

Arguments

.data

The data being passed that will be augmented by the function.

.value

This is passed to rlang::enquo() to capture the vectors you want to augment.

.fraction

A positive fraction between 0 and 0.5 that is passed to the probs parameter of stats::quantile().

.names

The default is "auto"

Value

An augmented tibble

Details

Takes a numeric vector and returns a winsorized vector in which values that fall below the lower quantile limit or above the upper quantile limit, as set by .fraction, are truncated to that limit. The intent of winsorization is to limit the effect of extreme values.
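
The truncation described above can be sketched in base R. This is a minimal illustration only, not the package's implementation: the helper name winsorize_truncate_sketch is hypothetical, and it assumes the limits are the quantiles at .fraction and 1 - .fraction computed with the default stats::quantile() settings.

```r
# Hypothetical helper for illustration; not part of healthyR.ai.
# Assumption: limits are the quantiles at `fraction` and `1 - fraction`.
winsorize_truncate_sketch <- function(x, fraction = 0.05) {
  stopifnot(is.numeric(x), fraction > 0, fraction < 0.5)

  # Lower and upper limits from the empirical quantiles
  lim <- stats::quantile(x, probs = c(fraction, 1 - fraction), na.rm = TRUE)

  # Pull values below the lower limit up, and values above the upper limit down
  pmin(pmax(x, lim[[1]]), lim[[2]])
}

x <- c(-10, 1, 2, 3, 4, 5, 6, 7, 8, 100)
out <- winsorize_truncate_sketch(x, fraction = 0.1)
print(out)
```

Only the two extreme values (-10 and 100) are replaced here; the interior values pass through unchanged, which is the behavior visible in row 8 of the example output below.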

Author

Steven P. Sanderson II, MPH

Examples

suppressPackageStartupMessages(library(dplyr))

len_out <- 24
by_unit <- "month"
start_date <- as.Date("2021-01-01")

data_tbl <- tibble(
  date_col = seq.Date(from = start_date, length.out = len_out, by = by_unit),
  a = rnorm(len_out),
  b = runif(len_out)
)

hai_winsorized_truncate_augment(data_tbl, a, .fraction = 0.05)
#> # A tibble: 24 × 4
#>    date_col         a      b winsor_trunc_a
#>    <date>       <dbl>  <dbl>          <dbl>
#>  1 2021-01-01  1.04   0.450          1.04  
#>  2 2021-02-01  0.782  0.0634         0.782 
#>  3 2021-03-01 -0.775  0.874         -0.775 
#>  4 2021-04-01  0.775  0.159          0.775 
#>  5 2021-05-01  0.0279 0.545          0.0279
#>  6 2021-06-01  0.365  0.828          0.365 
#>  7 2021-07-01 -0.695  0.747         -0.695 
#>  8 2021-08-01  1.99   0.515          1.57  
#>  9 2021-09-01  0.298  0.129          0.298 
#> 10 2021-10-01  0.455  0.806          0.455 
#> # ℹ 14 more rows