Gives the optimal bin width for a histogram, given a data set, its value column, and the desired number of bins.
Arguments
- .data
The data.frame/tibble that contains the values to bin.
- .value_col
The column in `.data` that holds the values.
- .iters
How many times the cost-function loop should run.
Details
Modified from Hideaki Shimazaki, Department of Physics, Kyoto University (shimazaki at ton.scphys.kyoto-u.ac.jp). Feel free to modify/distribute this program.
Supply a data.frame/tibble with a value column. From this, an optimal bin width is computed for the desired number of bins.
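The cost-function idea credited to Shimazaki above can be sketched in base R. This is a minimal illustration of the published Shimazaki-Shinomoto rule, not the source of `opt_bin()`: the helper name `shimazaki_bins()` and its `max_bins` argument are hypothetical, and how `.iters` maps onto this loop is an assumption rather than something this page states.

```r
# Hypothetical helper (NOT the package's opt_bin()): a base-R sketch of the
# Shimazaki-Shinomoto cost function for choosing a histogram bin width.
shimazaki_bins <- function(x, max_bins = 100) {
  x <- x[is.finite(x)]
  rng <- range(x)
  stopifnot(max_bins >= 2, diff(rng) > 0)

  n_bins <- 2:max_bins           # candidate numbers of bins
  width  <- diff(rng) / n_bins   # bin width for each candidate

  cost <- vapply(seq_along(n_bins), function(i) {
    # Equal-width edges spanning the data range
    edges <- seq(rng[1], rng[2], length.out = n_bins[i] + 1)
    # Count observations per bin; all.inside keeps the maximum in the last bin
    idx <- findInterval(x, edges, rightmost.closed = TRUE, all.inside = TRUE)
    counts <- tabulate(idx, nbins = n_bins[i])
    k_mean <- mean(counts)
    k_var  <- sum((counts - k_mean)^2) / n_bins[i]  # biased variance
    # Cost C = (2 * mean - variance) / width^2; smaller is better
    (2 * k_mean - k_var) / width[i]^2
  }, numeric(1))

  best <- which.min(cost)
  list(
    n_bins = n_bins[best],
    width  = width[best],
    edges  = seq(rng[1], rng[2], length.out = n_bins[best] + 1)
  )
}

set.seed(123)
x <- rnorm(1000)
res <- shimazaki_bins(x)
```

The returned `edges` can be passed straight to a plotting call, for example `hist(x, breaks = res$edges)`.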
Examples
suppressPackageStartupMessages(library(purrr))
suppressPackageStartupMessages(library(dplyr))
df_tbl <- rnorm(n = 1000, mean = 0, sd = 1)
df_tbl <- df_tbl %>%
  as_tibble() %>%
  set_names("value")

df_tbl %>%
  opt_bin(
    .value_col = value
    , .iters = 100
  )
#> # A tibble: 11 × 1
#>     value
#>     <dbl>
#>  1 -4.04
#>  2 -3.29
#>  3 -2.55
#>  4 -1.80
#>  5 -1.05
#>  6 -0.306
#>  7  0.441
#>  8  1.19
#>  9  1.94
#> 10  2.68
#> 11  3.43