# Exploring the Peaks: A Dive into the Triangular Distribution in TidyDensity

Categories: code, tidydensity

Author: Steven P. Sanderson II, MPH

Published: January 10, 2024

# Introduction

Welcome back, fellow data enthusiasts! Today, we embark on an exciting journey into the world of statistical distributions with a special focus on the latest addition to the TidyDensity package – the triangular distribution. Tightly packed and versatile, this distribution brings a unique flavor to your data simulations and analyses. In this blog post, we’ll delve into the functions provided, understand their arguments, and explore the wonders of the triangular distribution.

# What’s So Special About Triangular Distributions?

• Flexibility in uncertainty: They model situations where you have a minimum, maximum, and most likely value, but the exact distribution between those points is unknown.
• Common in real-world scenarios: Project cost estimates, task completion times, expert opinions, and even natural phenomena often exhibit triangular patterns.
• Simple to understand and visualize: Their straightforward shape makes them accessible for interpretation and communication.

The triangular distribution is a continuous probability distribution with lower limit a, upper limit b, and mode c, where a < b and a ≤ c ≤ b. The distribution resembles a tent shape.

The probability density function of the triangular distribution is:

```
f(x) =
  2(x - a) / ((b - a)(c - a))   for a ≤ x ≤ c
  2(b - x) / ((b - a)(b - c))   for c < x ≤ b
  0                             otherwise
```
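To make the piecewise formula concrete, here is a minimal base R sketch of the density. The helper name `tri_pdf` is hypothetical and is not part of TidyDensity; it simply evaluates the two branches above and checks numerically that the area under the curve is 1.

```r
# Hypothetical helper (not a TidyDensity function): triangular density
# with lower limit a, upper limit b, and mode c.
tri_pdf <- function(x, a, b, c) {
  stopifnot(a < b, a <= c, c <= b)
  out   <- numeric(length(x))          # density is 0 outside [a, b]
  left  <- x >= a & x < c              # rising edge
  right <- x > c & x <= b              # falling edge
  out[left]   <- 2 * (x[left] - a) / ((b - a) * (c - a))
  out[right]  <- 2 * (b - x[right]) / ((b - a) * (b - c))
  out[x == c] <- 2 / (b - a)           # peak height at the mode
  out
}

# Peak height for a = 0, b = 1, c = 0.5
tri_pdf(0.5, a = 0, b = 1, c = 0.5)

# The density should integrate to (approximately) 1 over [a, b]
integrate(tri_pdf, lower = 0, upper = 1, a = 0, b = 1, c = 0.5)$value
```

Handling the peak separately keeps the function well defined in the edge cases `c == a` and `c == b`, where one of the two branches would otherwise divide by zero.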

The key parameters of the triangular distribution are:

• `a` - the minimum value
• `b` - the maximum value
• `c` - the mode (the most likely value, where the density peaks)

The triangular distribution is often used as a subjective description of a population for which there is only limited sample data. It is useful when a process has a natural minimum and maximum.
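If you are curious how values can be drawn from such a distribution, one common approach is inverse-transform sampling: draw a uniform number and push it through the inverse of the triangular CDF. The sketch below uses base R only, and the name `tri_sample` is hypothetical. It is meant as an illustration of the idea, not as a description of how `tidy_triangular()` works internally.

```r
# Hypothetical helper (not a TidyDensity function): inverse-transform
# sampler for a triangular distribution with limits a, b and mode c.
tri_sample <- function(n, a, b, c) {
  stopifnot(a < b, a <= c, c <= b)
  u  <- runif(n)                       # uniform draws on (0, 1)
  fc <- (c - a) / (b - a)              # CDF value at the mode
  ifelse(
    u < fc,
    a + sqrt(u * (b - a) * (c - a)),         # invert the rising branch
    b - sqrt((1 - u) * (b - a) * (b - c))    # invert the falling branch
  )
}

set.seed(123)
draws <- tri_sample(10000, a = 0, b = 1, c = 0.5)

range(draws)   # all draws fall inside [a, b]
mean(draws)    # should land close to (a + b + c) / 3
```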

# Triangular Functions

Let’s start by introducing TidyDensity’s main functions for the triangular distribution:

1. `tidy_triangular()`: This function generates random draws from a triangular distribution for a specified number of simulations, with given minimum, maximum, and mode values.
   • `.n`: Specifies the number of x values for each simulation.
   • `.min`: Sets the minimum value of the triangular distribution.
   • `.max`: Sets the maximum value of the triangular distribution.
   • `.mode`: Specifies the mode (peak) value of the triangular distribution.
   • `.num_sims`: Controls the number of simulations to perform.
   • `.return_tibble`: A logical value indicating whether to return the result as a tibble.
2. `util_triangular_param_estimate()`: This function estimates the parameters of a triangular distribution from a numeric vector of data.
   • `.x`: Requires a numeric vector, with all values satisfying 0 <= x <= 1.
   • `.auto_gen_empirical`: A logical value (TRUE/FALSE) that defaults to TRUE. It automatically generates `tidy_empirical()` output for the `.x` parameter and utilizes `tidy_combine_distributions()`.
3. `util_triangular_stats_tbl()`: This function creates a tidy data frame with statistics for a triangular distribution.
   • `.data`: The data being passed from a `tidy_` distribution function.
4. `triangle_plot()`: This function creates a ggplot2 object for a triangular distribution.
   • `.data`: Tidy data from the `tidy_triangular()` function.
   • `.interactive`: A logical value indicating whether to return an interactive plot using plotly. Default is FALSE.

## Using tidy_triangular for Simulations

Suppose you want to simulate a triangular distribution with 100 x values, a minimum of 0, a maximum of 1, and a mode at 0.5. You’d use the following code:

```r
library(TidyDensity)

triangular_data <- tidy_triangular(
  .n = 100,
  .min = 0,
  .max = 1,
  .mode = 0.5,
  .num_sims = 1,
  .return_tibble = TRUE
)

triangular_data
```
```
# A tibble: 100 × 7
   sim_number     x     y      dx      dy     p     q
   <fct>      <int> <dbl>   <dbl>   <dbl> <dbl> <dbl>
 1 1              1 0.853 -0.140  0.00158 0.957 0.853
 2 1              2 0.697 -0.128  0.00282 0.816 0.697
 3 1              3 0.656 -0.116  0.00484 0.764 0.656
 4 1              4 0.518 -0.103  0.00805 0.536 0.518
 5 1              5 0.635 -0.0909 0.0130  0.733 0.635
 6 1              6 0.838 -0.0786 0.0202  0.948 0.838
 7 1              7 0.645 -0.0662 0.0304  0.748 0.645
 8 1              8 0.482 -0.0539 0.0444  0.464 0.482
 9 1              9 0.467 -0.0416 0.0627  0.437 0.467
10 1             10 0.599 -0.0293 0.0859  0.678 0.599
# ℹ 90 more rows
```

This generates a tidy tibble with simulated data, ready for your analysis.

## Estimating Parameters and Creating Stats Tables

Utilize the `util_triangular_param_estimate` function to estimate parameters and create tidy empirical data:

```r
param_estimate <- util_triangular_param_estimate(.x = triangular_data$y)

t(param_estimate$parameter_tbl)
```
```
          [,1]
dist_type "Triangular"
samp_size "100"
min       "0.0572515"
max       "0.8822025"
mode      "0.8822025"
method    "Basic"
```
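As a rough point of comparison for estimates like these, here is a small base R sketch of one simple way to estimate triangular parameters: take the sample minimum and maximum as the limits, then back out the mode from the sample mean using mean = (a + b + c) / 3. The function name `tri_estimate_sketch` is hypothetical, and this moment-based approach is an assumption chosen for illustration. It is not a description of what the "Basic" method inside `util_triangular_param_estimate()` does.

```r
# Hypothetical helper (not a TidyDensity function): a crude
# moment-based estimate of triangular parameters.
tri_estimate_sketch <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2)
  a <- min(x)                          # estimated lower limit
  b <- max(x)                          # estimated upper limit
  c_hat <- 3 * mean(x) - a - b         # solve mean = (a + b + c) / 3 for c
  c_hat <- min(max(c_hat, a), b)       # keep the mode inside [a, b]
  c(min = a, max = b, mode = c_hat)
}

# sqrt() of a uniform draw has density 2x on [0, 1], which is a
# triangular distribution with a = 0, b = 1, and mode c = 1.
set.seed(42)
x <- sqrt(runif(5000))

est <- tri_estimate_sketch(x)
est
```

Because the sample minimum and maximum always sit inside the true limits, estimates of this kind tend to understate the range, especially for small samples.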

To create the statistics table:

```r
stats_table <- util_triangular_stats_tbl(.data = triangular_data)
t(stats_table)
```
```
                  [,1]
tidy_function     "tidy_triangular"
function_call     "Triangular c(0, 1, 0.5)"
distribution      "Triangular"
distribution_type "continuous"
points            "100"
simulations       "1"
mean              "0.5"
median            "0.3535534"
mode              "1"
range_low         "0.0572515"
range_high        "0.8822025"
variance          "0.04166667"
skewness          "0"
kurtosis          "-0.6"
entropy           "-0.6931472"
computed_std_skew "-0.1870017"
computed_std_kurt "2.778385"
ci_lo             "0.08311609"
ci_hi             "0.8476985"
```
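A table like this is easy to sanity-check against the standard closed-form results for a triangular distribution. The base R sketch below computes the theoretical mean, median, mode, and variance from `a`, `b`, and `c`. The helper name `tri_theory` is hypothetical and not part of TidyDensity.

```r
# Hypothetical helper (not a TidyDensity function): closed-form
# summary values for a triangular distribution.
tri_theory <- function(a, b, c) {
  stopifnot(a < b, a <= c, c <= b)
  # The median formula depends on which side of the midpoint the mode is
  med <- if (c >= (a + b) / 2) {
    a + sqrt((b - a) * (c - a) / 2)
  } else {
    b - sqrt((b - a) * (b - c) / 2)
  }
  data.frame(
    mean     = (a + b + c) / 3,
    median   = med,
    mode     = c,
    variance = (a^2 + b^2 + c^2 - a * b - a * c - b * c) / 18
  )
}

theory <- tri_theory(a = 0, b = 1, c = 0.5)
theory
```

For the symmetric case used in this post (a = 0, b = 1, c = 0.5), these formulas give a mean, median, and mode of 0.5 and a variance of 1/24, or about 0.0417. The mean and variance rows of the table agree with that, while the median and mode rows do not, so it may be worth comparing those two rows against the output of your installed package version.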

## Visualizing the Triangular Distribution

Now, let’s visualize the triangular distribution using the `triangle_plot` function:

```r
triangle_plot(.data = triangular_data, .interactive = TRUE)
triangle_plot(.data = triangular_data, .interactive = FALSE)
```

This will generate an informative plot, and if you set `.interactive` to TRUE, you can explore the distribution interactively using plotly.

# Conclusion

In this blog post, we’ve explored the powerful functionalities of the triangular distribution in TidyDensity. Whether you’re simulating data, estimating parameters, or creating insightful visualizations, these functions provide a robust toolkit for your statistical endeavors. Happy coding, and may your distributions always be tidy!