# An Overview of the New Parameter Estimate Functions in the TidyDensity Package

Author: Steven P. Sanderson II, MPH

Published: June 3, 2024

Categories: code, rtip, tidydensity

# Introduction

Hello, R enthusiasts! I’m excited to share some fantastic updates to the TidyDensity package. These updates introduce a suite of parameter estimate functions designed to make your data analysis more efficient and insightful. Whether you’re dealing with common distributions or more specialized ones, these functions have got you covered.

# Why Parameter Estimation?

Parameter estimation is crucial when working with statistical distributions. It allows you to infer the parameters of a distribution from your data, providing insights into its underlying structure. This is particularly useful when you want to model real-world phenomena accurately.

# New Parameter Estimate Functions

Here’s a quick rundown of the newly introduced functions in TidyDensity:

1. util_zero_truncated_negative_binomial_param_estimate()
2. util_zero_truncated_poisson_param_estimate()
3. util_f_param_estimate()
4. util_zero_truncated_geometric_param_estimate()
5. util_t_param_estimate()
6. util_pareto1_param_estimate()
7. util_paralogistic_param_estimate()
8. util_inverse_weibull_param_estimate()
9. util_inverse_pareto_param_estimate()
10. util_inverse_burr_param_estimate()
11. util_generalized_pareto_param_estimate()
12. util_generalized_beta_param_estimate()
13. util_zero_truncated_binomial_param_estimate()

Each function is tailored to a specific distribution, providing a streamlined way to estimate its parameters.

# Example: Estimating Parameters of a t Distribution

Let’s dive into an example using the util_t_param_estimate() function. Suppose you have data that you believe follows a t distribution. Here’s how you can estimate its parameters:

    library(dplyr)
    library(ggplot2)
    library(TidyDensity)

    set.seed(123)
    x <- rt(100, df = 10, ncp = 0.5)
    output <- util_t_param_estimate(x)

    # Display the estimated parameters
    print(output$parameter_tbl)

    # A tibble: 2 × 7
      dist_type      samp_size  mean variance method df_est ncp_est
      <chr>              <int> <dbl>    <dbl> <chr>   <dbl>   <dbl>
    1 T Distribution       100 0.612    0.949 MME     0.959   0.612
    2 T Distribution       100 0.612    0.949 MLE     8.32    0.571

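In this example, we generated 100 observations from a t distribution with 10 degrees of freedom (df) and a non-centrality parameter (ncp) of 0.5, then passed them to util_t_param_estimate(). The function returns one row per estimation method. Here the MLE (maximum likelihood) row, with df ≈ 8.32 and ncp ≈ 0.571, lands reasonably close to the true values, while the df estimate in the MME (method of moments) row, 0.959, is far from 10. It is worth comparing both rows rather than relying on a single method.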

The parameter_tbl in the output contains the estimated values, while combined_data_tbl can be used for visualization.
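
If you are curious what a maximum likelihood fit of this kind involves, here is a minimal base R sketch that minimises the negative log-likelihood of a non-central t distribution with optim(). It is an illustration only: the helper neg_loglik() and the starting values are my own assumptions, and this is not necessarily how util_t_param_estimate() computes its MLE row.

```r
# Simulate the same sample as in the example above
set.seed(123)
x <- rt(100, df = 10, ncp = 0.5)

# Negative log-likelihood of a non-central t distribution.
# theta[1] is log(df), so df stays positive during the search;
# theta[2] is the non-centrality parameter.
# (neg_loglik is a hypothetical helper, not part of TidyDensity.)
neg_loglik <- function(theta, data) {
  -sum(dt(data, df = exp(theta[1]), ncp = theta[2], log = TRUE))
}

# Numerical minimisation with optim() (Nelder-Mead is the default method).
# The starting values log(5) and 0 are arbitrary.
fit <- optim(par = c(log(5), 0), fn = neg_loglik, data = x)

# Back-transform to the original parameter scale
mle <- c(df = exp(fit$par[1]), ncp = fit$par[2])
mle
```

Comparing mle with the MLE row of parameter_tbl is a quick sanity check. Small differences are to be expected, since the optimiser, starting values, and tolerances may differ.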

## Visualizing the Results

To see how the estimated distributions compare with the sample, pass combined_data_tbl to tidy_combined_autoplot():

    # Visualize the combined data
    output$combined_data_tbl |>
      tidy_combined_autoplot(.interactive = TRUE)

The plot produced by this code visualizes the output of util_t_param_estimate() and shows how well the estimated t distributions fit our sample data. The x-axis represents the data values, while the y-axis shows the density. The different colors distinguish the empirical data from the estimated density functions.
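
If you prefer a quick static check without the interactive plot, the same comparison can be sketched in base R graphics. This swaps in hist(), curve(), and density() in place of tidy_combined_autoplot(), and it plugs in the MLE values copied from the parameter table above (df = 8.32, ncp = 0.571). The variable names df_hat and ncp_hat are just illustrative.

```r
# Same sample as in the example above
set.seed(123)
x <- rt(100, df = 10, ncp = 0.5)

# MLE values copied from the parameter table shown earlier
df_hat  <- 8.32
ncp_hat <- 0.571

# Histogram of the sample on the density scale
hist(x, breaks = 20, freq = FALSE,
     main = "Sample vs. fitted t density", xlab = "x")

# Overlay the fitted non-central t density (solid line)
curve(dt(x, df = df_hat, ncp = ncp_hat), add = TRUE, lwd = 2)

# Kernel density estimate of the sample for comparison (dashed line)
lines(density(x), lty = 2)
```

If the solid and dashed curves track each other closely, the fitted distribution is a plausible description of the sample.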

# How to Use the New Functions

Each of the new parameter estimate functions follows a similar approach. Here’s a step-by-step guide to get you started:

1. Prepare your data: Have your observations in a numeric vector, like x in the example above.
2. Select the appropriate function: Choose the function that matches the distribution you believe your data follows.
3. Estimate the parameters: Use the selected function to estimate the parameters.
4. Analyze and visualize: Review the estimated parameters and use the visualization functions to see how well the estimated distribution fits your data.
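
As a hedged sketch of this workflow with one of the other new functions, the snippet below simulates zero-truncated Poisson data in base R and passes it to util_zero_truncated_poisson_param_estimate(). The simulated data, lambda = 3, and the object names are arbitrary choices of mine. I am also assuming the returned list has the same shape as in the t example (a parameter_tbl element), so the code prints names() to confirm what actually comes back.

```r
# Data: Poisson draws with the zeros dropped follow a
# zero-truncated Poisson distribution (lambda = 3 is arbitrary).
set.seed(123)
counts <- rpois(500, lambda = 3)
counts <- counts[counts > 0]

# The remaining steps need TidyDensity; skip them if it is not installed.
if (requireNamespace("TidyDensity", quietly = TRUE)) {
  # Select the function matching the assumed distribution and estimate
  ztp <- TidyDensity::util_zero_truncated_poisson_param_estimate(counts)

  # Inspect which elements were returned (assumed to mirror the t example)
  print(names(ztp))

  # Review the estimated parameters
  print(ztp$parameter_tbl)
}
```

If the result also contains combined_data_tbl, it can be piped into tidy_combined_autoplot() just as in the t distribution example.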