Key Update: TidyDensity 1.5.2 delivers significant performance improvements and enhanced mixture modeling capabilities, but introduces breaking changes.

TidyDensity 1.5.2 has arrived with substantial improvements that will transform how R programmers work with statistical distributions. Released on September 6, 2025, this update brings a fundamentally redesigned quantile_normalize() function and powerful new mixture modeling capabilities through an enhanced tidy_mixture_density(). While these changes offer compelling performance benefits, they also introduce breaking changes that require careful consideration for existing workflows.

What is TidyDensity?

TidyDensity is an R package designed to simplify the generation, analysis, and visualization of random numbers from statistical distributions within the tidyverse ecosystem. The package provides a consistent, tidy interface for working with distributional data, making it invaluable for simulation studies, statistical modeling, and exploratory data analysis.

Core capabilities include:

Generating tidy random samples from numerous distributions
Creating and analyzing mixture models
Performing quantile normalization for cross-sample comparability
Seamless integration with tidyverse workflows

Breaking Changes: The New `quantile_normalize()`

Performance Revolution

The most significant change in TidyDensity 1.5.2 is the complete redesign of the quantile_normalize() function. This algorithmic overhaul delivers substantial performance improvements.

Technical Implementation

The new algorithm leverages vectorized operations and indexing techniques, moving away from the classical approach that relied on memory-intensive intermediate storage. The redesign focuses on:

Reduced redundant sorting operations
In-place memory operations where possible
Optimized index mapping for restoring original order
Enhanced algorithmic efficiency for large datasets
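
To make the sort-and-index-mapping idea concrete, here is a minimal base R sketch of quantile normalization. It is an illustration only and not TidyDensity's internal code: the function name quantile_normalize_sketch() is hypothetical, and it assumes a numeric matrix with at least two rows and no missing values.

```r
# Illustrative quantile normalization in base R.
# NOT the package's implementation; the function name is hypothetical.
quantile_normalize_sketch <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m), nrow(m) >= 2, !anyNA(m))
  # Sort each column once; row i holds the i-th smallest value of every column.
  sorted <- apply(m, 2, sort)
  # The shared target distribution is the mean across columns at each rank.
  target <- rowMeans(sorted)
  # Within-column ranks act as the index map back to the original order.
  ranks <- apply(m, 2, rank, ties.method = "first")
  matrix(target[ranks], nrow = nrow(m), ncol = ncol(m), dimnames = dimnames(m))
}

set.seed(123)
m  <- matrix(rnorm(20), ncol = 4)
qn <- quantile_normalize_sketch(m)

# Every column now holds the same set of values, in a column-specific order.
apply(qn, 2, sort)
```

The single vectorized lookup target[ranks] replaces any per-element loop, which is the kind of index mapping the list above describes.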

Why Breaking Changes Occurred

The algorithmic improvements come with a trade-off: slightly different numerical outputs. While the statistical properties remain identical (same quantiles, same normalization effect), the exact element-wise values may differ between versions. The biggest difference is in the return value: the function now returns only the normalized data, whereas the old version returned the input data, the output data, and other intermediate information such as rows with duplicate ranks.

library(TidyDensity)

# Example: Both versions produce identical quantiles
data <- matrix(rnorm(20), ncol = 4)
normalized_data <- quantile_normalize(data)
print(normalized_data)

           [,1]       [,2]       [,3]       [,4]
[1,] -0.9980322  0.8648786  0.8648786  1.2291761
[2,] -0.5656369 -0.5656369  1.2291761  0.8648786
[3,]  1.2291761  0.2846513  0.2846513  0.2846513
[4,]  0.2846513  1.2291761 -0.9980322 -0.9980322
[5,]  0.8648786 -0.9980322 -0.5656369 -0.5656369

# All columns now share identical quantile distributions
# But individual elements may differ slightly from v1.5.1
as.data.frame(normalized_data) |>
  sapply(function(x) quantile(x, probs = seq(0, 1, 1 / 4)))

             V1         V2         V3         V4
0%   -0.9980322 -0.9980322 -0.9980322 -0.9980322
25%  -0.5656369 -0.5656369 -0.5656369 -0.5656369
50%   0.2846513  0.2846513  0.2846513  0.2846513
75%   0.8648786  0.8648786  0.8648786  0.8648786
100%  1.2291761  1.2291761  1.2291761  1.2291761

# Return in tibble format
quantile_normalize(
  data.frame(
    rand_norm_1 = rnorm(30),
    rand_norm_b = rnorm(30)
  ),
  .return_tibble = TRUE
)

# A tibble: 30 × 2
   rand_norm_1 rand_norm_b
         <dbl>       <dbl>
 1     -0.589       1.28  
 2     -1.17        1.04  
 3     -1.85       -0.0239
 4      0.899      -0.232 
 5      0.204       1.21  
 6      1.21       -0.796 
 7     -1.60        1.11  
 8      0.0248     -1.17  
 9     -1.22       -0.710 
10     -1.00        0.370 
# ℹ 20 more rows

Important: The quantile normalization properties are preserved, and all columns have identical quantiles after processing. Only the specific element arrangements differ.

New Features: Enhanced `tidy_mixture_density()`

Flexible Combination Types

TidyDensity 1.5.2 introduces a powerful .combination_type parameter to tidy_mixture_density(), enabling five different ways to combine distributions:

Combination Type	Description	Use Case
stack	Concatenate all data points (default)	Traditional mixture models
add	Element-wise addition	Additive effects modeling
subtract	Element-wise subtraction	Difference analysis
multiply	Element-wise multiplication	Interaction effects
divide	Element-wise division	Ratio analysis
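
As a rough mental model of what each option does to two samples, the table can be sketched in base R. The helper combine_sketch() is hypothetical and only mirrors the descriptions above; it is not how the package is implemented. It also assumes equal-length inputs for the element-wise types, since this post does not say how the package treats unequal lengths.

```r
# Hypothetical helper mirroring the table; not TidyDensity code.
combine_sketch <- function(x, y,
                           type = c("stack", "add", "subtract",
                                    "multiply", "divide")) {
  type <- match.arg(type)
  # Assumption: element-wise types need samples of equal length.
  if (type != "stack") stopifnot(length(x) == length(y))
  switch(type,
    stack    = c(x, y),  # concatenate all data points
    add      = x + y,    # element-wise addition
    subtract = x - y,    # element-wise subtraction
    multiply = x * y,    # element-wise multiplication
    divide   = x / y     # element-wise division
  )
}

set.seed(42)
x <- rnorm(5)
y <- rbeta(5, 0.5, 0.5)

length(combine_sketch(x, y, "stack"))  # 10: the two samples are concatenated
length(combine_sketch(x, y, "add"))    # 5: one value per pair of elements
```

This matches the row counts in the printed results that follow: stacking 100 and 50 draws gives 150 rows, while the element-wise types keep 50.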

Practical Examples

# Traditional mixture model (default behavior)
mix_stack <- tidy_mixture_density(
  rnorm(100, 0, 1), 
  tidy_normal(.mean = 5, .sd = 1),
  .combination_type = "stack"
)
mix_stack

$data
$data$dist_tbl
# A tibble: 150 × 2
       x      y
   <int>  <dbl>
 1     1 -0.609
 2     2 -0.370
 3     3 -0.308
 4     4 -0.786
 5     5  0.437
 6     6 -0.552
 7     7  0.303
 8     8 -0.652
 9     9 -0.144
10    10 -0.260
# ℹ 140 more rows

$data$dens_tbl
# A tibble: 150 × 2
       x        y
   <dbl>    <dbl>
 1 -4.28 0.000118
 2 -4.19 0.000171
 3 -4.10 0.000245
 4 -4.01 0.000349
 5 -3.91 0.000489
 6 -3.82 0.000677
 7 -3.73 0.000931
 8 -3.63 0.00126 
 9 -3.54 0.00170 
10 -3.45 0.00226 
# ℹ 140 more rows

$data$input_data
$data$input_data$`rnorm(100, 0, 1)`
  [1] -0.60865878 -0.37038002 -0.30760339 -0.78599279  0.43657861 -0.55247080
  [7]  0.30320995 -0.65226084 -0.14429342 -0.25965742 -1.57960296  0.07128014
 [13]  0.65531402  1.35367040 -0.30645307 -0.82656469  1.32659588  0.36869342
 [19]  0.31268606  1.84046365 -1.35549208 -0.15825175  0.68863337 -1.39859775
 [25] -1.07112427  1.45502151  0.06602545 -0.39876615  0.05499137  0.09214760
 [31]  0.38800665 -1.04310666 -0.93508809  0.78018540 -0.14736187  0.48487063
 [37] -0.71797977 -0.09083663  0.24619862  0.42560605 -0.91303163 -0.40070704
 [43] -0.09056107  2.12683480  0.97909343  0.25586273  0.06160965 -0.24959411
 [49] -0.63688175  0.61513865 -1.80508425 -0.10904217 -1.49586272  0.65779129
 [55] -0.21556674  1.45041449  1.64820547 -0.00864845  1.14990888 -0.14165598
 [61]  1.08637758 -0.47666081  0.31451903  1.59206247 -0.31551530 -1.60855895
 [67]  0.91927450 -0.56171737 -0.17915531  0.25223463  0.99074046  1.09265035
 [73] -0.42699577 -1.42269492  0.28942361 -0.93808071 -0.38747430 -1.04629553
 [79] -0.93624539 -0.89624495 -0.94646613  1.43409772  0.40376921  2.20170782
 [85]  0.14770417 -1.10348135 -0.84095040  0.95636639  1.13483275 -0.43345698
 [91]  0.77418611  0.24623017 -0.49152719  0.97051886 -0.40725688  0.02543623
 [97] -0.16623957 -1.15500277  1.37865589  1.67954844

$data$input_data$`tidy_normal(.mean = 5, .sd = 1)`
# A tibble: 50 × 7
   sim_number     x     y    dx       dy      p     q
   <fct>      <int> <dbl> <dbl>    <dbl>  <dbl> <dbl>
 1 1              1  6.52  2.33 0.000974 0.936   6.52
 2 1              2  3.39  2.45 0.00261  0.0538  3.39
 3 1              3  5.54  2.57 0.00629  0.706   5.54
 4 1              4  5.40  2.69 0.0136   0.655   5.40
 5 1              5  5.64  2.81 0.0262   0.739   5.64
 6 1              6  5.40  2.93 0.0457   0.656   5.40
 7 1              7  5.02  3.05 0.0718   0.508   5.02
 8 1              8  4.34  3.17 0.102    0.254   4.34
 9 1              9  4.81  3.29 0.133    0.423   4.81
10 1             10  4.08  3.41 0.160    0.179   4.08
# ℹ 40 more rows



$plots
$plots$line_plot


$plots$dens_plot



$input_fns
[1] "rnorm(100, 0, 1), tidy_normal(.mean = 5, .sd = 1)"

# Additive mixture for modeling combined effects
mix_additive <- tidy_mixture_density(
  rnorm(50), 
  rbeta(50, 0.5, 0.5), 
  .combination_type = "add"
)
mix_additive

$data
$data$dist_tbl
# A tibble: 50 × 2
       x       y
   <int>   <dbl>
 1     1  0.172 
 2     2 -0.385 
 3     3  0.388 
 4     4  0.168 
 5     5 -1.48  
 6     6  0.743 
 7     7  0.859 
 8     8  0.783 
 9     9 -0.0703
10    10 -0.992 
# ℹ 40 more rows

$data$dens_tbl
# A tibble: 50 × 2
       x        y
   <dbl>    <dbl>
 1 -2.41 0.000289
 2 -2.28 0.000893
 3 -2.16 0.00236 
 4 -2.04 0.00531 
 5 -1.91 0.0103  
 6 -1.79 0.0174  
 7 -1.66 0.0259  
 8 -1.54 0.0353  
 9 -1.41 0.0461  
10 -1.29 0.0597  
# ℹ 40 more rows

$data$input_data
$data$input_data$`rnorm(50)`
 [1] -0.28375667 -1.35693228 -0.44459842 -0.44189941 -1.75245474  0.56591611
 [7]  0.64969767  0.43861900 -0.29576390 -1.05083449  0.46476746 -0.09441544
[13]  0.01988715 -0.11553959 -1.06165613 -0.23215249 -0.73166009  0.85851074
[19] -0.23347946 -0.98707523  0.48980891  0.45443754 -0.06019617 -0.28090697
[25]  0.02640269  0.34780762  0.08271394  0.38223602  0.37200374  0.15833057
[31]  0.67451345  0.19271746 -0.76646273 -0.61174894 -0.66437076  0.41119339
[37]  0.94342842  1.79174540 -0.78712893  0.84426079  1.21105485 -1.08434366
[43]  0.34320348  1.51119066 -1.54429610 -0.53518346 -0.12958712  0.40503043
[49]  1.10792452 -0.35614745

$data$input_data$`rbeta(50, 0.5, 0.5)`
 [1] 0.4558286432 0.9722089635 0.8326968294 0.6098241262 0.2693505903
 [6] 0.1769712879 0.2097711862 0.3447365735 0.2255041322 0.0593279079
[11] 0.3605619162 0.7653921143 0.9300206371 0.0649047973 0.8248588302
[16] 0.4746486057 0.0007107969 0.0303139028 0.1092924293 0.9994465190
[21] 0.5447640613 0.6838621697 0.4264545201 0.0350483756 0.1439936791
[26] 0.9999991418 0.9971223513 0.9366023326 0.2380888988 0.3954270399
[31] 0.6355559970 0.0082225336 0.1935222058 0.8301693526 0.0006154042
[36] 0.9468539743 0.5805084634 0.9630710788 0.8536182424 0.0636560268
[41] 0.3383464734 0.8648131947 0.2472292868 0.7353653812 0.6462800353
[46] 0.3418460528 0.8706317638 0.0537689028 0.4028589675 0.5659417332



$plots
$plots$line_plot


$plots$dens_plot



$input_fns
[1] "rnorm(50), rbeta(50, 0.5, 0.5)"

# Multiplicative interactions
mix_multiplicative <- tidy_mixture_density(
  rnorm(50), 
  rbeta(50, 0.5, 0.5), 
  .combination_type = "multiply"
)
mix_multiplicative

$data
$data$dist_tbl
# A tibble: 50 × 2
       x       y
   <int>   <dbl>
 1     1  0.0128
 2     2 -0.352 
 3     3 -0.228 
 4     4  0.0318
 5     5  0.0846
 6     6  0.0148
 7     7  0.301 
 8     8  0.125 
 9     9 -0.196 
10    10  1.58  
# ℹ 40 more rows

$data$dens_tbl
# A tibble: 50 × 2
        x       y
    <dbl>   <dbl>
 1 -1.49  0.00113
 2 -1.42  0.00577
 3 -1.34  0.0217 
 4 -1.27  0.0605 
 5 -1.19  0.127  
 6 -1.12  0.205  
 7 -1.04  0.261  
 8 -0.968 0.272  
 9 -0.893 0.243  
10 -0.818 0.197  
# ℹ 40 more rows

$data$input_data
$data$input_data$`rnorm(50)`
 [1]  0.32370511 -1.21400213 -1.20899806  0.03519770  0.53932734  0.02665457
 [7]  0.41845197  0.14043657 -0.55713865  1.68772225  0.33995465 -1.38935947
[13]  0.74516592 -0.04187743 -1.86673573  0.14417007 -0.44909386 -0.46806361
[19] -1.36766494  0.19793102  0.85812595  0.38882190 -1.00188631  0.43322473
[25]  0.43850846  0.84118862  1.63111722 -0.80401369 -0.96695329  0.44273011
[31]  0.18966768  0.18008685  0.47594963  2.67993093  0.56240726 -0.68272322
[37]  2.07827084  2.87539786  0.07352364 -0.59474395  0.54737811  0.70341946
[43] -0.46216676 -1.04391929 -0.53857891  0.60391106 -1.06072413  0.36132956
[49] -1.24601342 -0.67495126

$data$input_data$`rbeta(50, 0.5, 0.5)`
 [1] 0.03961826 0.28965374 0.18885811 0.90263774 0.15690042 0.55506535
 [7] 0.71961247 0.88995670 0.35266007 0.93680182 0.79821917 0.79099196
[13] 0.47833454 0.88224189 0.06153885 0.96803949 0.16677721 0.08155236
[19] 0.78180736 0.89987410 0.06377253 0.99049015 0.92083908 0.81639438
[25] 0.11599055 0.89253034 0.99961689 0.06746365 0.94993344 0.93847486
[31] 0.04346734 0.97125746 0.01042851 0.07114067 0.04258062 0.31699967
[37] 0.57796230 0.62059110 0.28030112 0.95068023 0.21437376 0.46588256
[43] 0.60081481 0.98127380 0.03041642 0.67000409 0.72883674 0.26381406
[49] 0.43238751 0.07012001



$plots
$plots$line_plot


$plots$dens_plot



$input_fns
[1] "rnorm(50), rbeta(50, 0.5, 0.5)"

# Subtraction for differencing
mix_subtract <- tidy_mixture_density(
  rnorm(50),
  rbeta(50, 0.5, 0.5),
  .combination_type = "subtract"
)
mix_subtract

$data
$data$dist_tbl
# A tibble: 50 × 2
       x       y
   <int>   <dbl>
 1     1 -0.934 
 2     2  0.0255
 3     3 -0.807 
 4     4  0.639 
 5     5 -0.151 
 6     6  0.261 
 7     7  0.383 
 8     8 -1.06  
 9     9 -1.17  
10    10  0.868 
# ℹ 40 more rows

$data$dens_tbl
# A tibble: 50 × 2
       x        y
   <dbl>    <dbl>
 1 -3.33 0.000267
 2 -3.22 0.000687
 3 -3.10 0.00159 
 4 -2.99 0.00333 
 5 -2.88 0.00633 
 6 -2.76 0.0109  
 7 -2.65 0.0173  
 8 -2.54 0.0252  
 9 -2.43 0.0342  
10 -2.31 0.0442  
# ℹ 40 more rows

$data$input_data
$data$input_data$`rnorm(50)`
 [1]  0.04741171  0.78106971 -0.51649345  0.79333600  0.84698156  0.44755311
 [7]  0.42599173 -0.05861854 -0.17167764  1.86283568 -0.34825311  1.15277765
[13] -0.43388248 -0.44234758  0.18575088 -1.14766103 -1.00124274 -1.29743634
[19]  0.04227262  1.88358997 -1.03996542  0.01229659  0.54674453  0.78417878
[25] -0.68734596  1.46836234  1.17552232  0.21217058  0.58419970  1.79239641
[31] -0.15530648  0.77885429  1.54672370  1.11665693  0.35566983  0.52467994
[37]  0.30117165 -0.38017897 -0.35182655 -0.50842405  1.54094057  0.01395280
[43] -0.56581282 -0.36566571  0.98543508 -0.48095752 -0.08275619 -0.53918661
[49] -0.51094105  0.65497036

$data$input_data$`rbeta(50, 0.5, 0.5)`
 [1] 0.981019342 0.755544020 0.290997064 0.153894509 0.997501146 0.187018584
 [7] 0.042874119 0.997810266 0.998535547 0.994524920 0.049740075 0.053181972
[13] 0.009592371 0.127241021 0.385442338 0.141428843 0.966020263 0.998885522
[19] 0.194088533 0.709788033 0.479590987 0.346260596 0.958567049 0.796353667
[25] 0.928775994 0.981901862 0.241413100 0.061032775 0.789884132 0.895207272
[31] 0.011005119 0.931750374 0.761540209 0.632937614 0.959700133 0.065693893
[37] 0.654884437 0.980215158 0.887362439 0.347511106 0.982633970 0.108274936
[43] 0.898851715 0.602077489 0.104709882 0.286286952 0.181128702 0.670655445
[49] 0.287199855 0.910472770



$plots
$plots$line_plot


$plots$dens_plot



$input_fns
[1] "rnorm(50), rbeta(50, 0.5, 0.5)"

# Division for ratios
mix_divide <- tidy_mixture_density(
  rnorm(50),
  rbeta(50, 0.5, 0.5),
  .combination_type = "divide"
)
mix_divide

$data
$data$dist_tbl
# A tibble: 50 × 2
       x         y
   <int>     <dbl>
 1     1     3.00 
 2     2    -1.79 
 3     3     0.603
 4     4     1.36 
 5     5 25721.   
 6     6  9101.   
 7     7     1.80 
 8     8    -0.620
 9     9    -0.698
10    10     1.73 
# ℹ 40 more rows

$data$dens_tbl
# A tibble: 50 × 2
        x        y
    <dbl>    <dbl>
 1 -445.  6.46e- 3
 2   89.1 5.56e- 4
 3  623.  2.49e-20
 4 1157.  3.46e-20
 5 1691.  5.86e-19
 6 2225.  1.04e-19
 7 2759.  1.06e-18
 8 3293.  0       
 9 3827.  0       
10 4362.  0       
# ℹ 40 more rows

$data$input_data
$data$input_data$`rnorm(50)`
 [1]  1.14765189 -0.65037474  0.56076429  0.14529634  2.07643060  1.35312739
 [7]  1.36098704 -0.38382420 -0.60398679  1.72618260  1.02326275  0.53904945
[13]  0.58427997 -1.24784343  0.54468193  0.23675504 -0.06278437 -0.19938179
[19]  0.29774594 -1.73700726 -0.95278639  1.50377661  0.76641470 -1.64577160
[25] -1.87538645  0.20415661 -0.02288698  0.03017884 -1.11147919 -0.53853737
[31]  0.58745346  1.53857208 -0.71316156  0.30820280  1.12513966 -1.22796997
[37] -0.43473722 -1.17160252 -1.49085069 -0.97810140  0.77343726 -0.27780074
[43]  0.10606935  1.18324993 -0.66469916  0.77692003 -1.72619510  0.12750687
[49] -0.63319646 -0.44339251

$data$input_data$`rbeta(50, 0.5, 0.5)`
 [1] 3.820203e-01 3.628845e-01 9.300254e-01 1.067735e-01 8.072951e-05
 [6] 1.486745e-04 7.576313e-01 6.194406e-01 8.654029e-01 9.957012e-01
[11] 4.249757e-01 1.018361e-01 9.576092e-01 9.578569e-03 1.354495e-03
[16] 9.547469e-01 9.999004e-01 6.847354e-03 2.532576e-01 8.101000e-01
[21] 9.979821e-01 7.362087e-01 1.718800e-01 1.436152e-01 7.798669e-01
[26] 9.892079e-01 5.585185e-01 8.395558e-01 2.514851e-03 6.805656e-01
[31] 9.669103e-01 1.653675e-04 7.284081e-01 9.800184e-01 9.538534e-02
[36] 8.859544e-01 9.990685e-01 1.933901e-01 9.998451e-01 5.631759e-01
[41] 4.015837e-02 3.758858e-01 9.970344e-01 6.033153e-01 5.793981e-01
[46] 2.345981e-03 9.560933e-01 7.753198e-01 2.004754e-01 7.424075e-01



$plots
$plots$line_plot


$plots$dens_plot



$input_fns
[1] "rnorm(50), rbeta(50, 0.5, 0.5)"
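
The very large values in the divide result (rows 5 and 6 of dist_tbl) are not a bug. A Beta(0.5, 0.5) distribution is U-shaped, so some denominators land very close to zero. The short check below reuses the first, fifth, and sixth printed input values from the output above; the variable names are only for illustration.

```r
# Values copied from the printed rnorm(50) and rbeta(50, 0.5, 0.5) inputs
# (elements 1, 5 and 6 of each vector).
numerator   <- c(1.14765189, 2.07643060, 1.35312739)
denominator <- c(3.820203e-01, 8.072951e-05, 1.486745e-04)

ratio <- numerator / denominator
round(ratio, 2)

# The near-zero denominators dominate the range of the result,
# which is why the density grid above spans thousands of units.
range(ratio)
```

When working with ratios like this, robust summaries such as the median or selected quantiles are usually more informative than the mean.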

Each combination returns comprehensive output including:

Tidy data tables with combined distributions
Density estimates for visualization
Ready-to-use plots for immediate analysis
Input function metadata for reproducibility
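
The component names in the printed results above ($data, $plots, $input_fns) can be used to pull out individual pieces. The sketch below assumes TidyDensity 1.5.2 or later is installed and simply skips the example otherwise; the has_pkg flag is a hypothetical guard added for this illustration.

```r
# Guard: only run when a recent enough TidyDensity is available.
has_pkg <- requireNamespace("TidyDensity", quietly = TRUE) &&
  utils::packageVersion("TidyDensity") >= "1.5.2"

if (has_pkg) {
  library(TidyDensity)

  mix <- tidy_mixture_density(
    rnorm(50),
    rbeta(50, 0.5, 0.5),
    .combination_type = "add"
  )

  print(names(mix))               # top-level components
  print(head(mix$data$dist_tbl))  # combined values
  print(head(mix$data$dens_tbl))  # density estimate
  print(mix$input_fns)            # the calls that produced the inputs
  # mix$plots$line_plot and mix$plots$dens_plot hold the ready-made plots.
}
```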

Best Practices and Recommendations

For Existing Users

Gradually migrate critical workflows after thorough testing
Document any code that depends on exact quantile_normalize() outputs
Leverage new mixture modeling for more sophisticated statistical modeling
Test downstream analyses to ensure compatibility
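
One way to test code that used to depend on exact quantile_normalize() outputs is to assert the property that is preserved across versions, rather than the element-wise values. The helper below is a hypothetical base R sketch; its name and tolerance are assumptions, and it expects at least two rows with no missing values.

```r
# Hypothetical check: TRUE when every column holds the same sorted values,
# which is the property quantile normalization guarantees.
has_shared_quantiles <- function(m, tol = 1e-8) {
  m <- as.matrix(m)  # accepts a matrix, data frame, or tibble
  stopifnot(is.numeric(m), nrow(m) >= 2, !anyNA(m))
  sorted <- apply(m, 2, sort)
  all(abs(sorted - sorted[, 1]) <= tol)
}

# Hand-built examples, so the check runs without the package.
normalized     <- cbind(a = c(1, 2, 3), b = c(3, 1, 2), c = c(2, 3, 1))
not_normalized <- cbind(a = c(1, 2, 3), b = c(10, 20, 30))

has_shared_quantiles(normalized)      # TRUE
has_shared_quantiles(not_normalized)  # FALSE

# In a real test, pass the package output instead, for example:
# has_shared_quantiles(quantile_normalize(data))
```

A test written this way should keep passing even when a future release changes the exact element-wise values again.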

For New Users

Start with v1.5.2 to benefit from performance improvements immediately
Explore mixture modeling capabilities for creative statistical applications
Use in tidyverse pipelines for seamless data science workflows

Looking Forward

TidyDensity 1.5.2 represents a significant evolution in the package’s capabilities. The performance improvements in quantile_normalize() make it more suitable for large-scale data science applications, while the enhanced tidy_mixture_density() opens new possibilities for sophisticated statistical modeling.

The breaking changes, though initially challenging, position the package for better scalability and more efficient memory usage, crucial factors for modern data science workflows.

Conclusion

TidyDensity 1.5.2 delivers substantial improvements that will benefit R programmers working with statistical distributions. The 48.6% performance improvement in quantile_normalize() and flexible mixture modeling capabilities make this update highly valuable, despite the breaking changes.

Key takeaways:

✅ Significant performance gains across all dataset sizes
✅ Enhanced mixture modeling with five combination types
✅ Preserved statistical properties in quantile normalization
⚠️ Breaking changes require testing of existing workflows
🚀 Improved scalability for large-scale data science applications

Ready to upgrade? Update to TidyDensity 1.5.2 and test your critical workflows to ensure compatibility. The performance benefits and new capabilities make this update well worth the migration effort.