# Creating Summary Tables in R with tidyquant and dplyr

code
rtip
operations
Author

Steven P. Sanderson II, MPH

Published

July 26, 2024

# Introduction

Creating summary tables is a key part of data analysis, allowing you to see trends and patterns in your data. In this post, we’ll explore how to create these tables using tidyquant and dplyr in R. These packages make it easy to manipulate and summarize your data.

# Examples

## Using tidyquant for Summary Tables

tidyquant is a versatile package that extends the tidyverse for financial and time series analysis. It simplifies working with data by integrating tidy principles.

### Example: Calculating Average Price by Month

Here’s an example of how to calculate the average price by month using tidyquant:

``````# Load necessary libraries
library(tidyquant)
library(dplyr)

# Sample data: Daily stock prices
data <- tibble(
date = seq(as.Date('2023-01-01'), as.Date('2023-06-30'), by = 'day'),
price = runif(181, 100, 200)
)

# Create a summary table with average closing price by month
summary_table <- data |>
mutate(month = floor_date(date, "month")) |>
pivot_table(
.rows = month,
.values = ~ mean(price, na.rm = TRUE)
) |>
setNames(c("date", "avg_price"))

print(summary_table)``````
``````# A tibble: 6 × 2
date       avg_price
<date>         <dbl>
1 2023-01-01      149.
2 2023-02-01      162.
3 2023-03-01      151.
4 2023-04-01      151.
5 2023-05-01      145.
6 2023-06-01      149.``````

In this example:

1. tidyquant and tibble are loaded to handle data manipulation.
2. We create a sample dataset with daily stock prices.
3. The `mutate` function adds a new column `month`, which extracts the month from each date.
4. `pivot_table` calculates the average price for each month.
5. Finally, we rename the columns for clarity.

## Using dplyr for Summary Tables

dplyr is a core tidyverse package known for its powerful data manipulation functions. It helps streamline the process of filtering, summarizing, and mutating data.

### Example: Calculating Average Closing Price by Month

Here’s a similar example using dplyr:

``````# Load necessary libraries
library(dplyr)
library(lubridate)

# Sample data: Daily stock prices
data <- tibble(
date = seq(as.Date('2023-01-01'), as.Date('2023-06-30'), by = 'day'),
price = runif(181, 100, 200)
)

# Create a summary table with average closing price by month
summary_table <- data %>%
mutate(month = floor_date(date, "month")) %>%
group_by(month) %>%
summarise(avg_close = mean(price))

print(summary_table)``````
``````# A tibble: 6 × 2
month      avg_close
<date>         <dbl>
1 2023-01-01      149.
2 2023-02-01      140.
3 2023-03-01      147.
4 2023-04-01      146.
5 2023-05-01      147.
6 2023-06-01      151.``````

In this dplyr example:

1. We load dplyr and lubridate for data manipulation and date handling.
2. The dataset creation process is the same.
3. The `mutate` function is used to add a `month` column.
4. We group the data by month using `group_by` and then calculate the average closing price for each group using `summarise`.