
# Introduction

Bootstrap resampling is a powerful technique used in statistics and data analysis to estimate the uncertainty of a statistic by repeatedly sampling, with replacement, from the original data. In R, we can easily implement a bootstrap function using the `lapply` and `sample` functions. In this blog post, we will explore how to write a bootstrap function in R and provide an example using the “mpg” column from the popular “mtcars” dataset.

# Bootstrap Function Implementation

To create a bootstrap function in R, we can follow these steps:

## Step 1: Load the required dataset

Let’s begin by loading the “mtcars” dataset, which ships with R in the `datasets` package:

`data(mtcars)`

## Step 2: Define the bootstrap function

We’ll define a function called `bootstrap()` that takes two arguments: `data` (the input data vector) and `n` (the number of bootstrap iterations).

```
bootstrap <- function(data, n) {
  resampled_data <- lapply(seq_len(n), function(i) {
    resample <- sample(data, replace = TRUE)
    # Perform desired operations on the resampled data, e.g., compute a statistic
    # and return the result (here we return the resample itself)
    resample
  })
  return(resampled_data)
}

# Draw 5 bootstrap samples of the mpg column and print them
bootstrapped_samples <- bootstrap(mtcars$mpg, 5)
bootstrapped_samples
```

```
[[1]]
[1] 21.0 18.1 33.9 21.4 17.3 19.2 19.2 15.8 16.4 30.4 18.1 14.3 32.4 10.4 15.0
[16] 16.4 30.4 17.8 21.4 19.2 17.3 22.8 14.3 22.8 30.4 18.7 13.3 13.3 15.2 10.4
[31] 15.0 13.3
[[2]]
[1] 18.7 32.4 21.0 10.4 15.0 14.7 24.4 10.4 32.4 10.4 21.0 19.7 21.4 10.4 30.4
[16] 17.3 10.4 22.8 15.2 15.2 21.4 15.8 21.4 33.9 24.4 15.2 18.1 19.2 21.0 24.4
[31] 15.5 21.0
[[3]]
[1] 15.5 30.4 21.0 22.8 27.3 18.1 21.0 13.3 15.2 17.3 15.8 21.0 18.1 14.3 17.8
[16] 15.8 21.0 18.1 19.2 24.4 19.2 22.8 18.7 14.3 26.0 21.4 22.8 32.4 14.7 15.2
[31] 15.2 14.3
[[4]]
[1] 13.3 21.0 13.3 15.0 19.2 18.1 18.1 19.2 22.8 18.7 26.0 21.4 14.7 14.3 17.8
[16] 22.8 19.7 21.4 30.4 30.4 18.7 17.3 16.4 21.5 18.1 21.0 17.8 21.4 14.3 19.7
[31] 32.4 18.7
[[5]]
[1] 15.0 21.4 21.5 26.0 17.3 30.4 18.1 17.8 17.3 30.4 24.4 32.4 21.0 17.8 33.9
[16] 32.4 19.2 22.8 19.7 16.4 17.8 22.8 14.3 33.9 21.5 10.4 21.4 26.0 33.9 14.7
[31] 21.5 18.1
```

In the above code, we use `lapply` to generate a list of `n` resampled datasets. Inside the function passed to `lapply`, we use `sample` to draw randomly from the original data with replacement (`replace = TRUE`). Because we do not set the `size` argument, `sample` draws as many values as the input contains, so each resampled dataset has the same length as the original dataset.
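To see why the lengths match, here is a small sketch, assuming a toy vector `x` and an arbitrary seed (both chosen only for illustration). When `size` is omitted, `sample()` draws `length(x)` values, and `replace = TRUE` is what allows a value to appear more than once.

```r
# Toy vector and seed are illustrative assumptions, not part of the original example
set.seed(123)
x <- c(10, 20, 30, 40, 50)

# size defaults to length(x), so the resample has 5 elements
resample <- sample(x, replace = TRUE)
length(resample) == length(x)

# Every drawn value comes from x; duplicates are possible with replacement
all(resample %in% x)

# Without replacement, the default call is simply a permutation of x
permutation <- sample(x)
setequal(permutation, x)
```

A permutation always reproduces the original statistic for order-independent summaries such as the mean, which is why the bootstrap needs `replace = TRUE` to produce any variation.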

## Step 3: Perform desired operations on resampled data

Within the function passed to `lapply`, you can perform any desired operations on the resampled data. This could involve calculating statistics, fitting models, or conducting hypothesis tests. Customize the code inside that function to suit your specific needs.
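One way to make this customization explicit is to pass the statistic in as an argument. The sketch below is an assumption about how that could look, not the function used in this post: `bootstrap_stat` and its `stat` parameter are hypothetical names, and `vapply` is swapped in for `lapply` so that the result is a numeric vector rather than a list.

```r
# Hypothetical variant: the statistic is passed in as a function argument
bootstrap_stat <- function(data, n, stat = mean) {
  vapply(seq_len(n), function(i) {
    resample <- sample(data, replace = TRUE)
    stat(resample)  # apply the chosen statistic to this resample
  }, numeric(1))
}

set.seed(42)  # arbitrary seed, only for reproducibility
boot_medians <- bootstrap_stat(mtcars$mpg, n = 200, stat = median)

length(boot_medians)  # one median per bootstrap iteration
range(boot_medians)   # spread of the bootstrapped medians
```

Note that `vapply(..., numeric(1))` only works when `stat` returns a single number; for statistics that return vectors or model objects, `lapply` remains the more flexible choice.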

## Example: Bootstrapping the “mpg” column in mtcars

Let’s illustrate the usage of our bootstrap function by resampling the “mpg” column from the “mtcars” dataset. We will calculate the mean of each resampled dataset.

```
# Step 1: Load the dataset
data(mtcars)

# Step 2: Define the bootstrap function
bootstrap <- function(data, n) {
  resampled_data <- lapply(seq_len(n), function(i) {
    resample <- sample(data, replace = TRUE)
    mean(resample) # Calculate the mean of each resampled dataset
  })
  return(resampled_data)
}

# Step 3: Perform the bootstrap resampling
bootstrapped_means <- bootstrap(mtcars$mpg, n = 1000)

# Display the first few resampled means
head(bootstrapped_means)
```

```
[[1]]
[1] 20.21562
[[2]]
[1] 20.09375
[[3]]
[1] 19.59375
[[4]]
[1] 20.13437
[[5]]
[1] 21.17813
[[6]]
[1] 21.5375
```

In the above example, we resample the “mpg” column of the “mtcars” dataset 1000 times. The `bootstrap()` function calculates the mean of each resampled dataset and returns a list of resampled means. The `head()` function is then used to display the first few resampled means.
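Since the point of bootstrapping is to quantify uncertainty, a natural next step is to summarize these means. The following sketch assumes an arbitrary seed and repeats the `bootstrap()` definition so that it runs on its own. It flattens the list with `unlist()` and computes a bootstrap standard error and a 95% percentile interval for the mean of `mpg`; the names `boot_means`, `boot_se`, and `boot_ci` are introduced here for illustration.

```r
# Same idea as the bootstrap function above, repeated so this block runs standalone
bootstrap <- function(data, n) {
  lapply(seq_len(n), function(i) {
    resample <- sample(data, replace = TRUE)
    mean(resample)
  })
}

set.seed(2023)  # arbitrary seed, chosen only for reproducibility
bootstrapped_means <- bootstrap(mtcars$mpg, n = 1000)

# Flatten the list of means into a numeric vector
boot_means <- unlist(bootstrapped_means)

# Bootstrap estimate of the standard error of the mean
boot_se <- sd(boot_means)

# 95% percentile interval for the mean
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

The percentile interval is only one of several bootstrap interval constructions; it is used here because it follows directly from the resampled means.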

Of course, we do not have to compute a statistic inside the bootstrap function. We can instead return the raw bootstrap samples and summarize them afterwards. The following example pools the five resamples stored in `bootstrapped_samples` above and summarizes the pooled values.

```
quantile(unlist(bootstrapped_samples),
probs = c(0.025, 0.25, 0.5, 0.75, 0.975))
```

```
  2.5%    25%    50%    75%  97.5%
10.400 15.725 19.200 22.800 33.900
```

`mean(unlist(bootstrapped_samples))`

`[1] 20.06625`

`sd(unlist(bootstrapped_samples))`

`[1] 5.827239`
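Pooling with `unlist()` summarizes all resampled values together. If you instead want one statistic per bootstrap sample, a sketch like the following could work. It assumes a `bootstrap()` that returns the raw resamples (repeated here so the block is self-contained) and an arbitrary seed, so its numbers will differ from the output shown above.

```r
# Returns the raw resamples, one list element per bootstrap iteration
bootstrap <- function(data, n) {
  lapply(seq_len(n), function(i) sample(data, replace = TRUE))
}

set.seed(1)  # arbitrary seed, only for reproducibility
bootstrapped_samples <- bootstrap(mtcars$mpg, 5)

# One median per resampled dataset
sample_medians <- sapply(bootstrapped_samples, median)
sample_medians

# Because every resample has the same length, the mean of the
# per-sample means equals the mean of the pooled values
per_sample_means <- sapply(bootstrapped_samples, mean)
all.equal(mean(per_sample_means), mean(unlist(bootstrapped_samples)))
```

Keeping the raw samples costs more memory than storing one number per iteration, but it lets you try several statistics on exactly the same resamples.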

# Conclusion

In this blog post, we have learned how to write a bootstrap function in R using the `lapply` and `sample` functions. By employing these functions, we can easily generate resampled datasets to estimate the uncertainty of statistics or perform other desired operations. The example using the “mpg” column of the “mtcars” dataset demonstrated the usage of the bootstrap function to calculate resampled means. Feel free to customize the function to suit your specific needs and explore the power of bootstrap resampling in R.