# Testing stationarity with the ts_adf_test() function in R

Categories: rtip, healthyrts, timeseries

Author: Steven P. Sanderson II, MPH

Published: October 17, 2023

# Introduction

Hey there, R enthusiasts! Today, we’re going to dive into the fascinating world of time series analysis using the `ts_adf_test()` function from the `healthyR.ts` R library. If you’re into data, statistics, and R coding, this is a must-know tool for your arsenal.

# What’s the Deal with Augmented Dickey-Fuller?

Before we delve into the `ts_adf_test()` function, let’s understand the concept behind it. The Augmented Dickey-Fuller (ADF) test is a crucial tool in time series analysis. It’s like the Sherlock Holmes of time series data, helping us detect whether a series is stationary or not. Stationarity is a fundamental assumption in time series modeling because many models work best when applied to stationary data.

So, why “Augmented”? Well, it’s an extension of the original Dickey-Fuller test that accounts for more complex relationships within the time series data.
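To make "augmented" concrete, here is one common textbook form of the ADF regression. The constant and linear trend terms are an assumption about the specification, not something this post confirms for `ts_adf_test()`:

$$
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t
$$

The null hypothesis is $\gamma = 0$ (a unit root, so the series is non-stationary) and the alternative is $\gamma < 0$. The $k$ lagged differences $\Delta y_{t-i}$ are the "augmentation": they soak up serial correlation that the original Dickey-Fuller test ignores, and $k$ is the lag order. The test statistic is the t-ratio of the estimated $\gamma$.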

# The `ts_adf_test()` Function

Now, let’s get to the star of the show, the `ts_adf_test()` function. This function is part of the `healthyR.ts` library, and its primary job is to perform the ADF test on a given time series. In R, a time series can be represented as a numeric vector. Here’s the basic syntax:

```r
ts_adf_test(.x, .k = NULL)
```

- `.x` is your time series data, the numeric vector you want to analyze.
- `.k` is an optional parameter that allows you to specify the lag order. If you leave it empty (like `.k = NULL`), don’t worry; the function will calculate it for you based on the number of observations using a clever formula.
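The formula itself is not spelled out here. A common convention, used by `tseries::adf.test()`, is `trunc((n - 1)^(1/3))`, where `n` is the number of observations. The sketch below assumes `ts_adf_test()` follows the same rule, which this post does not confirm. `default_lag_order()` is a hypothetical helper name, not part of `healthyR.ts`.

```r
# Hypothetical helper: default lag order under the tseries::adf.test()
# convention, k = trunc((n - 1)^(1/3)), where n is the number of observations.
# Assumption: ts_adf_test() applies the same rule when .k = NULL.
default_lag_order <- function(x) {
  trunc((length(x) - 1)^(1 / 3))
}

default_lag_order(AirPassengers)  # 144 observations -> 5
default_lag_order(BJsales)        # 150 observations -> 5
```

Under this rule the lag order grows slowly with the length of the series, so even long series get only a handful of lags.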

# Show Me the Stats!

So, what does `ts_adf_test()` return? It gives you a list object containing two vital pieces of information:

1. Test Statistic: This is the heart of the ADF test. It measures the evidence against the null hypothesis that the series has a unit root (that is, that it is non-stationary). A more negative value indicates stronger evidence for stationarity.

2. P-Value: This is another critical number. It represents the probability of observing a test statistic at least as extreme as the one you obtained if the series really had a unit root. In simpler terms, a low p-value (commonly below 0.05) suggests that your data is likely stationary, while a high p-value means the test found no evidence against non-stationarity.
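As a rough sketch of how these two numbers become a decision, the helper below compares the p-value with a significance level. `interpret_adf()` is a hypothetical helper, not part of `healthyR.ts`, and the input lists are made-up stand-ins that only mimic the `test_stat` / `p_value` shape.

```r
# Hypothetical helper: turn an ADF result list into a plain-language verdict.
# Assumes `result` has the two elements described above: test_stat and p_value.
interpret_adf <- function(result, alpha = 0.05) {
  if (result$p_value < alpha) {
    "reject the unit-root null: evidence of stationarity"
  } else {
    "fail to reject the unit-root null: no evidence of stationarity"
  }
}

# Stand-in result lists with illustrative values; in practice these would
# come from ts_adf_test().
strong_evidence <- list(test_stat = -4.2, p_value = 0.01)
weak_evidence   <- list(test_stat = -1.3, p_value = 0.60)

interpret_adf(strong_evidence)
interpret_adf(weak_evidence)
```

Note the asymmetry: a high p-value does not prove non-stationarity, it only means the test could not rule it out.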

# Let’s Get Practical

Enough theory! Let’s see some action with a couple of examples. Say we have the `AirPassengers` and `BJsales` datasets, and we want to check their stationarity:

```r
library(healthyR.ts)

# ADF test for AirPassengers
result_air <- ts_adf_test(AirPassengers)
cat("AirPassengers ADF Test Result:\n")
print(result_air)
```

```
AirPassengers ADF Test Result:
$test_stat
[1] -7.318571

$p_value
[1] 0.01
```

```r
# ADF test for BJsales
result_bj <- ts_adf_test(BJsales)
print(result_bj)
```

```
$test_stat
[1] -2.110919

$p_value
[1] 0.5301832
```

In the `AirPassengers` example, we get a test statistic of -7.318571 and a p-value of 0.01. Because the p-value is below the usual 0.05 threshold, we reject the null hypothesis of a unit root, which the test reads as strong evidence for stationarity in this dataset.

However, for `BJsales`, we get a test statistic of -2.110919 and a p-value of 0.5301832. With a p-value this high we cannot reject the null hypothesis, so the test gives no evidence that the series is stationary.

Now let’s see what happens when we set the lag order to 1 instead of relying on the default.

```r
ts_adf_test(AirPassengers, 1)
```

```
$test_stat
[1] -7.652287

$p_value
[1] 0.01
```

```r
ts_adf_test(BJsales, 1)
```

```
$test_stat
[1] -1.316414

$p_value
[1] 0.8611925
```
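Since the numbers move with the lag order, it can help to tabulate the test over several lags at once. Here is a minimal sketch. It assumes only that the test function takes a series and a lag order and returns `test_stat` and `p_value`, as in the output above. `adf_by_lag()` is a hypothetical helper, not part of `healthyR.ts`.

```r
# Hypothetical helper: run an ADF-style test at several lag orders and
# collect the results in one data frame.
# `adf_fn` must accept (series, lag) and return a list with test_stat and
# p_value, which is the shape ts_adf_test() returns in the examples above.
adf_by_lag <- function(x, lags, adf_fn) {
  rows <- lapply(lags, function(k) {
    res <- adf_fn(x, k)
    data.frame(lag = k, test_stat = res$test_stat, p_value = res$p_value)
  })
  do.call(rbind, rows)
}

# Only run the real test when healthyR.ts is installed.
if (requireNamespace("healthyR.ts", quietly = TRUE)) {
  print(adf_by_lag(BJsales, 1:5, healthyR.ts::ts_adf_test))
}
```

If the verdict flips between neighbouring lags, treat the result with caution and look at the series itself before drawing conclusions.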

# Conclusion

The `ts_adf_test()` function in the `healthyR.ts` library is a valuable tool for any data scientist or R coder working with time series data. It helps you determine whether your data is stationary, a crucial step in building reliable time series models.

So, the next time you’re faced with a time series dataset, remember to call on your trusty companion, `ts_adf_test()`, to solve the mystery of stationarity. Happy coding, R enthusiasts!