Taming Excel Dates in R: From Numbers to Meaningful Dates!

code
rtip
timeseries
Author

Steven P. Sanderson II, MPH

Published

February 5, 2024

Introduction

Have you ever battled with Excel’s quirky date formats in your R projects? If so, you’re not alone! Those cryptic numbers can be a real headache, but fear not, fellow R warriors! Today, we’ll conquer this challenge and transform those numbers into beautiful, usable dates.

Our Mission: We’ll convert two date columns in a tibble named “df”:

1. `date`: Stored as numbers, representing days since some mysterious date.
2. `datetime`: Also in numberland, but with an additional decimal for time.

Our Weapons:

• `as.Date()`: This built-in R function is our date-conversion hero, but we need to give it a secret weapon: `origin = "1899-12-30"`. This tells `as.Date()` where the Excel date system starts counting days from.
• `openxlsx` library: This package helps us deal with Excel files. We’ll use its `convertToDateTime()` function to handle the `datetime` column, which includes both date and time information.

Let’s Code!

``````# Install and load the openxlsx library (if needed)
if (!require(openxlsx)) install.packages("openxlsx")
library(openxlsx)

# Our example data
df <- data.frame(
date = c(44563, 44566, 44635, 44670, 44706, 44716, 44761, 44782, 44864, 44919),
datetime = c(44563.17, 44566.51, 44635.64, 44670.40,
44706.43, 44716.42, 44761.05, 44782.09,
44864.19, 44919.89),
sales = c(14, 19, 22, 29, 24, 25, 25, 30, 35, 28)
)

df``````
``````    date datetime sales
1  44563 44563.17    14
2  44566 44566.51    19
3  44635 44635.64    22
4  44670 44670.40    29
5  44706 44706.43    24
6  44716 44716.42    25
7  44761 44761.05    25
8  44782 44782.09    30
9  44864 44864.19    35
10 44919 44919.89    28``````
``````# Convert "date" column using as.Date() and the magic origin
df\$date <- as.Date(df\$date, origin = "1899-12-30")

# Convert "datetime" column using openxlsx and convertToDateTime()
df\$datetime <- convertToDateTime(df\$datetime)``````

Breaking it Down

1. The first line checks if `openxlsx` is installed and loads it if needed.
2. We create our sample data frame `df` with the `date` and `datetime` columns.
3. The magic happens! We use `as.Date()` on `df\$date`, specifying the `origin` as “1899-12-30”. This tells R to interpret the numbers as days since that date.
4. For `df\$datetime`, we use `convertToDateTime()` from the `openxlsx` package. This function handles both date and time information stored as decimals.

Voila! Our `df` now has proper date and datetime columns, ready for further analysis and visualization. Let’s see the results:

``head(df, 1)``
``````        date            datetime sales
1 2022-01-02 2022-01-02 04:04:48    14``````

You’re Turn!

Now it’s your turn! Grab your own Excel data with mysterious date formats and try this code. Play with different `origin` values if needed (depending on your Excel version). Remember, R is a playground, so have fun exploring and taming those dates!

Bonus Tip: Want to format your dates for readability? Use the `format()` function, like this:

``````df\$date <- format(df\$date, "%d/%m/%Y")
df``````
``````         date            datetime sales
1  02/01/2022 2022-01-02 04:04:48    14
2  05/01/2022 2022-01-05 12:14:24    19
3  15/03/2022 2022-03-15 15:21:36    22
4  19/04/2022 2022-04-19 09:36:00    29
5  25/05/2022 2022-05-25 10:19:12    24
6  04/06/2022 2022-06-04 10:04:48    25
7  19/07/2022 2022-07-19 01:12:00    25
8  09/08/2022 2022-08-09 02:09:36    30
9  30/10/2022 2022-10-30 04:33:36    35
10 24/12/2022 2022-12-24 21:21:36    28``````

This will display your dates in the familiar “day/month/year” format.

So there you have it, fellow R enthusiasts! With these tools, you can confidently handle Excel’s date quirks and unleash the power of your data. Happy coding!