Taming Excel Dates in R: From Numbers to Meaningful Dates!

code
rtip
timeseries
Author

Steven P. Sanderson II, MPH

Published

February 5, 2024

Introduction

Have you ever battled with Excel’s quirky date formats in your R projects? If so, you’re not alone! Those cryptic numbers can be a real headache, but fear not, fellow R warriors! Today, we’ll conquer this challenge and transform those numbers into beautiful, usable dates.

Our Mission: We’ll convert two date columns in a tibble named “df”:

  1. date: Stored as numbers, representing days since some mysterious date.
  2. datetime: Also in numberland, but with an additional decimal for time.

Our Weapons:

  • as.Date(): This built-in R function is our date-conversion hero, but we need to give it a secret weapon: origin = "1899-12-30". This tells as.Date() where the Excel date system starts counting days from.
  • openxlsx library: This package helps us deal with Excel files. We’ll use its convertToDateTime() function to handle the datetime column, which includes both date and time information.

Let’s Code!

# Install and load the openxlsx library (if needed)
if (!require(openxlsx)) install.packages("openxlsx")
library(openxlsx)

# Our example data
df <- data.frame(
  date = c(44563, 44566, 44635, 44670, 44706, 44716, 44761, 44782, 44864, 44919),
  datetime = c(44563.17, 44566.51, 44635.64, 44670.40,
               44706.43, 44716.42, 44761.05, 44782.09,
               44864.19, 44919.89),
  sales = c(14, 19, 22, 29, 24, 25, 25, 30, 35, 28)
)

df
    date datetime sales
1  44563 44563.17    14
2  44566 44566.51    19
3  44635 44635.64    22
4  44670 44670.40    29
5  44706 44706.43    24
6  44716 44716.42    25
7  44761 44761.05    25
8  44782 44782.09    30
9  44864 44864.19    35
10 44919 44919.89    28
# Convert "date" column using as.Date() and the magic origin
df$date <- as.Date(df$date, origin = "1899-12-30")

# Convert "datetime" column using openxlsx and convertToDateTime()
df$datetime <- convertToDateTime(df$datetime)

Breaking it Down

  1. The first line checks if openxlsx is installed and loads it if needed.
  2. We create our sample data frame df with the date and datetime columns.
  3. The magic happens! We use as.Date() on df$date, specifying the origin as “1899-12-30”. This tells R to interpret the numbers as days since that date.
  4. For df$datetime, we use convertToDateTime() from the openxlsx package. This function handles both date and time information stored as decimals.

Voila! Our df now has proper date and datetime columns, ready for further analysis and visualization. Let’s see the results:

head(df, 1)
        date            datetime sales
1 2022-01-02 2022-01-02 04:04:48    14

You’re Turn!

Now it’s your turn! Grab your own Excel data with mysterious date formats and try this code. Play with different origin values if needed (depending on your Excel version). Remember, R is a playground, so have fun exploring and taming those dates!

Bonus Tip: Want to format your dates for readability? Use the format() function, like this:

df$date <- format(df$date, "%d/%m/%Y")
df
         date            datetime sales
1  02/01/2022 2022-01-02 04:04:48    14
2  05/01/2022 2022-01-05 12:14:24    19
3  15/03/2022 2022-03-15 15:21:36    22
4  19/04/2022 2022-04-19 09:36:00    29
5  25/05/2022 2022-05-25 10:19:12    24
6  04/06/2022 2022-06-04 10:04:48    25
7  19/07/2022 2022-07-19 01:12:00    25
8  09/08/2022 2022-08-09 02:09:36    30
9  30/10/2022 2022-10-30 04:33:36    35
10 24/12/2022 2022-12-24 21:21:36    28

This will display your dates in the familiar “day/month/year” format.

So there you have it, fellow R enthusiasts! With these tools, you can confidently handle Excel’s date quirks and unleash the power of your data. Happy coding!