Working with Dates in R

February 26, 2014 | Alan

This blog provides details on working with dates in R. The examples used in this blog are based on arbitrarily chosen dates.

Learning Outcomes

In this blog, we learn about working with dates in R. Dates can be troublesome to work with, in particular because they can be written in so many different formats. The built-in R functions are normally enough to deal with dates, but there are also add-on packages that make things easier. We first look at the built-in functions, before looking at one of these packages, lubridate. This blog introduces defining variables as dates, formatting dates, useful functions for working with dates, dates in numeric form with an origin, and the lubridate package.

Workflow

Step 1 – Defining variables as dates

We can use the as.Date() function to create date objects in R for variables which do not have a time or time zone component. The input for this function will depend on the data type of the variable to be converted to a date object. The as.Date() function accepts character strings, factors, logical NA and objects of classes ‘POSIXlt’ and ‘POSIXct’. When columns of type Date are imported into the AnalytiXagility platform, they are represented as class POSIXlt or POSIXct:

  • Class POSIXct represents the (signed) number of seconds since the beginning of 1970 as a numeric vector.
  • Class POSIXlt is a named list of vectors representing seconds, minutes, hours, day, month, year, etc.
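
To make the difference between the two classes concrete, here is a minimal sketch in base R. The timestamp is an arbitrary, hypothetical value, and the time zone is fixed to UTC (an assumption made only so that the numbers do not depend on local settings):

```r
# Hypothetical timestamp; UTC is set explicitly so the results do not
# depend on the machine's local time zone.
ct <- as.POSIXct("1990-11-18 12:00:00", tz = "UTC")

# POSIXct: a single number of seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(ct)

# POSIXlt: a list of components (mon counts from 0, year counts from 1900)
lt <- as.POSIXlt(ct)
parts <- c(day = lt$mday, month = lt$mon + 1, year = lt$year + 1900)

# Either class can be reduced to a Date, which drops the time of day
d_ct <- as.Date(ct)
d_lt <- as.Date(lt)
```

Both d_ct and d_lt hold the same calendar day, so as.Date() can be applied regardless of which of the two classes a column arrives in.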

Consider the number 19901118 (which has format yyyymmdd), where there is no separation between the year, month and day. Before this can be defined as a date, it must first be converted to a character string:

date <- as.character(19901118)

The as.Date() function can now be applied. Because the string is not in the default format (yyyy-mm-dd), its format must be specified, here as "%Y%m%d". A list of possible formats is given in Step 2. If the variable were already of type character or factor, this is the step you would start from. The type or class of a variable can be determined by applying the class() function:

date <- as.Date(date, "%Y%m%d")

If we now apply the class() function to this, we will see that it is of class Date:

class(date)
## [1] "Date"
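
The same conversion works on whole vectors, including factors. The sketch below uses made-up values (the vector name raw is hypothetical) and also shows what happens when a string does not match the supplied format:

```r
# Hypothetical factor column holding dates as yyyymmdd strings
raw <- factor(c("19901118", "20090307"))

# Convert to character first, then parse with an explicit format
dates <- as.Date(as.character(raw), format = "%Y%m%d")
date_class <- class(dates)

# A string that does not match the supplied format yields NA, not an error
bad <- as.Date("18/11/1990", format = "%Y%m%d")
```

Checking for NA values with is.na() after a conversion is a cheap way to spot entries that were written in a different format from the one expected.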

Step 2 – Formatting dates in R

A date which is in the default format (yyyy-mm-dd) can be converted to another format of your choice by using the format() function. The code chunk below shows the syntax for this:

format(date, format = "form")

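The following table shows some of the values that can be specified (see ?strptime for the full list). The examples are for the date 1990-11-18; month and weekday names depend on the locale:

Code   Meaning                          Example
%d     Day of the month (01-31)         18
%m     Month as a number (01-12)        11
%b     Abbreviated month name           Nov
%B     Full month name                  November
%y     Year without century             90
%Y     Year with century                1990
%a     Abbreviated weekday name         Sun
%A     Full weekday name                Sunday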

The following examples illustrate some ways of using the format() function. Note that the input must be a Date object, which R stores and prints in the default format (yyyy-mm-dd).

To convert the number 20090307 to a date in the default format yyyy-mm-dd:

date <- as.Date(as.character(20090307), "%Y%m%d")
date
## [1] "2009-03-07"

To convert the date 1990-11-18 to the format dd/mm/yy:

date <- as.Date(as.character(19901118), "%Y%m%d")
format(date, format = "%d/%m/%y")
## [1] "18/11/90"

To convert the date 1954-07-24 to the format Month dd yyyy:

date <- as.Date(as.character(19540724), "%Y%m%d")
format(date, format = "%B %d %Y")
## [1] "July 24 1954"

To convert the date 1987-06-26 to the format dd.mm.yyyy:

date <- as.Date(as.character(19870626), "%Y%m%d")
format(date, format = "%d.%m.%Y")
## [1] "26.06.1987"

To convert the date 1957-05-25 to the format mmddyyyy, with no separators:

date <- as.Date(as.character(19570525), "%Y%m%d")
format(date, format = "%m%d%Y")
## [1] "05251957"
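
Two points are worth illustrating beyond the examples above: format() works on a whole vector of dates, and its result is a character string rather than a Date. The following is a small sketch with arbitrary dates, showing the round trip back to Date objects with as.Date():

```r
# Arbitrary example dates in the default format
dates <- as.Date(c("1990-11-18", "2009-03-07"))

# format() is vectorised and returns character strings
uk <- format(dates, format = "%d/%m/%Y")

# Individual components can be pulled out the same way
years <- as.integer(format(dates, format = "%Y"))

# To get Date objects back, parse the strings with the same format
back <- as.Date(uk, format = "%d/%m/%Y")
```

Because the formatted values are plain text, they sort alphabetically rather than chronologically, so it is usually safer to keep the Date objects for calculations and format them only for display.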

Step 3 – Useful functions for working with dates

There are a number of useful functions for working with dates in R. Examples of some of these are given in the code chunks below. Note that the input must be a Date object, which R stores and prints in the default format (yyyy-mm-dd).

  1. Sys.Date() – This function outputs today’s date in the default format. You can change this to a different format, if required, using the format() function. For more information on this, see Step 2:
    today <- Sys.Date()
    today
    
    ## [1] "2014-02-19"
    
  2. weekdays() – This function outputs the day of the week that corresponds to the supplied input date:
    weekdays(today)
    
    ## [1] "Wednesday"
    
  3. months() – This function outputs the month that corresponds to the supplied input date:
    months(today)
    
    ## [1] "February"
    
  4. seq() – This function generates a sequence of dates. It takes parameters of:
  • a starting date
  • the period between dates in the sequence
  • the number of dates that are to be generated in the sequence.

For example, to generate a sequence of six dates starting from 2014-01-18 and separated by a period of a day, use:

date <- as.Date(as.character(20140118), "%Y%m%d")
seq(date, by = "1 day", length.out = 6)
## [1] "2014-01-18" "2014-01-19" "2014-01-20" "2014-01-21" "2014-01-22"
## [6] "2014-01-23"

5. quarters() – This function outputs the quarter of the year that corresponds to the supplied input date:

quarters(today)
## [1] "Q1"

6. julian() – This function outputs the number of days since the ‘origin’. When the input is a Date, the origin should also be a Date; a POSIXct origin is stored in seconds, so mixing the two gives a misleading result. For more information about the origin, see Step 4:

julian(today, origin = as.Date("1970-01-01"))
## [1] 16120
## attr(,"origin")
## [1] "1970-01-01"
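
As a further sketch of the functions in this step, seq() also accepts other periods and an end date in place of a length. The dates below are arbitrary, and the variable names are only illustrative:

```r
start <- as.Date("2014-01-18")

# Three dates, one month apart
monthly <- seq(start, by = "month", length.out = 3)

# Weekly dates from the start date up to (and including) an end date
weekly <- seq(start, to = as.Date("2014-02-08"), by = "week")

# The helper functions are vectorised, so they work on a whole sequence
monthly_quarters <- quarters(monthly)

# Number of days between the default origin (1970-01-01) and a Date
days_since_origin <- as.numeric(julian(as.Date("2014-02-19")))
```

Here as.numeric() is used only to drop the origin attribute that julian() attaches to its result.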

Step 4 – Dates in numeric form, with an origin specified

Dates are not always supplied as character strings. A date can also be stored as a number that represents the number of days since a specific date. This specific date is known as the origin. By default in R, this is 1st January 1970. Other software, such as Excel and SAS, uses different origins. A negative number indicates that the date is before the origin, while a positive number indicates that the date is after the origin. The code chunk below shows the syntax for the as.Date() function when the input is in this numeric form:

as.Date(date, origin = "1970-01-01")

Note that the origin must be specified in the format yyyy-mm-dd, which is the default format for dates in R. In the code chunk below we consider a date value given as the numeric value 7626, which is the number of days that have passed since January 1st 1970. To define this as a date:

as.Date(7626, origin = "1970-01-01")
## [1] "1990-11-18"
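
The conversion also works in the other direction, and with other origins. The sketch below assumes that SAS counts days from 1st January 1960; the value 11279 is a made-up example chosen to land on the same date as above, so check the origin used by the system your data actually came from:

```r
# From a Date back to the number of days since 1970-01-01
d <- as.Date("1990-11-18")
n <- as.numeric(d)

# A negative number gives a date before the origin
before <- as.Date(-1, origin = "1970-01-01")

# Assumption: the value was exported from SAS, which counts from 1960-01-01
sas_value <- 11279
from_sas <- as.Date(sas_value, origin = "1960-01-01")
```

The same day is represented by 7626 with R's origin and by 11279 with the 1960 origin, which is why the origin always has to be known before a numeric date can be interpreted.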

Step 5 – Using the R package lubridate

The R package lubridate has functions which allow you to specify the order in which year, month and day components appear in dates. To load this package, use the syntax:

library(lubridate)

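These functions have the following syntax. The output below includes a UTC time zone because these functions returned date-times when this post was written; recent versions of lubridate return Date objects by default, which print without the time zone: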

  1. ymd()
    ymd("1990/11/18")
    
    ## [1] "1990-11-18 UTC"
    
  2. mdy()
    mdy("11/18/1990")
    
    ## [1] "1990-11-18 UTC"
    
  3. dmy()
    dmy("18/11/1990")
    
    ## [1] "1990-11-18 UTC"
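
These parsers are fairly tolerant of separators, and lubridate also provides accessor functions for the individual components. The following sketch assumes lubridate is installed; as.Date() is applied before comparing so that the check does not depend on whether the installed version returns Date or date-time objects:

```r
library(lubridate)

# The same date written with different separators (or none at all)
d1 <- ymd("19901118")
d2 <- ymd("1990-11-18")
d3 <- dmy("18.11.1990")

# as.Date() makes the comparison independent of the returned class
same <- all(as.Date(d1) == as.Date(d2), as.Date(d2) == as.Date(d3))

# Accessor functions extract single components as numbers
components <- c(year(d1), month(d1), day(d1))
```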

What’s Next

This post has covered the basics of working with dates. Other posts look in more detail at working with date-times and formatting dates.


 

Comments (2)

  1. Bhaskar

    September 28, 2017 at 11:36 am

    how to calculate YTD and Last year same period in R?

    1. Pamela Brankin

      October 9, 2017 at 11:27 am

      Hi, thanks for your comment. We would recommend using the lubridate package introduced at the end of the blog post (http://lubridate.tidyverse.org/) for performing calculations with date objects. There’s also a blog post here that is trying to achieve something similar – http://lover4analytics.blogspot.co.uk/2017/02/ytd-mtd-qtd-wtd-calculations-using.html.

      Hope that helps!
      Pamela
