This blog provides details on working with dates in R. The examples used in this blog are based on arbitrarily chosen dates.
Working with dates in R can be troublesome, in particular because dates come in so many different formats. The built-in R functions handle most tasks, and there are also extension packages that make things easier. We first look at the built-in functions, before looking at one of the extension packages, lubridate. This blog introduces creating date objects with as.Date(), changing date formats with format(), some useful built-in date functions, numeric dates and their origin, and the lubridate package.
We can use the as.Date() function to create date objects in R for variables which do not have a time or time zone component. The input for this function depends on the data type of the variable to be converted to a date object. The as.Date() function accepts character strings, factors, logical NA and objects of classes ‘POSIXlt’ and ‘POSIXct’. When columns of type Date are imported into the AnalytiXagility platform, they are represented as class POSIXlt or POSIXct.
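As a rough sketch of the date-time case, the snippet below builds a POSIXct value by hand and drops its time component with as.Date(). The timestamp and time zone are arbitrary choices for illustration and are not taken from the platform.

```r
# A date-time value; the timestamp and time zone are arbitrary examples
stamp <- as.POSIXct("1990-11-18 14:30:00", tz = "UTC")
class(stamp)

# as.Date() keeps only the calendar date. Passing tz explicitly avoids
# surprises when the date-time is close to midnight in another time zone.
day <- as.Date(stamp, tz = "UTC")
day
class(day)

# POSIXlt objects are converted in the same way
day_lt <- as.Date(as.POSIXlt(stamp))
day_lt
```

Being explicit about the time zone matters mostly for values near midnight, where the calendar date can differ between zones.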
Consider the value 19901118 (which has format yyyymmdd), where there is no separation between the year, month and day. Before this can be defined as a date, it must first be converted to a character string:
date <- as.character(19901118)
The as.Date() function can now be applied. Because the string is not in the default format yyyy-mm-dd, its format must be specified. A list of possible formats is given in Step 2. This is the first step that would be carried out if the variable was of type character or factor. The type or class of the variable can be determined by applying the class() function.
date <- as.Date(date, "%Y%m%d")
If we now apply the class() function to this, we will see that it is of class Date:
class(date)
## [1] "Date"
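The same conversion applies to a whole column at once. The sketch below uses a made-up character vector and a made-up factor (the values are arbitrary) and shows that strings which do not match the supplied format become NA rather than raising an error.

```r
# Character input: the third element does not match the format
raw_chr <- c("19901118", "20090307", "not a date")
dates_chr <- as.Date(raw_chr, format = "%Y%m%d")
dates_chr
is.na(dates_chr)

# Factor input is converted through its character labels
raw_fct <- factor(c("18/11/1990", "07/03/2009"))
dates_fct <- as.Date(raw_fct, format = "%d/%m/%Y")
dates_fct
class(dates_fct)
```

Checking is.na() after a conversion is a cheap way to spot values that were in an unexpected format.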
A date which is in the default format (yyyy-mm-dd) can be converted to another format of your choice by using the format() function. Note that the result is a character string rather than a Date object. The code chunk below shows the syntax for this:
format(date, format = "form")
The following table shows the values that can be specified:
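As a quick illustration of the kind of values that can be supplied, the sketch below applies a handful of commonly used conversion specifications (documented in ?strptime) to one arbitrarily chosen date. Day and month names depend on the session's locale, so the comments on those lines assume an English locale.

```r
d <- as.Date("1990-11-18")

format(d, "%d")  # day of the month as a number: "18"
format(d, "%m")  # month as a number: "11"
format(d, "%y")  # year without century: "90"
format(d, "%Y")  # year with century: "1990"
format(d, "%b")  # abbreviated month name, "Nov" in an English locale
format(d, "%B")  # full month name, "November" in an English locale
format(d, "%a")  # abbreviated weekday name, "Sun" in an English locale
format(d, "%A")  # full weekday name, "Sunday" in an English locale
```

Any other characters in the format string, such as "/", "." or spaces, are copied to the output unchanged.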
The following examples illustrate some ways of using the format() function. Note that the input must be a Date object, which R prints in the default format (yyyy-mm-dd).
To convert the date 2009-03-07 to the default format yyyy-mm-dd:
date <- as.Date(as.character(20090307), "%Y%m%d")
date
## [1] "2009-03-07"
To convert the date 1990-11-18 to the format dd/mm/yy:
date <- as.Date(as.character(19901118), "%Y%m%d")
format(date, format = "%d/%m/%y")
## [1] "18/11/90"
To convert the date 1954-07-24 to the format %B %d %Y (full month name, day, four-digit year):
date <- as.Date(as.character(19540724), "%Y%m%d")
format(date, format = "%B %d %Y")
## [1] "July 24 1954"
To convert the date 1987-06-26 to the format dd.mm.yyyy:
date <- as.Date(as.character(19870626), "%Y%m%d")
format(date, format = "%d.%m.%Y")
## [1] "26.06.1987"
To convert the date 1957-05-25 to the format mmddyyyy:
date <- as.Date(as.character(19570525), "%Y%m%d")
format(date, format = "%m%d%Y")
## [1] "05251957"
There are a number of useful functions for working with dates in R. Examples of some of these are given in the code chunks below. Note that the input must be a Date object.
Sys.Date() – This function outputs today’s date in the default format. You can change this to a different format, if required, using the format() function. For more information on this, see Step 2:
today <- Sys.Date()
today
## [1] "2014-02-19"
weekdays() – This function outputs the day of the week that corresponds to the supplied input date:
weekdays(today)
## [1] "Wednesday"
months() – This function outputs the month that corresponds to the supplied input date:
months(today)
## [1] "February"
seq() – This function generates a sequence of dates. Its main arguments are from (the start date), to (the end date), by (the increment, such as "day", "week" or "2 months") and length.out (the number of dates to generate).
For example, to generate a sequence of six dates starting from 2014-01-18 and separated by a period of a day, use:
date <- as.Date(as.character(20140118), "%Y%m%d")
seq(date, by = "1 day", length.out = 6)
## [1] "2014-01-18" "2014-01-19" "2014-01-20" "2014-01-21" "2014-01-22"
## [6] "2014-01-23"
quarters() – This function outputs the quarter of the year that corresponds to the supplied input date:
quarters(today)
## [1] "Q1"
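julian() – This function outputs the number of days since the ‘origin’. For a Date input the origin should also be a Date, because mixing a Date (counted in days) with a POSIXct origin (counted in seconds) gives a misleading result. For more information about the origin, see Step 4:
julian(today, origin = as.Date("1970-01-01"))
## [1] 16120
## attr(,"origin")
## [1] "1970-01-01"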
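These helpers are vectorised, so they can be combined to summarise several dates at once. The following is a small sketch using an arbitrary start date; the column names are made up for illustration, and the day and month names assume an English locale.

```r
start <- as.Date("2014-01-18")

# Four dates one month apart, using the `by` and `length.out` arguments
dates <- seq(start, by = "month", length.out = 4)

summary_table <- data.frame(
  date    = dates,
  weekday = weekdays(dates),
  month   = months(dates),
  quarter = quarters(dates),
  # as.numeric() drops the "origin" attribute that julian() attaches
  days_since_origin = as.numeric(julian(dates))
)
summary_table
```

Keeping the Date column alongside the derived columns makes it easy to sort or filter by date later.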
As well as a character string, the input to as.Date() can be a number representing the number of days since a specific date. This specific date is known as the origin. By default in R, this is 1st January 1970. Other software, such as Excel and SAS, uses different origins. A negative number indicates that the date is before the origin, while a positive number indicates that the date is after the origin. The code chunk below shows the syntax for the as.Date() function when the input is in this numeric form:
as.Date(date, origin = "1970-01-01")
Note that the origin must be specified in the format yyyy-mm-dd, which is the default format for dates in R. In the code chunk below we consider a date value given as the numeric value 7626, which is the number of days that have passed since January 1st 1970. To define this as a date:
as.Date(7626, origin = "1970-01-01")
## [1] "1990-11-18"
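Because the origin differs between systems, the same number can mean very different dates. The sketch below converts the value 7626 under the R default, and then under the origins commonly used for SAS date values (1960-01-01) and for serial dates from Excel on Windows (1899-12-30, as noted in ?as.Date). Which origin applies to your own data is an assumption to check against the software that exported it.

```r
n <- 7626

as.Date(n, origin = "1970-01-01")   # R default origin: "1990-11-18"
as.Date(n, origin = "1960-01-01")   # same number read as a SAS date value
as.Date(n, origin = "1899-12-30")   # same number read as an Excel (Windows) serial date

# Negative numbers count backwards from the origin
as.Date(-1, origin = "1970-01-01")  # "1969-12-31"

# as.numeric() recovers the day count from a Date object
as.numeric(as.Date("1990-11-18"))   # 7626
```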
The R package lubridate has functions which allow you to specify the order in which year, month and day components appear in dates. To load this package, use the syntax:
library(lubridate)
These functions have the following syntax:
## [1] "1990-11-18 UTC"
## [1] "1990-11-18 UTC"
## [1] "1990-11-18 UTC"
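A minimal sketch of how such parsers are typically called is shown below, assuming the lubridate package is installed. The choice of ymd(), mdy() and dmy() and the input strings are illustrative rather than taken from the original code. Recent versions of lubridate return Date objects from these functions, so the printed result may not show a time zone.

```r
library(lubridate)

# Each function name gives the order of the components in the input
d1 <- ymd("19901118")      # year, month, day
d2 <- mdy("11/18/1990")    # month, day, year
d3 <- dmy("18-11-1990")    # day, month, year

# All three refer to the same calendar date
as.Date(d1) == as.Date("1990-11-18")
as.Date(d2) == as.Date(d1)
as.Date(d3) == as.Date(d1)
```

The parsers accept a range of separators, which is why the three inputs above can use different ones.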
This post has covered the basics of working with dates. Other posts look in more detail at working with date-times and formatting dates.