Working with Date-times and Time Zones in R

February 26, 2014 | Alan

This post shows how to work with dates in R that carry times and time zones. All variables of class Date that are imported into the AnalytixAgility platform are represented as date-times. An open dataset is used throughout the post to provide examples.

Data Source

The dataset used in this blog is the lakers data from the R package lubridate. It contains play-by-play statistics for each LA Lakers basketball game in the 2008-2009 season, including the date and the time on the game clock at which each play was made.

Learning Outcomes

This blog will introduce users to working with date-times and time zones in R, using both built-in functions and the functionality provided by the lubridate package. It will introduce:

  • Objects of the POSIX classes (POSIXct and POSIXlt)
  • Extracting date and time components from POSIX objects
  • Creating POSIXct objects using the ISOdate() function
  • Time zones

Workflow

Step 1- Reading in the data

The data used for this blog is from the R package lubridate. The code chunk below shows how to read this data into R. This automatically assigns the data to a variable lakers.

library(lubridate)
data(lakers)

The first few rows of this data can be viewed using the head() function:

head(lakers)
##       date opponent game_type  time period     etype team
## 1 20081028      POR      home 12:00      1 jump ball  OFF
## 2 20081028      POR      home 11:39      1      shot  LAL
## 3 20081028      POR      home 11:37      1   rebound  LAL
## 4 20081028      POR      home 11:25      1      shot  LAL
## 5 20081028      POR      home 11:23      1   rebound  LAL
## 6 20081028      POR      home 11:22      1      shot  LAL
##                player result points  type  x  y
## 1                                 0       NA NA
## 2           Pau Gasol missed      0  hook 23 13
## 3 Vladimir Radmanovic             0   off NA NA
## 4        Derek Fisher missed      0 layup 25  6
## 5           Pau Gasol             0   off NA NA
## 6           Pau Gasol   made      2  hook 25 10

Step 2 – Data Manipulation

For the examples in this blog, I am going to work through a few data manipulation steps so that we focus on a very small subset of the lakers data, containing only the date and time fields we are interested in. Firstly, many of the dates in this data are duplicates, and I want the dates in my sample to all be different. The code chunk below therefore creates a subset that keeps only the first row for each date (that is, rows whose date has not already appeared). This is assigned to the variable lakers_unique_dates. See ?duplicated for more information.

lakers_unique_dates <- lakers[!duplicated(lakers$date), ]
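
As a small illustration of how duplicated() behaves, the sketch below uses a made-up vector (toy_dates is hypothetical and is not part of the lakers data). duplicated() flags the second and later occurrences of a value, so negating it keeps the first occurrence of each.

```r
# Hypothetical toy vector standing in for lakers$date
toy_dates <- c(20081028, 20081028, 20081029, 20081029, 20081101)

# TRUE marks a value that has already appeared earlier in the vector
is_repeat <- duplicated(toy_dates)
is_repeat

# Keep only the first occurrence of each date
first_occurrences <- toy_dates[!is_repeat]
first_occurrences
```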

The code chunk below takes a subset of lakers_unique_dates that includes only the date and time fields, which is all that we need for this blog. We will then consider only the first 5 rows of this subset in the examples.

lakers_unique_dates_times <- subset(lakers_unique_dates, select = c(date, time))
lakers_unique_dates_times_subset <- lakers_unique_dates_times[1:5, ]
rownames(lakers_unique_dates_times_subset) <- NULL
lakers_unique_dates_times_subset
##       date  time
## 1 20081028 12:00
## 2 20081029 12:00
## 3 20081101 12:00
## 4 20081105 12:00
## 5 20081109 12:00

The next code chunk converts the integers in the date field to character strings and then converts those strings to class Date. See Working with Dates in R. When using the as.Date() function, we must specify the format of the input. Here it is “%Y%m%d”, a character string in which there is no separator between the year, month and day components. A table of the available format codes can be found in Working with Dates in R.

lakers_unique_dates_times_subset$date <- as.character(lakers_unique_dates_times_subset$date)
lakers_unique_dates_times_subset$date <- as.Date(lakers_unique_dates_times_subset$date, 
    "%Y%m%d")
lakers_unique_dates_times_subset
##         date  time
## 1 2008-10-28 12:00
## 2 2008-10-29 12:00
## 3 2008-11-01 12:00
## 4 2008-11-05 12:00
## 5 2008-11-09 12:00

This can also be done using the parse functions in the package lubridate. See Working with Dates in R.
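
A minimal sketch of the lubridate route, assuming the package is installed. The vector raw_dates is a hypothetical stand-in for the date column; ymd() is the lubridate parser for input in year, month, day order.

```r
library(lubridate)

# Hypothetical integer dates in the same layout as lakers$date
raw_dates <- c(20081028L, 20081029L, 20081101L)

# ymd() reads year-month-day input without an explicit format string
parsed_dates <- ymd(as.character(raw_dates))
parsed_dates
class(parsed_dates)
```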

The next code chunk will combine the date and time fields from the variable lakers_unique_dates_times_subset into one object. This will be assigned to the object dates_times, where the elements are character strings:

dates_times <- paste(lakers_unique_dates_times_subset$date, lakers_unique_dates_times_subset$time)
dates_times
## [1] "2008-10-28 12:00" "2008-10-29 12:00" "2008-11-01 12:00"
## [4] "2008-11-05 12:00" "2008-11-09 12:00"
class(dates_times)
## [1] "character"

Step 3- Objects of class POSIX

Objects of the POSIX classes are more precise than objects of class Date, because time is stored to the second rather than to the day. Any column of class Date that is imported into the RA is actually represented by a POSIX class (a date represented as a date-time), so in the RA an object of class Date becomes a POSIX object whose time components are zero. There are two POSIX date-time classes, which differ in how they store their elements:

  • Class POSIXct represents the (signed) number of seconds since the beginning of 1970 (in UTC) as a numeric vector. A time zone can be attached; if none is specified, R uses the current time zone of the session, which is why the output below shows “GMT” (Greenwich Mean Time).
  • Class POSIXlt is a named list of vectors representing seconds, minutes, hours, day of the month, month, year and so on. In the output below no time zone is printed because none was specified. Time zones will be covered in Step 6.
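
To make the "seconds since 1970" idea concrete, here is a minimal sketch in base R. The time zone is set to "UTC" explicitly so that the result does not depend on the machine it runs on; the variable names are illustrative only.

```r
# Build a POSIXct value with an explicit time zone
one_time <- as.POSIXct("2008-10-28 12:00:00", tz = "UTC")

# The underlying value is a count of seconds since 1970-01-01 00:00:00 UTC
seconds_since_epoch <- as.numeric(one_time)
seconds_since_epoch

# Rebuilding from the raw number recovers the same instant
rebuilt <- as.POSIXct(seconds_since_epoch, origin = "1970-01-01", tz = "UTC")
rebuilt == one_time
```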

To convert to class POSIXct, use the as.POSIXct() function. The code chunk below shows the application of this function to the object dates_times.

type_posixct <- as.POSIXct(dates_times)
type_posixct
## [1] "2008-10-28 12:00:00 GMT" "2008-10-29 12:00:00 GMT"
## [3] "2008-11-01 12:00:00 GMT" "2008-11-05 12:00:00 GMT"
## [5] "2008-11-09 12:00:00 GMT"
class(type_posixct)
## [1] "POSIXct" "POSIXt"
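
When the input may be messy, it can help to state the format and time zone explicitly. The sketch below is one way to do that, using a hypothetical vector raw_times; with a format supplied, strings that do not match come back as NA.

```r
# Hypothetical input: one well-formed string and one that cannot be parsed
raw_times <- c("2008-10-28 12:00", "not a date-time")

# With an explicit format, strings that do not match the format become NA
parsed_times <- as.POSIXct(raw_times, format = "%Y-%m-%d %H:%M", tz = "UTC")
parsed_times
is.na(parsed_times)
```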

To convert to class POSIXlt, use the as.POSIXlt() function. The code chunk below shows the application of this function to the object dates_times.

type_posixlt <- as.POSIXlt(dates_times)
type_posixlt
## [1] "2008-10-28 12:00:00" "2008-10-29 12:00:00" "2008-11-01 12:00:00"
## [4] "2008-11-05 12:00:00" "2008-11-09 12:00:00"
class(type_posixlt)
## [1] "POSIXlt" "POSIXt"

The difference between how elements are stored in classes POSIXct and POSIXlt can be seen by using the unclass() function in the code chunk below (see ?unclass). An object of class POSIXct is stored as a numeric vector, while an object of class POSIXlt is stored as a list.

unclass(type_posixct)
## [1] 1.225e+09 1.225e+09 1.226e+09 1.226e+09 1.226e+09
## attr(,"tzone")
## [1] ""
unclass(type_posixlt)
## $sec
## [1] 0 0 0 0 0
## 
## $min
## [1] 0 0 0 0 0
## 
## $hour
## [1] 12 12 12 12 12
## 
## $mday
## [1] 28 29  1  5  9
## 
## $mon
## [1]  9  9 10 10 10
## 
## $year
## [1] 108 108 108 108 108
## 
## $wday
## [1] 2 3 6 3 0
## 
## $yday
## [1] 301 302 305 309 313
## 
## $isdst
## [1] 0 0 0 0 0

Step 4- Extracting Components from Date-times

There are a couple of useful built-in functions in R which can be used to extract components from date-times.

The strptime() function converts character strings to class POSIXlt, from which the required component can then be extracted. In the code chunk below, we begin with the object type_posixct, which is of class POSIXct. Applying the strptime() function to this, we see that the result is of class POSIXlt. Note that the format codes %Y, %m, %d, %H, %M and %S represent years, months, days, hours, minutes and seconds respectively.

strptime(type_posixct, format = "%Y-%m-%d %H:%M:%S")
## [1] "2008-10-28 12:00:00" "2008-10-29 12:00:00" "2008-11-01 12:00:00"
## [4] "2008-11-05 12:00:00" "2008-11-09 12:00:00"
class(strptime(type_posixct, format = "%Y-%m-%d %H:%M:%S"))
## [1] "POSIXlt" "POSIXt"

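To extract the year component, use the following syntax. Note that the year element of a POSIXlt object is stored as the number of years since 1900, so 108 corresponds to 2008: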

strptime(type_posixct, format = "%Y-%m-%d %H:%M:%S")$year
## [1] 108 108 108 108 108

For the other date components of month and day of month, use “mon” and “mday” respectively. Note that “mon” counts from zero (0 for January to 11 for December), while “mday” runs from 1 to 31.

To extract the hours component, use the following syntax:

strptime(type_posixct, format = "%Y-%m-%d %H:%M:%S")$hour
## [1] 12 12 12 12 12

For the other time components of minutes and seconds, use the syntax “min” and “sec” respectively.
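
Because of these offsets, it is easy to misread the raw POSIXlt components. The following sketch, with illustrative variable names and an explicit "UTC" time zone, shows one way to turn them into ordinary calendar values.

```r
# A single date-time as POSIXlt, with the time zone stated explicitly
lt <- as.POSIXlt("2008-10-28 12:00:00", tz = "UTC")

calendar_year  <- lt$year + 1900  # year is stored as years since 1900
calendar_month <- lt$mon + 1      # mon is stored as 0 to 11
day_of_month   <- lt$mday         # mday is stored as 1 to 31
weekday_index  <- lt$wday         # wday is stored as 0 (Sunday) to 6 (Saturday)

c(calendar_year, calendar_month, day_of_month, weekday_index)
```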

The strftime() function converts POSIX objects to character vectors. The code chunk below shows how to extract the date component from the object type_posixct. We can see that the new object is of class character.

strftime(type_posixct, format = "%Y%m%d")
## [1] "20081028" "20081029" "20081101" "20081105" "20081109"
class(strftime(type_posixct, format = "%Y%m%d"))
## [1] "character"

The next code chunk shows how to extract the time component from the object type_posixct:

strftime(type_posixct, format = "%H:%M:%S")
## [1] "12:00:00" "12:00:00" "12:00:00" "12:00:00" "12:00:00"

The following code chunks will generate a set of objects that we will go on to use in Step 5. To extract the year components from type_posixct:

years <- strftime(type_posixct, format = "%Y")
years
## [1] "2008" "2008" "2008" "2008" "2008"

To extract the month components from type_posixct:

months <- strftime(type_posixct, format = "%m")
months
## [1] "10" "10" "11" "11" "11"

To extract the day components from type_posixct:

days <- strftime(type_posixct, format = "%d")
days
## [1] "28" "29" "01" "05" "09"

To extract the hours components from type_posixct:

hours <- strftime(type_posixct, format = "%H")
hours
## [1] "12" "12" "12" "12" "12"

To extract the minutes components from type_posixct:

mins <- strftime(type_posixct, format = "%M")
mins
## [1] "00" "00" "00" "00" "00"

To extract the seconds components from type_posixct:

secs <- strftime(type_posixct, format = "%S")
secs
## [1] "00" "00" "00" "00" "00"
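
All of the components above are character strings. If numbers are needed instead, one option is to wrap the call in as.integer(). The sketch below assumes a hypothetical helper called component() and a made-up vector stamps; format() is used here, which for POSIXct input produces the same kind of strings as strftime().

```r
# Made-up date-times with an explicit time zone
stamps <- as.POSIXct(c("2008-10-28 12:00:00", "2008-11-01 19:30:15"), tz = "UTC")

# Hypothetical helper: extract one component and convert it to an integer
component <- function(x, code) as.integer(format(x, code))

# Collect the numeric components in a data frame, one row per date-time
parts <- data.frame(
  year   = component(stamps, "%Y"),
  month  = component(stamps, "%m"),
  day    = component(stamps, "%d"),
  hour   = component(stamps, "%H"),
  minute = component(stamps, "%M"),
  second = component(stamps, "%S")
)
parts
```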

The lubridate package can also be used to extract components and, unlike the functions above, does not require a format to be specified. The package was loaded in Step 1, so we do not need to load it again here. It provides extraction functions such as year() and week(), which can be applied to a single date or to a vector of dates. For month and weekday you can also choose between full and abbreviated names. To find the abbreviated months:

month(type_posixct, label = TRUE)
## [1] Oct Oct Nov Nov Nov
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

To find the names of the weekdays in non-abbreviated form:

wday(type_posixct, label = TRUE, abbr = FALSE)
## [1] Tuesday   Wednesday Saturday  Wednesday Sunday   
## 7 Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < ... < Saturday
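
For reference, a short sketch of the other lubridate accessors applied to a single date-time. The object stamp is made up for the example, and the sketch assumes lubridate is installed.

```r
library(lubridate)

# Made-up date-time with an explicit time zone
stamp <- as.POSIXct("2008-10-28 12:00:00", tz = "UTC")

# Each accessor returns a number, with no offsets to correct for
components <- c(
  year   = year(stamp),
  month  = month(stamp),
  day    = mday(stamp),
  hour   = hour(stamp),
  minute = minute(stamp),
  second = second(stamp)
)
components
```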

Step 5- Creating POSIXct dates using the ISOdate() function

The ISOdate() function can be used to convert the components of a date-time into a POSIXct object. It takes the components in the order year, month, day, hour, minute, second. Trying to convert an invalid date-time will result in NA. It can be used to convert a single set of date-time components or an entire vector of them. We will first use it for a single set of components, matching the first element of the object type_posixct:

type_posixct[1]
## [1] "2008-10-28 12:00:00 GMT"
ISOdate(2008, 10, 28, 12, 0, 0)
## [1] "2008-10-28 12:00:00 GMT"

As an alternative approach we can use the vectors of years, months, days, etc that were generated in Step 4.

ISOdate(years, months, days, hours, mins, secs)
## [1] "2008-10-28 12:00:00 GMT" "2008-10-29 12:00:00 GMT"
## [3] "2008-11-01 12:00:00 GMT" "2008-11-05 12:00:00 GMT"
## [5] "2008-11-09 12:00:00 GMT"
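
The NA behaviour for invalid date-times can be checked with a short sketch. The variable names are illustrative; 2008 was a leap year, so 29 February exists but 30 February does not.

```r
# 2008 was a leap year, so this is a real date
valid_leap_day <- ISOdate(2008, 2, 29)

# 30 February never exists, so the result is NA
invalid_day <- ISOdate(2008, 2, 30)
is.na(invalid_day)

# When hour, min and sec are omitted, ISOdate() fills in 12:00:00 (tz "GMT")
format(valid_leap_day, "%H:%M:%S")
```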

Step 6- Time zones

The function Sys.time() returns the current date and time, including the time zone.

Sys.time()
## [1] "2014-02-03 12:10:26 GMT"

The function date() also returns the current date and time, including the day of the week, as a character string. Unlike Sys.time(), it does not show the time zone.

date()
## [1] "Mon Feb 03 12:10:26 2014"

The function Sys.timezone() returns the time zone.

Sys.timezone()
## [1] "GMT"

For a list of time zones recognised by R, see timezones.
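
The names can also be listed from within R. The sketch below uses the base function OlsonNames(); the exact list depends on the time zone database available on your system.

```r
# All time zone names known to this R installation
zone_names <- OlsonNames()
head(zone_names)

# Check that a particular zone is available before using it
"America/Los_Angeles" %in% zone_names
```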

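When converting an object to class POSIXct without specifying a time zone, R uses the current time zone of the session. In the examples above that is GMT (Greenwich Mean Time), which shows the same clock time as UTC (Coordinated Universal Time). To use a particular time zone regardless of the machine, supply the tz argument.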

To display a date-time in a different time zone, use the format() function with the tz argument; the result is a character vector. To show the object type_posixct in Los Angeles (US Pacific) time:

format(type_posixct, tz = "America/Los_Angeles")
## [1] "2008-10-28 05:00:00" "2008-10-29 05:00:00" "2008-11-01 05:00:00"
## [4] "2008-11-05 04:00:00" "2008-11-09 04:00:00"
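
Note that format() only changes how the time is displayed. The following sketch, using made-up object names, adds usetz = TRUE so that the zone abbreviation is printed. It also shows one way to keep a POSIXct object (rather than text) by changing the "tzone" attribute, which leaves the underlying instant untouched.

```r
# A date-time created explicitly in UTC
utc_time <- as.POSIXct("2008-11-05 12:00:00", tz = "UTC")

# format() returns text; usetz = TRUE appends the zone abbreviation
la_text <- format(utc_time, tz = "America/Los_Angeles", usetz = TRUE)
la_text
class(la_text)

# Changing the tzone attribute keeps the class and the instant,
# and only alters the clock time that is printed
la_time <- utc_time
attr(la_time, "tzone") <- "America/Los_Angeles"
la_time
as.numeric(la_time) == as.numeric(utc_time)
```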

Time zones can also be converted using the package lubridate. There are two functions to do this:

  1. with_tz() changes the clock time but the specific instant remains the same. In the example below, we can see that the time on the clock has been changed to the equivalent Los Angeles time.
    with_tz(type_posixct, "America/Los_Angeles")
    
    ## [1] "2008-10-28 05:00:00 PDT" "2008-10-29 05:00:00 PDT"
    ## [3] "2008-11-01 05:00:00 PDT" "2008-11-05 04:00:00 PST"
    ## [5] "2008-11-09 04:00:00 PST"
    
  2. force_tz() changes the actual instant in time but the displayed clock time remains the same. In the example below we can see that the clock times have remained the same, but we are alerted to the difference by the change in time zone.
    force_tz(type_posixct, "America/Los_Angeles")
    
    ## [1] "2008-10-28 12:00:00 PDT" "2008-10-29 12:00:00 PDT"
    ## [3] "2008-11-01 12:00:00 PDT" "2008-11-05 12:00:00 PST"
    ## [5] "2008-11-09 12:00:00 PST"
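
The difference between the two functions can be checked numerically. This is a minimal sketch with a made-up date-time, assuming lubridate is installed; on 5 November 2008 Los Angeles was on PST, which is 8 hours behind UTC.

```r
library(lubridate)

# A date-time created explicitly in UTC
utc_time <- as.POSIXct("2008-11-05 12:00:00", tz = "UTC")

# Same instant, shown on a Los Angeles clock
same_instant <- with_tz(utc_time, "America/Los_Angeles")

# Same clock reading, but now interpreted as Los Angeles time
new_instant <- force_tz(utc_time, "America/Los_Angeles")

# with_tz() leaves the instant unchanged
same_instant == utc_time

# force_tz() moves the instant by the zone offset
hours_shifted <- as.numeric(difftime(new_instant, utc_time, units = "hours"))
hours_shifted
```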
    

Care must be taken when working with time zones, as daylight saving time may need to be considered: in certain parts of the world the clocks are put forward an hour in the spring and back an hour in the autumn. This is why the Los Angeles times above switch from PDT to PST in early November. To avoid problems with this, it is best to use UTC, which does not observe daylight saving time.
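
A short sketch of the kind of surprise this can cause, using made-up object names. In 2008 the clocks in Los Angeles went back on 2 November, so adding exactly 24 hours to noon on 1 November lands on 11:00 local time, while the same calculation in UTC stays at noon.

```r
# Noon on 1 November 2008 in Los Angeles and in UTC
la_noon  <- as.POSIXct("2008-11-01 12:00:00", tz = "America/Los_Angeles")
utc_noon <- as.POSIXct("2008-11-01 12:00:00", tz = "UTC")

# Exactly 24 hours, expressed in seconds
one_day <- 24 * 60 * 60

# The Los Angeles clock shows a different hour after the clocks go back
la_next_day <- format(la_noon + one_day, "%Y-%m-%d %H:%M")
la_next_day

# The UTC clock is unaffected
utc_next_day <- format(utc_noon + one_day, "%Y-%m-%d %H:%M")
utc_next_day
```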

What’s next?

This post has covered the basics of working with date-times and timezones. Subsequent posts will look in more detail at formatting dates and date-times.
