Visualising Features of A&E Waiting Time Data Using Run Charts

February 27, 2014 | Alan

In analytics, run charts are used to identify and display trends in data over time. In this post we show how to use the AnalytiXagility platform to plot a basic run chart to visualise features of A&E attendance data over time.

Output

This exemplar describes how to produce the following plot:
[Figure: basic run chart]

Related Blog Posts

This post is one of a series that is intended to be read in sequence. The posts in the series are:

  1. The Use of Run Charts in Health Informatics
  2. Visualising Features of A&E Waiting Time Data Using Run Charts
  3. Visualising Variant CPs due to changing targets of A and E waiting time data using run charts

Data Source

The data used in this article was sourced from NHS England: Weekly A&E SitReps 2013-14. It has been cleaned up, converted to a CSV file, and loaded into the AnalytiXagility platform, where it is called ae_england_20140105. You can ask for a copy to be made available to your workspace.

Learning Outcomes

In this post we:

  • Demonstrate techniques used in the construction of a layered plot, including formatting labels and annotating the plot using ggplot2.
  • Explore the manipulation and consideration of date data using scales.
    Workflow

    Step 1 – Set up environment

    Load the relevant libraries:

    library(ggplot2)
    library(scales)
    

    Import the data into the R console workspace:

    ae_data <- xap.read_table("ae_england_20140105")
    

    Use the head() function to have a quick look at a section of the imported data:

    head(ae_data)
    

    The imported dataset ae_data includes the percentage of A&E attendees seen within 4 hours. The target set by NHS England is that 95% of attendees are seen within this time. Let’s look first at fluctuations around the target over time. To do this, we first use the class() function to examine the data types of ae_data$period and ae_data$percentagein4hoursorless_type1:

    class(ae_data$period)
    ## [1] "POSIXct" "POSIXt"
    
    class(ae_data$percentagein4hoursorless_type1)
    ## [1] "numeric"
    

    This tells us that ae_data$period is a POSIXct object. POSIXct is an S3 class for date-time representation.

    Define date and metric variables:

    run_date <- as.POSIXct(ae_data$period, format = "%d/%m/%Y")
    run_metric <- ae_data$percentagein4hoursorless_type1
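
    The format argument matters when dates arrive as character strings rather than as POSIXct values. A minimal sketch of that case follows, using made-up date strings and an assumed UTC time zone (neither is taken from the dataset):

    ```r
    # Hypothetical date strings in day/month/year order (not from ae_data)
    raw_dates <- c("06/01/2013", "13/01/2013", "20/01/2013")

    # Parse the strings; tz = "UTC" is an assumption made for reproducibility
    parsed <- as.POSIXct(raw_dates, format = "%d/%m/%Y", tz = "UTC")

    class(parsed)           # the same two classes reported for ae_data$period
    format(parsed, "%Y")    # year of each date, as character
    format(parsed, "%b-%Y") # month-year labels; month names depend on locale
    ```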
    

    To select a specific time period, use the format() function to extract the year from run_date and create a logical index. We will look at A&E attendances in 2012 and 2013, so we apply the derived index to subset run_date and run_metric:

    idx <- format(run_date, "%Y") == "2013" | format(run_date, "%Y") == "2012"
    run_date <- run_date[idx]
    run_metric <- run_metric[idx]
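
    The same kind of index can also be built with %in%, which may be easier to extend when more years are needed. The sketch below uses made-up dates and values rather than the A&E data, and assumes a UTC time zone:

    ```r
    # Hypothetical dates spanning several years (not from ae_data)
    run_date <- as.POSIXct(c("2011-12-25", "2012-06-03", "2013-01-06", "2014-01-05"),
                           tz = "UTC")
    run_metric <- c(96.1, 95.4, 94.8, 93.9)

    # Keep 2012 and 2013; %in% gives FALSE (not NA) for a missing date
    idx <- format(run_date, "%Y") %in% c("2012", "2013")
    run_date[idx]
    run_metric[idx]
    ```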
    

    As outlined in The Fundamentals of ggplot Explained, ggplot2 executes on data represented as a data frame. For plotting purposes, save the data as a data frame and name the columns accordingly:

    to_plot <- data.frame(run_date, run_metric)
    colnames(to_plot) <- c("run_date", "run_metric")
    

    Step 2 – Derive basic statistics

    Calculate the centre point (CP) line. As described in the documentation, for a run chart this is the median:

    CP <- median(to_plot$run_metric, na.rm = TRUE)
    

    Compute the upper and lower limits (UL and LL) for this dataset, set here at three standard deviations either side of the CP:

    std_dev <- sd(to_plot$run_metric, na.rm = TRUE)
    three_sd <- 3 * std_dev
    UL <- CP + three_sd
    LL <- CP - three_sd
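
    The centre point and limits can also be collected in one small helper. This is only a sketch: run_chart_stats() is a hypothetical name that does not come from the post or from any package, and the input values are made up for illustration:

    ```r
    # Hypothetical helper (not part of the original post): returns the centre
    # point and limits for a numeric vector, ignoring missing values
    run_chart_stats <- function(x, n_sd = 3) {
      cp <- median(x, na.rm = TRUE)
      spread <- n_sd * sd(x, na.rm = TRUE)
      c(CP = cp, UL = cp + spread, LL = cp - spread)
    }

    # Made-up percentages for illustration only
    toy_metric <- c(94, 95, 96, 97, 98, NA)
    stats <- run_chart_stats(toy_metric)
    stats
    ```

    With these toy values the upper limit comes out above 100, which is a reminder that limits built from standard deviations are not bounded by the 0 to 100 range of a percentage.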
    

    The target for A&E attendees seen within 4 hours is 95%:

    target <- 95
    

    Step 3 – Ready, set, plot!

    In the recent blog post The Fundamentals of ggplot Explained, we introduced the basic concepts of ggplot2 and layering. We can start constructing our plot by creating a ggplot() object with labels and setting the colour scheme:

    a <- ggplot() + ggtitle("Run Chart") + xlab("Time stamp") + ylab("Metric") + 
        theme(panel.background = element_rect(fill = "white", colour = "black"))
    

    Add some layers to a. Feed the data frame to_plot into the functions geom_point() and geom_line(), and set the aesthetic values x and y:

    b <- a + geom_point(data = to_plot, aes(x = run_date, y = run_metric)) + geom_line(data = to_plot, 
        aes(x = run_date, y = run_metric))
    b
    


    We can add features to the run chart by using geom_hline(); for example, to add the CP, UL, LL and target values derived in Step 2 as layers to plot b:

    c <- b + geom_hline(aes(yintercept = CP), colour = "red") + geom_hline(aes(yintercept = UL), 
        colour = "black") + geom_hline(aes(yintercept = LL), colour = "black") + 
        geom_hline(aes(yintercept = target), colour = "darkgreen")
    c
    



    Data in date format can be manipulated using functions from the scales library. Because run_date is a POSIXct date-time, we use scale_x_datetime() to vary the number of breaks on the x-axis and how they are labelled:

    d <- c + scale_x_datetime(breaks = date_breaks("months"), labels = date_format("%b-%Y"))
    d
    

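
    It can help to see that date_breaks() and date_format() each return a function, which scale_x_datetime() then applies to the axis range. A rough sketch outside any plot follows, using a made-up date range and an assumed UTC time zone:

    ```r
    library(scales)

    # Hypothetical date range (not from ae_data); tz = "UTC" is an assumption
    rng <- as.POSIXct(c("2012-01-01", "2012-06-01"), tz = "UTC")

    # date_breaks() returns a function that maps a range to break positions
    monthly <- date_breaks("1 month")
    brks <- monthly(rng)

    # date_format() returns a function that turns dates into label strings
    lab <- date_format("%m-%Y", tz = "UTC")
    lab(brks)
    ```

    Changing the width string (for example to "3 months") is one way to thin out crowded axis labels before resorting to rotation.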

    In plot d there are overlapping x-axis labels. To remedy this, use theme() to rotate text labels. In this case we have chosen 90 degrees:

    e <- d + theme(axis.text.x = element_text(angle = 90, hjust = 1))
    e
    


    Step 4 – Annotate plot

    We can add text to a plot using a geom_text() layer, so let’s append the statistic values derived in Step 2 to plot e. Each text group must have defined x and y co-ordinates, and we can use the data to identify these.

    UL_text <- paste("UL = ", round(UL, 2), sep = "")
    LL_text <- paste("LL = ", round(LL, 2), sep = "")
    CP_text <- paste("CP = ", CP, sep = "")
    target_text <- paste("target = ", target, sep = "")
    x_coord <- as.POSIXct("2012-02-01")
    
    f <- e + geom_text(aes(x = x_coord, y = UL - 0.5, label = UL_text), color = "black") + 
        geom_text(aes(x = x_coord, y = LL - 0.5, label = LL_text), color = "black") + 
        geom_text(aes(x = x_coord, y = target + 1.5, label = CP_text), color = "red") + 
        geom_text(aes(x = x_coord, y = target + 2, label = target_text), color = "darkgreen")
    f
    

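
    For a single fixed label, ggplot2 also provides annotate(), which takes the position and text directly rather than through aes(). The sketch below is an alternative to the geom_text() layers, not the method used for plot f, and it runs on made-up data:

    ```r
    library(ggplot2)

    # Made-up data for illustration only (not from ae_data)
    toy <- data.frame(
      run_date = as.POSIXct(c("2012-01-01", "2012-02-01", "2012-03-01"), tz = "UTC"),
      run_metric = c(95.2, 96.1, 94.7)
    )
    CP <- median(toy$run_metric)

    p <- ggplot(toy, aes(x = run_date, y = run_metric)) +
      geom_line() +
      geom_hline(yintercept = CP, colour = "red") +
      # annotate() draws one fixed label; no aes() mapping or data frame is needed
      annotate("text", x = toy$run_date[1], y = CP + 0.2,
               label = paste0("CP = ", CP), colour = "red", hjust = 0)
    p
    ```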

    Step 5 – Tracing changes

    Run charts are used extensively in analytics to track results from an implemented change. Let’s annotate our plot to reflect an implemented change and discuss the results. We first look at a single date, and then change over a period of time. To set a date to highlight on our plot and the text to label this, use:

    date_implemented <- as.POSIXct("2012-08-01")
    text_implemented <- "Change implemented"
    

    Use geom_vline() and geom_text() to add a line and a label marking the implemented change, placing the text just below the UL line:

    g <- f + geom_vline(aes(xintercept = as.numeric(date_implemented)), colour = "blue") + 
        geom_text(aes(x = date_implemented + 120, y = UL - 2, label = text_implemented), 
            color = "blue")
    g
    

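
    When positioning a label relative to a POSIXct date, note that adding a bare number shifts the date by that many seconds. The post does not say which offset was intended for the label, so the sketch below simply contrasts the two behaviours, assuming a UTC time zone:

    ```r
    # tz = "UTC" is an assumption so the results do not depend on local settings
    date_implemented <- as.POSIXct("2012-08-01", tz = "UTC")

    # Adding a bare number shifts a POSIXct value by that many seconds
    shift_seconds <- date_implemented + 120

    # A difftime makes the unit explicit, here 120 days
    shift_days <- date_implemented + as.difftime(120, units = "days")

    format(shift_seconds, "%Y-%m-%d %H:%M:%S")  # "2012-08-01 00:02:00"
    format(shift_days, "%Y-%m-%d")              # "2012-11-29"
    ```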

    To highlight a region of change, draw a transparent rectangle using geom_rect(), and feed in minimum and maximum x and y values. Define the start and end dates:

    start_date <- as.POSIXct("2013-06-01")
    end_date <- as.POSIXct("2013-07-01")
    

    Finally, we layer this onto plot g:

    h <- g + geom_rect(aes(alpha = "period", xmin = start_date, xmax = end_date, 
        ymin = -Inf, ymax = Inf), fill = "green", colour = "green")
    h
    

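
    In plot h, alpha is mapped inside aes() to the string "period", which ggplot2 treats as a discrete variable and therefore gives a legend entry. If a fixed transparency with no legend is preferred, one possible alternative is annotate("rect") with a constant alpha. The sketch below uses made-up data, and the value 0.2 is an arbitrary choice:

    ```r
    library(ggplot2)

    # Made-up data for illustration only (not from ae_data)
    toy <- data.frame(
      run_date = as.POSIXct(c("2013-05-01", "2013-06-01", "2013-07-01", "2013-08-01"),
                            tz = "UTC"),
      run_metric = c(95.2, 96.1, 94.7, 95.5)
    )
    start_date <- as.POSIXct("2013-06-01", tz = "UTC")
    end_date <- as.POSIXct("2013-07-01", tz = "UTC")

    p <- ggplot(toy, aes(x = run_date, y = run_metric)) +
      geom_line() +
      # alpha is a constant here, so no scale or legend is created for it
      annotate("rect", xmin = start_date, xmax = end_date,
               ymin = -Inf, ymax = Inf, fill = "green", alpha = 0.2)
    p
    ```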

    What’s next?

    We delve further into the world of run charts in the next blog post in the series, by building an algorithm that captures special cases to identify trends in the data, and wrapping it in a function.
    [Figure: advanced run chart visualisation]


 
