Investigating Prescribing Data and Costs Using New Mini-apps and ggvis

Following on from last week’s post, where I introduced our new mini-apps capability, today I’m going to demonstrate how to create more advanced, interactive mini-apps. I will also introduce ggvis, a plotting library for creating interactive plots. I’ll do that by investigating prescribing data and related costs for three types of medication (statins, diabetes drugs and antidepressants) in London between 2011 and 2013. The result will be an interactive map that lets us explore the data in detail. Have a look at the video, then I’ll walk you through the steps I’ve taken to get there.



Learning outcomes

This post covers:

  • How to create more complicated mini-apps
  • How to use ggvis inside a mini-app
  • How to add tooltips in ggvis plots

Background

Statins

Statins are medicines which lower the level of cholesterol in the blood. High levels of ‘bad cholesterol’ can increase the risk of having a heart attack or stroke and of developing cardiovascular disease.
www.bbc.co.uk/news/health-18101554

The National Institute for Health and Care Excellence (NICE) says the scope for offering this treatment should be widened to save more lives. The NHS currently spends about £450 million a year on statins. If the draft recommendations go ahead, this bill will increase substantially, although the drugs have become significantly cheaper over the years. It is not clear precisely how many more people would be eligible for statin therapy under the new recommendations, but NICE says it could be many hundreds of thousands or millions.
www.bbc.co.uk/news/health-26132758

Antidepressants

The use of antidepressants rose significantly in England during the financial crisis and subsequent recession, with 12.5 million more pills prescribed in 2012 than in 2007, a study has found. Researchers from the Nuffield Trust and the Health Foundation identified a long-term trend of increasing prescription of antidepressants, rising from 15 million items in 1998 to 40 million in 2012. But the yearly rate of increase accelerated during the banking crisis and recession to 8.5%, compared to 6.7% before it.
www.theguardian.com/society/2014/may/28/-sp-antidepressant-use-soared-during-recession-uk-study

The report also found that rises in unemployment were associated with significant increases in the number of antidepressants dispensed and that areas with poor housing tended to see significantly higher antidepressant use.

Diabetes drugs

It is estimated that more than one in 17 people in the UK have diabetes. In 2014, 3.2 million people had been diagnosed, and by 2025 this number is estimated to grow to 5 million.
www.diabetes.org.uk/Documents/About%20Us/Statistics/Diabetes-key-stats-guidelines-April2014.pdf

Diabetes, when present in the body over many years, can give rise to all sorts of complications. These include heart disease, kidney disease, retinopathy and neuropathy. Diabetes is the leading cause of amputations.
www.diabetes.co.uk/diabetes-and-amputation.html

Data

The data we are using for this mini-app cover prescriptions of statins, diabetes drugs and antidepressants in London between 2011 and 2013. The initial dataset is available through HSCIC. The data were aggregated to count, for each year, how many items were prescribed and what they cost. The aggregated data were then filtered to include only the three drug types of interest for London.

Step 1 – Building the mini-app

The data are ready to use and loaded in the platform, so let’s build the mini-app.

The UI

We can start by designing our UI. We have chosen three widgets: a slider for selecting the year, a drop-down menu for selecting the variable to plot, and radio buttons for selecting the drug type. In this worked example we will leave the plot area empty for now and populate it later. We can divide the space into two columns, a smaller one for the widgets and a larger one for the plot:
ui.r

## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
  titlePanel("Prescribing data London"),
  fluidRow(
    column(width = 3,
           wellPanel(
             sliderInput("year", 
                         "Select year to visualise: ", 2011, 2013, 
                         value = 2011, 
                         animate = F),
             br(),             
             selectInput("var", 
                         label = "Select variable to visualise", 
                         choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"), 
                         selected = "items_perc"),
             br(),
             radioButtons("drug", 
                          label="Select the drug to visualise",
                          choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
                          selected="statins")
           )
    ),
    column(9,
           wellPanel()
    )
  )
)
)

With an empty shinyServer function, this will look like:

[Figure: 1_empty_presc_map_ldn]

The server

We now have a functional UI with all the required widgets, so let’s start adding some functionality to the shinyServer function. We can start by reading in the data. Since the data won’t change from session to session, we don’t need to read them inside a reactive expression:
server.r

# Load required libraries
library(dplyr)
library(ggvis)
# Set miniapp option
options(dplyr.length = 1e10)
# Miniapp server function
shinyServer(function(input, output, session) {
  ## Read the data 
  df <- xap.read_table("presc_bnf_ccg_summary_demo_ldn")
  }
)

It is good practice to divide up the functionality into small modules, so that’s what we’ll do.

Filtering the data according to the input

The first task is to filter the data according to the user’s selection. We define a reactive function for this, which we’ll call select_data():

  ## Filter the data according to the values of the widgets
  select_data <- reactive({

    # Get the widgets values
    yr <- input$year 
    var <- input$var
    drug <- input$drug

    # Filter the data
    selected_data <-  df  %>%
      filter(year == yr & case == drug) %>%
      select(ccg_code, id, year, case, act_cost, nic, ccg13nm, long, lat, order, group, items_perc, act_cost_perc) 

    # Select variable to be plotted
    if (var == "items_perc"){
      selected_data <- selected_data %>% mutate(var = items_perc)
    } 
    else if (var == "act_cost_perc"){
      selected_data <- selected_data %>% mutate(var = act_cost_perc)
    }

    # Return the selected data
    selected_data
  })

Here we filter df according to the selected year and the selected type of drug. Afterwards we add an extra column (named var) with the selected variable to be plotted. This extra column is going to help us create the map plot independently from the selected variable.
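
As a self-contained illustration of the same filter-then-copy pattern, here is a minimal sketch that runs outside Shiny on a small made-up data frame. The values are invented, and pick_variable() is a hypothetical helper that is not part of the mini-app. It also shows one possible alternative to the if/else chain: dplyr’s .data pronoun can look a column up by its name.

```r
library(dplyr)

# Toy data with invented values; the column names mirror the ones used above
toy <- data.frame(
  year          = c(2011, 2011, 2012, 2012),
  case          = c("statins", "diabetes", "statins", "statins"),
  items_perc    = c(4.1, 2.3, 4.4, 3.9),
  act_cost_perc = c(1.2, 3.5, 1.1, 1.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: filter by year and drug, then copy the chosen
# column into a generic column called var
pick_variable <- function(data, yr, drug, var_name) {
  data %>%
    filter(year == yr, case == drug) %>%
    mutate(var = .data[[var_name]])
}

res <- pick_variable(toy, 2012, "statins", "act_cost_perc")
```

Because the plotted values always end up in var, any later step can stay the same whichever variable the user selects.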

Summarising the data

Next, create a reactive function, summarise_data(), that summarises our data. This summary is the basis for the colour palette we will need to colour the map.

  ## Summarise the data   
  summarise_data <- reactive({

    # Get selected data
    selected_data <- select_data()

    # Group the data by CCG and calculate avg value of the selected variable
    summarised_data <- selected_data %>%
      group_by(ccg_code) %>%
      summarise(var=mean(var, na.rm = TRUE)) %>%
      arrange(var)

    # Return
    summarised_data
  })
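
To see what this summary step produces, here is a small sketch on invented data (the CCG codes "A", "B" and "C" and the values are made up). It uses the same group_by(), summarise() and arrange() chain, outside any reactive context.

```r
library(dplyr)

# Invented example: several rows per CCG, one missing value
toy <- data.frame(
  ccg_code = c("A", "A", "B", "B", "C"),
  var      = c(2, 2, 5, NA, 1),
  stringsAsFactors = FALSE
)

# One row per CCG, sorted from the lowest to the highest average
summary_tbl <- toy %>%
  group_by(ccg_code) %>%
  summarise(var = mean(var, na.rm = TRUE)) %>%
  arrange(var)
```

The result has one row per CCG, and na.rm = TRUE means the missing value for "B" is ignored rather than turning its average into NA.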

Creating splines to segment the data and colour the map

Our data are ready to be plotted, but we still need to colour the map according to the range of values of var. Essentially, we want to divide this range automatically into three intervals that each contain about a third of the CCGs, representing low, moderate and high values. These three intervals will be coloured green, orange and red respectively. We can find the break points with the quantile function.

  ## Get splines to segment and colour data by
  get_splines <- reactive({

    # Get the summarised data (one value per CCG)
    c_palette <- summarise_data()

    # Define tertile break points (0%, 33%, 67%, 100%) to segment and colour data by
    c_splines <- quantile(c_palette$var, probs = seq(0, 1, 1/3), na.rm = TRUE) 

    # Return splines
    c_splines
  })
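
The following sketch shows what quantile() returns for these probabilities, using nine invented values in base R. The two inner break points split the values into thirds.

```r
# Nine invented values standing in for the per-CCG averages
x <- c(2.1, 3.4, 1.8, 5.0, 4.2, 2.9, 3.8, 4.6, 1.5)

# Four break points: minimum, 1/3 quantile, 2/3 quantile, maximum
breaks <- quantile(x, probs = seq(0, 1, 1/3), na.rm = TRUE)

# Count how many values fall in each third
n_low      <- sum(x <= breaks[2])
n_moderate <- sum(x > breaks[2] & x <= breaks[3])
n_high     <- sum(x > breaks[3])
```

With nine distinct values, each of the three groups holds three of them. With ties or other sample sizes the groups are only approximately equal in size.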

Add colours to the selected data

Working out the break points is not enough. We also have to segment the data points and assign a colour to each one, so we first label each data point as low, moderate or high. Then we create a gradient scale of colours within each of these intervals. This gives us the colour palette, which we combine with our initial data.

 ## Add the colour palette to the selected dataset   
  add_palette <- reactive({

    # Get selected data, summarised data as a basis for the colour palette and the tertile break points
    selected_data <- select_data()
    c_palette <- summarise_data()
    c_splines <- get_splines()

    # Define colour groups; include.lowest = TRUE keeps a value equal to the lowest break (0) in "low"
    colour_group <- cut(
      c_palette$var, 
      c(0, c_splines[2:3], max(c_palette$var, na.rm = TRUE)), 
      labels = c("low", "moderate", "high"),
      include.lowest = TRUE
    )
    c_palette$col_grp <- colour_group

    # Define gradient colours within each of the colour groups.
    # This relies on c_palette being sorted by var (see summarise_data), so the groups are contiguous.
    c_palette$col_code <- c(colorRampPalette(c("#c8f1cb", "#4ad254"))(nrow(c_palette%>% filter(col_grp=="low"))),
                            colorRampPalette(c("#ffdb99","#ffb732"))(nrow(c_palette%>% filter(col_grp=="moderate"))),
                            colorRampPalette(c("#f69494","#ee2a2a"))(nrow(c_palette%>% filter(col_grp=="high"))))

    # Combine the selected data with the colour palette, matching on the CCG code only
    # (joining on the numeric var column as well could fail on tiny rounding differences)
    selected_data <- left_join(selected_data, select(c_palette, -var), by = "ccg_code")

    # Return the resulted data frame
    selected_data
  })
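
As a standalone illustration of the labelling and gradient steps, here is a base R sketch on six invented CCG values. It is a variation on the code above rather than a copy of it: colours are assigned group by group through an index, so it does not depend on the rows being sorted. The names ccg and ramps are hypothetical.

```r
# Six invented CCG averages
ccg <- data.frame(
  ccg_code = c("A", "B", "C", "D", "E", "F"),
  var      = c(1.0, 1.5, 2.5, 3.0, 4.5, 5.0),
  stringsAsFactors = FALSE
)

# Tertile break points, then a low / moderate / high label per CCG
breaks <- quantile(ccg$var, probs = seq(0, 1, 1/3))
ccg$col_grp <- cut(ccg$var, breaks = breaks,
                   labels = c("low", "moderate", "high"),
                   include.lowest = TRUE)

# Light-to-dark colour ramp for each group (same hex codes as above)
ramps <- list(
  low      = c("#c8f1cb", "#4ad254"),
  moderate = c("#ffdb99", "#ffb732"),
  high     = c("#f69494", "#ee2a2a")
)

# Within each group, give lower values the lighter shades
ccg$col_code <- NA_character_
for (grp in names(ramps)) {
  idx <- which(ccg$col_grp == grp)
  idx <- idx[order(ccg$var[idx])]
  ccg$col_code[idx] <- colorRampPalette(ramps[[grp]])(length(idx))
}
```

Each CCG ends up with one hex colour, so the map can later use the colour column directly without a scale.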

Map title

We are almost ready to move onto creating our map, but before that we want to add a dynamic title and some information about the colour scales and what values they represent. We can create a separate function for this task:

## A function that creates the map's title
create_maps_title <- function(maps_data, year = NULL, var = NULL, drug = NULL, c_splines) {
  # Set colours
  g_col = "#4ad254"
  o_col = "#ffb732"
  r_col = "#ee2a2a"
  # Set names to appear for certain options
  if (drug=="antidepressants"){
    drug = "anti-depressants"
  }
  else if (drug=="diabetes"){
    drug = "diabetic drugs"
  }
  # Create title's HTML code
  if (var=="items_perc"){
    lab1 <- paste0("Percentage of ", drug, " prescribed in ", year, " across populations in London")
    title <- paste0("<h3> ",
                    lab1,
                    "</h3>",
                    "<h4>",
                    "Groups: ",
                    " <font color='",g_col,"'>0 < ", round(c_splines[2],2),"%</font>",
                    " <font color='",o_col,"'>", round(c_splines[2],2), " < ", round(c_splines[3],2),"%</font>",
                    " <font color='",r_col, "'>", round(c_splines[3],2), "% +</font>",
                    "</h4>"
    )
  }
  else if (var=="act_cost_perc"){
    lab1 <- paste0("Average cost (£ per person) of ", drug, " prescribed in ", year, " corrected to CCG population sizes across London")
    title <- paste0("<h3> ",
                    lab1,
                    "</h3>",
                    "<h4>",
                    "Groups: ",
                    " <font color='",g_col,"'>£0 < ", round(c_splines[2],2),"</font>",
                    " <font color='",o_col,"'>£", round(c_splines[2],2), " < ", round(c_splines[3],2),"</font>",
                    " <font color='",r_col, "'>£", round(c_splines[3],2), "+</font>",
                    "</h4>"
    )
  }
  # Return title
  return(title)    
}      

We then assign its result to an output variable through a render function:

  ## Add the map title to the UI
  output$map_title <- renderUI(
    HTML(
      create_maps_title(add_palette(), input$year, input$var, input$drug, get_splines())       
    )
  )

Finally, we have to add the title to the shinyUI function, in the space we reserved for it:
ui.r

## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
  titlePanel("Prescribing data London"),
  fluidRow(
    column(width = 3,
           wellPanel(
             sliderInput("year", 
                         "Select year to visualise: ", 2011, 2013, 
                         value = 2011, 
                         animate = F),
             br(),             
             selectInput("var", 
                         label = "Select variable to visualise", 
                         choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"), 
                         selected = "items_perc"),
             br(),
             radioButtons("drug", 
                          label="Select the drug to visualise",
                          choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
                          selected="statins")
           )
    ),
    column(9,
           wellPanel(
             htmlOutput("map_title", inline=F)
           )
    )
  )
)
)

[Figure: 2_title_presc_map_ldn]

Step 2 – Interactive ggvis plotting library

Now we are finally ready to move on to the actual goal of this article: creating the interactive map. Before that, we should briefly explore the ggvis interactive plotting library in order to better understand how to build our visualisation.

The ggvis basics

ggvis is an R package for creating interactive graphics. The underlying logic is similar to ggplot2, although its syntax is different. ggvis works on top of dplyr, which makes the connection between data manipulation and plotting easier.

Every ggvis visualisation uses the function ggvis(). The first argument in this function is the dataset being used, in the format of a data frame, and the rest of the arguments specify how the data will be mapped to the visual properties of ggvis (for example, which field goes to the x-axis).

ggvis(mtcars, x = ~wt, y = ~mpg)

To map a column of the data, rather than a single value, to a visual property, an R formula must be used (for example ~wt). The example above, although a valid ggvis visualisation object, doesn’t include any information about how to plot the data. To give that information we need to specify a layer. ggvis offers many options, such as layer_points(), layer_bars(), layer_lines() and layer_text(). These functions take the ggvis visualisation object as their first argument. Further arguments that specify other visual properties can be given here.

vis <- ggvis(mtcars, x = ~wt, y = ~mpg)
layer_points(vis, fill := 'grey')

[Figure: 3_initial_ggvis_example]
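
The example above sets the fill with := rather than =. As a brief illustration of the difference: in ggvis, = maps a value through a scale, while := sets the raw value directly. The sketch below builds both variants; the object names scaled and fixed are only for illustration.

```r
library(ggvis)

# fill = ... : the cyl column is mapped through a colour scale,
# so ggvis picks the colours and can draw a legend
scaled <- mtcars %>%
  ggvis(x = ~wt, y = ~mpg) %>%
  layer_points(fill = ~factor(cyl))

# fill := ... : the value is used as-is, so every point is grey
fixed <- mtcars %>%
  ggvis(x = ~wt, y = ~mpg) %>%
  layer_points(fill := "grey")
```

Printing either object in an interactive session renders the plot. The := form is also the one to use when a column already contains ready-made colour codes.
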
The available layers include:

Layer            Description
layer_points()   produces a scatter plot
layer_lines()    produces a line plot, connecting all the given data points
layer_paths()    produces lines if fill is empty, and polygons if it is set to a value
layer_bars()     produces a bar plot
layer_text()     produces text at the specified positions

These layers, although only a subset of those available in ggplot2, are enough to create most visualisations.

As mentioned above, ggvis works on top of dplyr, and it makes use of the same %>% pipe operator (which originally comes from the magrittr package). All ggvis functions take a ggvis visualisation object as their first argument, usually one already created by another ggvis function. To simplify the syntax, we can use the pipe operator to pass that first argument to the next function and avoid deeply nested calls:

ggvis(mtcars, x = ~wt, y = ~mpg) %>% layer_points(fill := 'grey')

Another advantage of having this operator available is that we can use dplyr within ggvis calls.
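
As a small sketch of what that looks like, the pipeline below filters and transforms mtcars with dplyr before handing the result to ggvis. The wt_kg column and the conversion factor are only there for illustration.

```r
library(dplyr)
library(ggvis)

# Data manipulation and plotting in one pipeline
p <- mtcars %>%
  filter(cyl == 4) %>%                      # keep four-cylinder cars only
  mutate(wt_kg = wt * 1000 * 0.4536) %>%    # wt is in units of 1000 lb
  ggvis(x = ~wt_kg, y = ~mpg) %>%
  layer_points(fill := "grey")
```

Nothing is drawn until p is printed, so the same pipeline can be built inside a reactive expression and bound to a mini-app later.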

Tooltips with ggvis

The main reason that ggvis plots are more interactive than ggplot2 plots is the tooltip functionality. ggvis plots allow us to create tooltips that are triggered by an event, either hovering over an item or clicking it. These tooltips can be extremely useful for displaying extra information that couldn’t fit in the initial plot, or an interpretation of what the plot demonstrates.

Tooltips are added to ggvis plots with the function add_tooltip(). This function takes three arguments:

  • The ggvis visualisation.
  • A function that takes a single argument x and returns the HTML tooltip to be displayed. The argument x is a one-row data frame created by ggvis that represents the mark that is currently under the mouse. The return value should either be a string that contains valid HTML, or NULL if we don’t want a tooltip to appear.
  • The event that we want to trigger the tooltip. It can take the values "hover", "click" or c("hover", "click").

A tooltip function for the example above:

tooltip <- function(x){
  tip <- paste0("Wt: ", x$wt, "<br> Mpg: ", x$mpg)
  return(tip)
}
ggvis(mtcars, x = ~wt, y = ~mpg) %>% 
  layer_points(fill := 'grey') %>% 
  add_tooltip(tooltip, "hover")

The argument x of the tooltip function is a data frame that contains information about the mark under the mouse. This information normally includes the x and y axis values, the render details such as colour or shape if there are any, and any grouping information available. Within mini-apps we often want to display additional information about the current data point that exists in the initial data but is not included in the x argument. For that reason it is useful to define the tooltip function inside the shinyServer function, so that it has access to the initial data.
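
One common way to reach that additional information is to pass a row identifier to ggvis as a key and look the full row up inside the tooltip function. The sketch below does this for mtcars; the id and model columns and the car_tooltip() name are assumptions made for this example, not part of the mini-app.

```r
library(ggvis)

# Add a row id and the car name to the data
cars <- mtcars
cars$id    <- seq_len(nrow(cars))
cars$model <- rownames(mtcars)

# x only carries the plotted properties plus the key,
# so use the key to fetch the full row from the original data
car_tooltip <- function(x) {
  if (is.null(x) || is.null(x$id)) return(NULL)
  row <- cars[cars$id == x$id, ]
  paste0("<b>", row$model, "</b><br>",
         "Wt: ", row$wt, "<br>",
         "Mpg: ", row$mpg)
}

p <- cars %>%
  ggvis(x = ~wt, y = ~mpg, key := ~id) %>%
  layer_points(fill := "grey") %>%
  add_tooltip(car_tooltip, "hover")
```

Returning NULL when x carries no id means no tooltip is shown for marks that cannot be matched to a row.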

Creating our map with ggvis

With a better understanding of the ggvis plotting library, we can proceed to creating our map:

## Create ggvis interactive map  
vis <- reactive({

    # Get selected data
    selected_data <- add_palette()

    # Create the ggvis object
    selected_data %>%
      arrange(order) %>%
      group_by(ccg_code) %>%
      ggvis() %>%
      layer_paths(x = ~long, y = ~lat, fill := ~col_code) 
  })

Essentially we are creating a map plot by ordering the data points, and grouping them by their CCG code. Using the path layer we are assigning the longitude and latitude variables to the x and y axes respectively, and we are colouring the groups by the colour variable we calculated earlier. Now let’s create a tooltip function with some additional information:
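
To make the role of the ordering and grouping clearer, here is a toy sketch that draws two adjacent squares in place of CCG boundaries. The coordinates, region names and the squares data frame are invented. Each group becomes one closed polygon, and its points are connected in the order they appear in the data.

```r
library(dplyr)
library(ggvis)

# Two unit squares, each described by four corner points in drawing order
squares <- data.frame(
  region   = rep(c("A", "B"), each = 4),
  order    = rep(1:4, times = 2),
  long     = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat      = c(0, 0, 1, 1,  0, 0, 1, 1),
  col_code = rep(c("#4ad254", "#ee2a2a"), each = 4),
  stringsAsFactors = FALSE
)

p <- squares %>%
  arrange(region, order) %>%   # corner order matters for the outline
  group_by(region) %>%         # one polygon per region
  ggvis() %>%
  layer_paths(x = ~long, y = ~lat, fill := ~col_code)
```

Without the group_by() step the eight points would be joined into a single path, and without the ordering the outline could cross itself.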

## Tooltip function. x is a one-row data frame describing the mark that is currently triggering the tooltip
  map_tooltip <- function(x) {

    # Return an empty tooltip if x is empty
    if(is.null(x)) return(NULL)

    # Retrieve the code of the ccg that triggered the tooltip
    selected_ccg_code <- x$ccg_code

    # Get selected data
    selected_data <- add_palette()

    # Filter and summarise the selected data according to the retrieved CCG code
    ccg <- selected_data %>%
      filter(ccg_code == selected_ccg_code) %>%
      group_by(
        ccg_code, 
        ccg13nm
      ) %>% 
      summarise(
        items = round(mean(items_perc, na.rm=T), 2),
        avg_cost = round(mean(act_cost_perc, na.rm=T), 2)
      ) 
    ccg <- unique(ccg)

    # Create the HTML code for the tooltip
    tip <- paste0("CCG: ", ccg$ccg13nm, "<br>",
                  "Items (%): ", ccg$items, "<br>",
                  "Avg cost (£): ", ccg$avg_cost, "<br>"
    )

    # Return tooltip
    return(tip)
  }

and assign this function to our visualisation:

## Create ggvis interactive map  
vis <- reactive({

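    # Get selected data
    selected_data <- add_palette()

    # Create the ggvis object and attach the tooltip
    # (the "hover" trigger is an assumption, matching the earlier example)
    selected_data %>%
      arrange(order) %>%
      group_by(ccg_code) %>%
      ggvis() %>%
      layer_paths(x = ~long, y = ~lat, fill := ~col_code) %>%
      add_tooltip(map_tooltip, "hover")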
  })

To make this visualisation appear in the UI, we need to bind it to an output ID:

  ## Bind interactive map to the UI
  vis %>% bind_shiny("map_plot")

This has to be placed in the shinyServer() function, but not inside a reactive component. The reason is that binding a ggvis plot to the UI creates a placeholder and some JavaScript code to generate the plot, and this only needs to happen once. When a widget changes, only the updated data are sent to the UI.
Finally, we need to modify our shinyUI() function to include the map plot:

## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
  titlePanel("Prescribing data London"),
  fluidRow(
    column(width = 3,
           wellPanel(
             sliderInput("year", 
                         "Select year to visualise: ", 2011, 2013, 
                         value = 2011, 
                         animate = F),
             br(),             
             selectInput("var", 
                         label = "Select variable to visualise", 
                         choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"), 
                         selected = "items_perc"),
             br(),
             radioButtons("drug", 
                          label="Select the drug to visualise",
                          choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
                          selected="statins")
           )
    ),
    column(9,
           wellPanel( 
             htmlOutput("map_title", inline=F),
             ggvisOutput("map_plot")
           )
    )
  )
)
)

[Figure: 4_final_presc_map_ldn]
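
For readers who want to try the reactive-plus-bind_shiny() pattern without the prescribing data, here is a minimal, self-contained sketch of a standard Shiny app that uses mtcars. The widget and output names ("size", "demo_plot") are made up for this example, and it uses plain shinyApp() rather than the mini-app platform.

```r
library(shiny)
library(ggvis)

ui <- fluidPage(
  sliderInput("size", "Point size", min = 10, max = 200, value = 50),
  ggvisOutput("demo_plot")
)

server <- function(input, output, session) {
  # The plot is rebuilt whenever the slider changes
  vis <- reactive({
    mtcars %>%
      ggvis(x = ~wt, y = ~mpg) %>%
      layer_points(size := input$size)
  })

  # Bind once, outside any reactive component
  vis %>% bind_shiny("demo_plot")
}

app <- shinyApp(ui, server)
```

Printing app, or calling runApp(app), starts the app in an interactive session.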

After that we can start using our mini-app with the interactive map. The mini-app widgets, in combination with the ggvis map plot, give us many different views of the data and a lot of additional information that cannot be shown in a regular plot. So ggvis can help us add even more interactivity to our visualisations, take our analysis one step further and allow the end user to do part of the analysis themselves.

Further reading