Getting Started with the R Console in the AnalytiXagility Platform

February 26, 2014 | Alan

This blog post introduces the basic AnalytiXagility platform commands and shows how to use the R console within the platform.

Data Source

The data used in this example is the simd data, which you can find in the aridhia_opendata workspace in the AnalytiXagility platform. The Scottish Index of Multiple Deprivation (SIMD) identifies small-area concentrations of multiple deprivation across Scotland in a consistent way. The simd data used for this tutorial can be found here. Note, however, that this dataset has been modified from the original source, which is available from the Scottish Government website. You can find more information about the available open data sets, and how to request access to them, here.

Learning Outcomes

This tutorial explains how to do the following in the AnalytiXagility platform:

  • Open the R console.
  • Read in a table from a database.
  • Use the built-in AnalytiXagility functions within the R console.
  • Create a plot.
  • Write an R script.
  • Write an RNW script.

Workflow

This blog assumes that you have logged into your system and have navigated to the Workspaces page. If you are unsure how to do this, see Day 1 in the AnalytiXagility Platform.

Step 1 – Open the R console within the AnalytiXagility platform:

Go to the R Tools tab at the top right corner of the aridhia_opendata workspace:

Screenshot1

Select the Start Console tab – this opens up the R console within the AnalytiXagility platform:

Screenshot1.1

Display a list of the built-in functions available in the platform by typing ls() into the console:

Screenshot2

Use xap.help() to retrieve information about a specific function. The code chunk below demonstrates applying xap.help() to xap.list_tables():

 xap.help(xap.list_tables)
[1] "The xap.list_tables function takes no args and returns a list of table or database view names"

You can see a record of all the commands generated in the course of the session in the R console by selecting the History tab:

Screenshot2.2_V2

Step 2 – Reading in data

Use the xap.list_tables() function to see a list of tables and database views in the workspace. The code chunk below demonstrates how to use this function:

xap.list_tables()
[1] "simd" "dataset1" "dataset2" "dataset3"

Use the xap.table_exists() function to check for existing table or database view names; it returns TRUE if the table or view exists in the AnalytiXagility platform workspace:

xap.table_exists("simd")
[1] TRUE

The simd data is read into the R console from the database using the xap.read_table() function, shown below. This function takes the name of a table or database view and returns its contents as a data frame:

data <- xap.read_table("simd")

The xap.list_fields() function takes a table or database view name and returns a list of fields:

xap.list_fields("simd")
[1] "id" "izname" "simd2009" "simd2012"
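
Once a table has been read into a data frame, the usual base R inspection functions apply to it. The sketch below uses a small made-up data frame as a stand-in for the result of xap.read_table("simd"). The column names follow the field list above, but the values are invented for illustration only:

```r
# Stand-in for xap.read_table("simd"): the values below are invented
# purely for illustration and are not real SIMD figures.
simd_demo <- data.frame(
  id       = 1:5,
  izname   = c("Zone A", "Zone B", "Zone C", "Zone D", "Zone E"),
  simd2009 = c(12.4, 30.1, 8.7, 45.0, 22.3),
  simd2012 = c(11.9, 32.5, 9.1, 41.2, 22.3),
  stringsAsFactors = FALSE
)

names(simd_demo)   # field names of the data frame
dim(simd_demo)     # number of rows and columns
str(simd_demo)     # type of each column
summary(simd_demo[, c("simd2009", "simd2012")])  # basic numeric summaries
```

On the platform, the same calls should work on the data frame returned by xap.read_table(), since it is an ordinary R data frame.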

Use the xap.sample_data() function to return a sample of a table or database view. The main input to this function is a table or database view name; optionally, you can also specify a size (N), which must be a positive integer. The first N records of each field are returned in the form of a data frame. The default size is 10:

xap.sample_data("simd", 10)
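
The "first N records" behaviour described above resembles what base R's head() does on a data frame. The following is a minimal sketch of that idea, not the platform's implementation. sample_rows is a hypothetical helper and the data frame is invented:

```r
# Hypothetical helper mimicking the described behaviour: return the first
# `size` rows of a data frame, defaulting to 10.
sample_rows <- function(df, size = 10) {
  stopifnot(is.data.frame(df), is.numeric(size), length(size) == 1, size > 0)
  head(df, n = size)
}

# Invented stand-in data: 25 rows with arbitrary values.
demo <- data.frame(id = 1:25, score = seq(1, 49, by = 2))

nrow(sample_rows(demo))      # uses the default size of 10
nrow(sample_rows(demo, 3))   # first 3 rows only
```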

Step 3 – Creating a plot

Use the data frame from Step 2 to create a plot:

plot(data$simd2012 ~ data$simd2009, main = "Scatterplot of simd2009 versus simd2012")

You can view the generated plot by selecting the Plots tab, as shown in the screenshot:

Screenshot4

For more information on generating basic plots, see Basic Charts.

The xap.last_plot_as_image() function saves the last plot generated as a PNG or PDF in the working directory and returns the file name:

xap.last_plot_as_image()
[1] "Rplot006.pdf"

The xap.last_plot_as_image_named() function saves the last plot generated as a PNG or PDF in the working directory and returns the file name. You must supply a file name for the image:

xap.last_plot_as_image_named("simd-scatterplot")
[1] "cachedImage_simd-scatterplot.pdf"
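
For comparison, standard R outside the platform saves a plot by opening a graphics device, drawing the plot, and then closing the device. The sketch below uses base R only and does not show how the xap functions are implemented. The data values and the output location are invented for illustration:

```r
# Invented stand-in values; not real SIMD data.
simd2009 <- c(12.4, 30.1, 8.7, 45.0, 22.3)
simd2012 <- c(11.9, 32.5, 9.1, 41.2, 22.3)

# Write the plot to a PDF file in a temporary directory.
out_file <- file.path(tempdir(), "simd-scatterplot.pdf")
pdf(out_file)                        # open a PDF graphics device
plot(simd2012 ~ simd2009,
     main = "Scatterplot of simd2009 versus simd2012")
invisible(dev.off())                 # close the device so the file is written

file.exists(out_file)
```

Replacing pdf() with png() would produce a PNG image instead.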

Step 4 – Generating an R script

The commands used to read the data and create the plot can also be executed from a script, which must be stored in a file with the extension “.r”. To write an R script:

  1. Go to the Workfiles tab. Click on the New tab and select .r from the drop-down menu.
  2. Copy the required command lines from the console (see code chunk below) and paste them into the script.
  3. Click on the Save tab to save the current version, or the Save As tab to save it as a new version.

data <- xap.read_table("simd")
plot(data$simd2012 ~ data$simd2009, main = "Scatterplot of simd2009 versus simd2012")

Note that the R script must contain some text before it can be saved. For the purposes of this blog, the script has been named blog_script.r.

To execute the R script, return to the console. Click on the Workfiles tab and filter by R Scripts; the newly created script is listed here:

Screenshot6

To execute the required script, hover over the .r on the far right-hand side until the arrow appears, as shown in the screenshot. Click on the arrow. The following command will appear in the console:

xap.source("blog_script.r")
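
xap.source() appears to play the role that source() plays in standard R, which runs every line of a script file in the current session. The following is a minimal base R sketch of that idea, assuming an invented two-line script written to a temporary file:

```r
# Write a small script to a temporary file (contents invented for illustration).
script_path <- file.path(tempdir(), "demo_script.r")
writeLines(c(
  "demo_values <- c(3, 1, 4, 1, 5)",
  "demo_total <- sum(demo_values)"
), script_path)

# Run the script; objects it creates become available in the session.
source(script_path)
demo_total
```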

Step 5 – Generating a LaTeX PDF report

To generate a LaTeX PDF report, first create an RNW script. This combines two components:

  • the LaTeX code to write a document
  • the R code and output that is to be included in the document.

To create an RNW script:

  1. Go to the Workfiles tab. Click on the New tab and select .rnw from the drop-down menu.
  2. Write the RNW script. Example code is shown in the code chunk below.
  3. Click on the Save tab to save the current version, or the Save As tab to save it as a new version.

For the purposes of this blog, the script has been named blog_pdf.rnw.

\documentclass[a4paper]{article}
\begin{document}

<<plot1, comment="", echo=FALSE >>=
data <- xap.read_table("simd")
plot(data$simd2012 ~ data$simd2009, main="Scatterplot of simd2009 versus simd2012")
@

\end{document}

To execute the RNW script, return to the console. Note that the R package knitr is required to run an RNW script. Click on the Workfiles tab and filter by Report Scripts; the newly created script is listed here:

Screenshot_rnw_list

To run the script, hover over the .rnw on the far right-hand side until an arrow appears, as shown in the screenshot. Click on the arrow. The following commands appear in the console:

library(knitr)
xap.knit("blog_pdf.rnw")

The xap.knit() function generates a PDF report from an RNW file. Press Return to run the RNW script and generate the PDF report. The newly-generated report appears under the Reports tab, as shown in the screenshot:

Screenshot_reports_tab

To view the new PDF report, click on Rreport000.pdf:

Screenshot9
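
Outside the platform, the first half of this process can be reproduced with knitr's own knit() function, which turns an RNW file into a .tex file. Compiling that file to PDF needs a LaTeX installation and is not shown. The sketch below is an illustration under those assumptions, not a description of what xap.knit() does internally. The file names and the numbers in the chunk are invented, and the code only runs if knitr is installed:

```r
# Sketch only: runs if the knitr package is installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Write a tiny RNW document to a temporary file.
  rnw_path <- file.path(tempdir(), "demo_report.rnw")
  writeLines(c(
    "\\documentclass[a4paper]{article}",
    "\\begin{document}",
    "<<summary1, comment=\"\", echo=FALSE>>=",
    "summary(c(12.4, 30.1, 8.7, 45.0, 22.3))",
    "@",
    "\\end{document}"
  ), rnw_path)

  # knit() evaluates the R chunk and writes a .tex file with the output
  # embedded; it returns the path of the file it wrote.
  tex_path <- knitr::knit(rnw_path,
                          output = file.path(tempdir(), "demo_report.tex"),
                          quiet = TRUE)
  file.exists(tex_path)
}
```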

Step 6 – Other built-in functions

  1. The xap.image() function makes an image workfile (for example, a PNG file) available in the working directory for use in reports:
    xap.image("aridhia.png")
  2. xap.conn() is a database connection object for the AnalytiXagility platform workspace. It is used in the same way as a connection created using dbConnect(). See the documentation for the R package RPostgreSQL for more information.
  3. xap.db.sandbox() is a variable defining the platform workspace database schema name.
  4. xap.debug_wd() is for troubleshooting only. It lists the files in the working directory.
  5. xap.db.user() is a variable defining the role used to connect to the platform workspace data.

What’s Next?

This blog has covered the basics of the built-in commands within the platform. Other posts will look in more detail at:

  • The different types of data in R.
  • Useful functions for data management.
  • Creating plots in R.
  • Using LaTeX to write a report of your analysis.

 
