February 26, 2014 | Alan
Our data science team have outlined some tips and tricks for using the AnalytiXagility platform, enjoy!
Use shortcuts CTRL “+” and CTRL “-“ to zoom in and out of the screen, respectively.
Use F11 to toggle between AnalytiXagility full screen/normal screen.
AnalytiXagility is web based and you should use that to your advantage. Open multiple tabs in different areas of AnalytiXagility to quickly compare data, scripts and work done in the SQL editor or R console.
- Each workspace contains three tabs – Summary, Data, Workfiles.
- Summary – High level audit trail of work done in workspace.
- Data – Contains datasets available in a workspace. Clicking on a dataset will allow you to view your data, perform some of the built in visualizations and use the point-and-click SQL commands. The point-and-click SQL commands include SELECT, WHERE, JOIN, CREATE TABLE, and CREATE VIEW. Access the R Console from here.
- Workfiles – Contains all files uploaded and created in AnalytiXagility. Upload and create files here. Access the SQL editor by opening/ creating a SQL script – this will take you to the SQL editor. You can also access the R Console from within this Workfiles tab.
Built in visualizations
Open your dataset from within the Data tab and use the built in visualizations to explore your data. The most common visualizations are available with the Visualise button you might need to transform your data for some of them to work properly. If you want to transform or analyse data there are two main paths: SQL and R.
Point and click SQL analysis
Open your dataset from within the Data tab and use the point-and-click SQL commands to explore your data. Derive a custom view allows you to interactively SELECT, filter (apply WHERE command), and add a JOIN between two datasets with the click of a button. You also have the ability to preview the SQL code behind your mouse clicks. Click on Create database view to copy your custom view over to the GreenPlum database – you will need to do this to access your view from the R console.
Where to find SQL/R editor and how to use it.
There are two main places that editors can be found, the Workfiles and the R console. For running SQL queries, Workfiles is the place to start and for running R analysis scripts it’s the R console. You can get to workfiles from the Workfiles tab; there you can either create a new SQL file, or continue working on an existing one. SQL files are automatically executed against the schema of your workspace so you don’t need to worry about that. In the R console you will have access to all the existing R scripts in the workspace, as well as the ability to create new ones from the right-hand panel.
Keep in mind that although you can view and edit files in both locations: You cannot execute SQL scripts in the R console. You cannot execute R scripts in the Workfiles Tab.
You may still edit and preview their content anywhere, but it is recommended to go to the R console for R analysis, and to the workfiles/SQL scripts for SQL queries.
Executing code while in the Editor (SQL/R)
The SQL editor currently gives you the option to execute an entire script or a selection of commands straight from the editor. Whilst all the commands will be executed, only the results from the final SELECT command will be displayed on the screen. If your SQL command isn’t returning results to the screen eg UPDATE, then the query will execute but there is no confirmation when this has been done – you no longer see the Executing text so you know its complete.
Unfortunately this is not the case for the R editor, simply because there is no guarantee that there will be an active R session at the time. When in the R console you can either execute the entire script straight from the list (hover over the file name and a Run in console button will appear ) – we should add in a screenshot of this – or copy and paste from the script tab to the Console tab.
Useful R functions
ls() into the R Console will print to screen a list of the built in xap commands. Typing
xap.help(nameofxapcommand) will print supporting documentation to screen.
Write a dataframe to the database: xap.db.writeframe()
Write a dataframe to a workfile: xap.save_workfile()
(This function will only be available on approval, by default it will NOT be available to use)
library(xapdiabetes) #write to xml then save to a Workfile write.xml(sample_data, file="sample_data.xml") xap.save_workfile("sample_data.xml") #write to csv then save to a Workfile write.csv(sample_data, file = "sample_data.csv") xap.save_workfile("sample_data.csv")
Saving a dataset and exporting & loading onto another XAP instance (requires xap.save_workfile())
library(xapdiabetes) #write to RData then save to a Workfile save(sample_data, file = "sample_data.RData") xap.save_workfile("sample_data.RData") Download this Workfile (using the Download button) Upload this Workfile to a different XAP workspace xap.workfile("sample_data.RData") load("sample_data.RData") # loads up sample_data # this will display the contents of the previously saved sample_data data.frame
Plotting a scatter plot in XAP. Using the base plot function:
library(ggplot2) plot(x=mtcars$wt, y=mtcars$mpg, main="Scatterplot Example", xlab="Car Weight ", ylab="Miles Per Gallon ")
Produce the same plot using
ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + xlab("Car Weight") + ylab("Miles Per Gallon") + ggtitle("Scatterplot example")
Plotting a line plot in XAP
library(ggplot2) ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_line() + xlab("Car Weight") + ylab("Miles Per Gallon") + ggtitle("Line plot example")
Plotting a bar chart in XAP
library(ggplot2) ggplot(data=mtcars, aes(x=wt)) + geom_histogram() + xlab("Car Weight") + ylab("Count") + ggtitle("Histogram plot example") library(ggplot2) ggplot(data=mtcars, aes(x=mpg)) + geom_histogram() + xlab("Miles Per Gallon") + ylab("Count") + ggtitle("Histogram plot example")