Using Tableau Table Extensions for Advanced Analytics

Every state department of education and most school districts have analysts of one kind or another. These may include data analysts tracking attendance and enrollment trends, monitoring test performance for district and school rankings, or psychometricians analyzing trends at the item level. Oftentimes these analysts rely on statistical models written in languages such as R or Python to provide results and these results must be translated to individuals in leadership positions who need to know the results but do not have a background in statistics. This provides a significant challenge for the analyst, they must operationalize their models in a database and then translate the results for consumption – this may be done in Tableau for easy consumption of the information but the disconnect requires loading scripts into Tableau and then running do loops that are then refreshed at regular intervals.
The challenge here is that the model and the visualization are sitting on two separate systems, Therefore, when there are questions about the model or something needs to be changed, the analyst must work out of two different systems and reload the scripts. This also does not speak to the ability to interpret the results and make complex information less complex. Fortunately, Tableau has a solution for this in the 2022.3 release – Table Extensions.
Table Extensions allow the analyst to write their models in R or Python directly in Tableau. Below I detail how to implement Table Extensions in Tableau and provide responses to some common questions. First, we must ensure that Tableau is connected to the analytics extension. To do this, go to ‘Help > Settings and Performance > Manage Analytics Extensions.’ The Hostname is ‘locahost’ and the the ports are:
- Port 6311 is the default port for plaintext RServe servers.
- Port 4912 is the default port for SSL-encrypted RServe servers.
- Port 9004 is the default port for TabPy.
We are now ready to work with the Table Extensions. We begin by loading the data in Tableau as we normally would and then drag the Table Extension as a relationship with our dataset. We then drag the dataset we wish to analyze to the Table Extension.
We then specify the dataset in the Table Extension work pane using ‘_arg1’ for Python or ‘.arg1’ for R. Note, we must specify this as a dataframe and this can be done in several ways. Using R as an example, this could be:
data <- data.frame(.arg1[1:length(.arg1)]) to pull in the entire dataframe, or;
df <- function(df) { data.frame(variable_1 = as.string(column_a), variable_2 = as.factor(column_b)… ) } if we want to bring in specific variables for analysis. We would then specify the function for the .arg1.
data <- df(.arg1)
We can now build our models in the Table Extension. In the example below I am implementing a structural topic model to assess word usage on open ended survey responses, though we can implement any model we would run in R or Python, i.e., regression models, classification algorithms, Rasch models, etc.
Note that the outputs from the model are now included in the dataset and the Table Extension is now a new relationship. Note also that we connect the relationship as usual so it is helpful to have the data pre-indexed before loading. We can use these outputs the show the results of the model in an easily understandable fashion.
Now let’s look at an example using an IRT model assessing the Verbal Aggression dataset: A data frame with 316 participants and 27 variables measuring how individuals use aggressive language to control conversations. This is a classic dataset for teaching item response theory for psychometricians.
We first specify the transformed data to be used in the model using the eirm library in R:
data1 <- polyreformat(data=VerbAgg, id.var = “id”, var.name = “item”, val.name = “resp”)
We then specify the model and save the random effects for the respondent ID to be analyzed in Tableau:
mod1 <- eirm(formula = “polyresponse ~ -1 + situ + btype + mode + (1|id)”, data = data1)
theta <- ranef(mod1$model)$id
data <- cbind(data, theta)
We can now join this with the original data for analysis in Tableau. From here we can determine how individuals respond to questions of whether they have used aggressive language, want to use aggressive language, and how they do, or would like to, go about doing so.
Note: The code used in this example comes from Okan Bulut (2020)