Getting Started with Time Series Analysis in Tableau
This post was developed to accompany a Tableau User Group presentation for Michigan State University on July 7, 2022.
Working with date and time data is one of Tableau’s great strengths. When authoring a viz, you can quickly modify how you would like to treat a particular date/time field and rapidly change the type of analysis possible. Through drag-and-drop actions, you can also implement predictive forecasting to peer into the future. However, the flexibility of time series analysis options in Tableau can sometimes be overwhelming, particularly for a new Tableau developer. This post provides an introduction to working with date/time fields in Tableau and how to get started using Tableau’s drag-and-drop forecasting functionality to predict future values.
Images included in this post are drawn from this workbook, which can be downloaded and used to follow along.
For this post I’ll be working with a subset of data from Dartmouth University’s StudentLife Study. The Dining Hall data contained in 31 text files was unioned together using Tableau Prep Builder.
The resulting file (downloadable here) contains transactional records with a date and timestamp (down to the second) for dining hall visits at one of six dining halls on Dartmouth University’s campus during the spring semester of 2013. In total, the data contains 7,482 transactions across 30 students.
While the dataset is narrow — containing only 4 fields — the inclusion of a date/time field greatly extends the questions that can be asked of the data, as I’ll explore in the post.
Changing Date Levels
Because of Tableau’s flexibility when working with date/time fields, it is easy to get turned around without an orientation to the basic terminology. Therefore, it is important to know a few key terms before diving in.
First, the process of modifying how a date/time field is treated in a view is considered changing its level (see link for more details).
Tableau has three categories of level changes for date/time fields: maximum temporal grain, truncated date/time values, and date/time parts. You can see the way in which these levels are categorized when you expand the details of a date/time field on a view.
Exact Dates: Maximum Temporal Granularity or Smallest Grain of Date/Time Data
With basic nomenclature out of the way, next I’ll detail the three levels using an analysis scenario to provide more clarity about the differences between them.
Let’s say I’m interested in answering the question, “When did students visit dining halls most often during the 2013 semester”? At the deepest level of detail, or maximum temporal grain (the area bordered in green in the previous figure), I view a timeline — moving from left to right — of exactly when student visits to dining halls occurred at a particular point in time:
The resulting bar chart isn’t very impressive. At first glance, it even appears that something may be wrong with the data: there are only 3 points in time where two students visited a dining hall at the same time? How can this be?
But this view is accurate. Our data’s grain goes down to the individual second in time. This means that out of the 7,400+ transactions in our dataset there were exactly 3 times when two students had dining hall visits recorded in the same second. In the remainder, only 1 transaction occurred at a specific second in time, if at all.
When a date/time field’s level is set to “Exact Date” we view the smallest grain of the dataset: exactly when a student visited a dining hall by second. However, is this really what we’re after in our original question? Maybe not.
This example illustrates the challenge language has when it comes to time series analysis questions. We tend to speak in generalities that may have inferred, contextual meaning, i.e., “when” is a very squishy term by itself. Analytics tools require us to be more specific and discerning when focusing on “when”.
Truncated Date Values: Working with Segmented Spans of Time
Often, we aren’t interested in viewing the deepest grain of the data; it is too fine to derive must meaning. The same is true for time series analysis.
Instead, we frequently are interested in aggregating or truncating time series data to a higher level of temporal granularity more meaningful to our intended concept of “when”. But, we have to instruct Tableau on exactly what chunk of time we are interested in seeing by defining its level.
For example, it might be more informative to add up the total transactions that occurred by day for each day in the semester. By changing the field’s level from exact date to truncated value by day (yellow bordered area in the figure above), we see something more along the lines of what might be inferred in the initial question.
In this case, truncating at the day level lets us identify longer-term trends over the semester (trending upwards with a spring break pause) and shorter term weekly cycles (higher weekday than weekend totals).
We can truncate data to more course units, such as by month. At the month temporal grain, we readily see the upwards trend of activity over the course of the semester, rapidly dropping off in June. However, we lose all visibility into the weekly cyclical rhythm in our previous viz. It’s a trade-off and dependent on the question of interest.
What’s impressive is that the three views use same date/time and count fields on the row and column shelves: we’ve simply told Tableau that we want to visualize the information at different level of time, depending upon the intent of our question.
Date Parts: Grouping Discontinuous Dates and Times
Tableau’s flexibility to quickly modify how date/time information is visualized doesn’t stop at modifying the truncated level of the data over a stream of time. We can also analyze parts of time.
For example, the question of when students visited dining hall most often could also be interpreted from a non-linear perspective. We might want to know which days of the week had the highest and lowest transaction totals, e.g. Wednesdays vs. Saturdays.
Rather than grouping transactions by a defined span of linear time, we want group together similar units of time that aren’t necessarily adjacent to one another, e.g., all transactions that happened on a Monday, regardless of when it fell in the semester. In this approach, we are changing the level of the field to a date part.
As with truncated values, we can move up and down in the temporal granularity of the data with date parts. For example, it could also be valuable to know which hours of the day are typically busiest to help plan staffing and food quantity preparation. Rather than day of week, we group by hour of the day for all days in the dataset:
It is clear that noon and 6:00 PM hours were peak activity hours over the course of the semester.
Combining Date Parts and Truncated Values to Generate New Insights
So far, we’ve examined a single date/time field in a view. Additional insights can be uncovered if you combine multiple fields, each with a different level, in a single view. For example, by taking our hours of the day date part and combining it with a truncated by week value, we can understand how peak hours varies over the course of the semester.
This viz suggests that after spring break in Mid-March, the peak dining hours indicated by orange/red color shifted to be an hour earlier for lunch (from noon to 11:00 AM) and an hour later for dinner (from 5:00 PM to 6:00 PM). The trend stays consistent for most of the remaining weeks of the semester. If this trend has occurred before, knowing about it may save a future headache next year.
Predicting the Future: Using Date/Time Forecasting
The previous examples provide descriptive insights to understand past dining patterns at Dartmouth. The limitation is that patterns and trends are backward looking. It might be more helpful if we could account for the factors and trends we know exist in the data (hourly or weekly cycles, for example) and project a forward looking prediction in our view: a forecast prediction.
There are several methods of making predictions using advanced statistical models and toolsets. However, developing these models often requires specialized skills and significant investment of time. If you don’t have these skills, you could certainly learn and integrate them to your dashboards, but this may be unrealistic for many Tableau users.
Alternatively, Tableau provides an efficient, out-of-the-box solution to apply forecasting. While the setup is fast, it requires an understanding of Tableau’s date/time fields for the drag-and-drop forecast option to appear (some treatments of date/time field levels work with forecasting and others won’t). Fortunately, if you’ve read the preceding content on date/time field levels and manipulation in Tableau you are already in good shape.
Forecasting in Tableau: A Simplified Overview of How it Works
Tableau uses Holt-Winters exponential smoothing modeling for its forecasts. Tableau’s documentation states that, “Forecasting algorithms try to find a regular pattern in measures that can be continued into the future….Exponential smoothing models iteratively forecast future values of a regular time series of values from weighted averages of past values of the series.”
Under the hood, Tableau performs a tournament among up to 8 models with varying trends, seasonality, and temporal granularity (for a list of several of the models considered, go here). Ultimately, Tableau displays the results of the model with the highest quality metrics in its forecast visuals.
How to use it?
For the author, adding a sophisticated forecast to a view is a simple drag-and-drop operation, provided that you have the right view prerequisites. This is where users may encounter sticking points.
The ability to use forecasting in a viz is dependent on what date/time fields you have in a view and how you have told Tableau to interpret those fields. Tableau’s documentation states:
“Tableau supports three types of dates, two of which can be used for forecasting:
- Truncated dates [or date values] reference a particular point in history with specific temporal granularity, such as February 2017. They are usually continuous, with a green background in the view. Truncated dates are valid for forecasting.
- Date parts refer to a particular member of a temporal measure such as February. Each date part is represented by a different, usually discrete field (with a blue background). Forecasting requires at least a Year date part. Specifically, it can use any of the following sets of date parts for forecasting:
- Year + quarter
- Year + month
- Year + quarter + month
- Year + week
- Custom: Month/Year, Month/Day/Year
Other date parts, such as Quarter or Quarter + month, are not valid for forecasting. See Convert Fields between Discrete and Continuous for more details about different date types.
- Exact dates refer to a particular point in history with maximum temporal granularity such as February 1, 2012 at 14:23:45.0. Exact dates are invalid for forecasting.
It is also possible to forecast without a date. See Forecasting When No Date is in the View.”
In summary, if you want to add a forecast to a viz but aren’t seeing the “forecast” option in the Analytics Pane it is likely that one of the above requirements is not being satisfied, e.g., you are visualizing the date part of ‘Month’ in a view without including a ‘Year’ field in the view.
Example Forecast in Action
Let’s say our goal is to provide a daily prediction of dining hall transactions to assist with staffing and meal preparation. Perhaps we’re mid-way through the Spring semester and we’d like to predict activity by day starting April 1, 2013 (I’ve simply filtered out the dates after April 1, 2013). Because we are using a truncated date value level in the view, applying a forecast is a simple drag-and-drop procedure (see bullet one above).
Yes, it really is that easy:
Behind the scenes, Tableau compares model variations and selects the result with the highest computed quality. This all happens automatically and is applied to the viz but, if needed, a developer has options to modify how the forecast is displayed, for example the forecasting length, inclusion of confidence intervals, etc.
You can also take manual control and define aspects of the forecast model used, though not all models result in a valid forecast, as the animation shows:
So that’s it – you now know how to easily add a forecast to a view. But, it is unlikely that all situations will involve a situation where a single truncated date value is in the view.
As an example, let’s say instead of truncating by day you start with the month date part. At this point, forecasting is disabled in the analytics pane with little explanation — it’s simply greyed out.
The resolution is described by bullet two in this case: if we are trying to forecast by the month date part we must also include a year date part field somewhere in the view.
As soon as we have this combination, forecasting is enabled for drag-and-drop addition. But, we’re not done and a closer inspection is still needed. Without a bit of reflection on what is shown, a poor decision is likely and you may unfairly place the blame on Tableau.
Clearly, we know at this point that the semester ends in June and the dining hall activity drops off. But, the forecast is indicating strong activity for both June and July. Does this mean forecasting in Tableau unreliable and inaccurate?
In this case, the limited data set does not provide Tableau with a long enough time horizon to detect the seasonal pattern occurring across months; this cycle is particularly critical given the month date part we are viewing. To address this we would need multiple years of dining data.
The moral is that we should never blindly trust the output of a forecast. Just because it can be done doesn’t mean it should be and human reflection is always needed. This leads into the next topic of other methods to quantitatively evaluate whether you should trust the results of a forecast.
Trusting the Numbers
With a few drags and clicks we’ve run our data through a set of forecasting models, fine-tuned how the model is configured, and displayed the results on our viz. But how do you know if the model fits the data well and if you should have much confidence in the forecasted results?
Forecast model statistics are accessed via the “Describe Forecast” sub-menu under Analytics > Forecast. Tableau provides some introductory documentation on interpreting the displayed metrics, but for the casual user, there are a couple of items of note related to forecast model quality.
Once open and displaying the summary tab, the last column indicates the quality of the model. The value displayed is either good, ok, or poor, all in relation to pre-calculated “naïve forecast”.
Tableau’s help documentation notes that a “naïve forecast is defined as a forecast that estimates that the value of the next period will be identical to the value of the current period. Quality is expressed relative to a naïve forecast, such that Ok means the forecast is likely to have less error than a naïve forecast, Good means that the forecast has less than half as much error, and Poor means that the forecast has more error.”
For a deeper quantitative view, the Models tab includes 5 quality metrics to interpret. These include the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Scaled Error (MASE), Mean Absolute Percentage Error (MAPE), and Akaike Information Criterion (AIC). If you are looking for a refresher on what these statistics all mean the handy link at the bottom of the window interface will take you to a Tableau help article.
A Few More Bells and Whistles to Contextualize your Forecast
A slightly hidden set of options allows you to change what forecast result is included in the view. For example, you might want to highlight the quality of the forecast shown in order to guide end viewers about the significance they should place in the model. There are many ways to display the information depending on the need, from adding a metric in as a tool tip to including it as a displayed mark. Here is an example of a dual axis approach:
The green line displays the scaled Mean Absolute Scaled Error of the forecast, 100 *(1 – max(MASE, 0)), as described here. The closer the metric is to 100 the better the forecast. In this case, a quality value of 45 quantifies our “OK” label in the forecast description and doesn’t require multiple clicks to surface the information.
Additional Time Series Forecasting Resources
This post was intended as an introduction or refresher on the flexibility and power available when working with date/time data in Tableau. A simple, 4 field dataset reveals the range of options to analyze and forecast into the future using Tableau’s drag-and-drop interface. Much of the analytical functionality requires an understanding of modifying a date/time field’s level and how it is defined. This is all out-of-the-box, basic Tableau functionality a novice user can readily apply.
However, forecasting and modeling with time series data in Tableau doesn’t end here. For more advanced users, Tableau includes two table calculation functions for use in calculated fields. One in particular, Gaussian process regression, is useful when working with date/time data. More information can be learned here.
And, for expert users in need of custom modeling capabilities, Tableau also has analytics extension functionality. This allows you to script and send data in a view to a prediction service, such as R Server, TabPy, MATLAB, or Einstein Discovery, for example. The scored results can then be surfaced back in Tableau. More information about Tableau’s analytics extensions can be found here.
There is much more to learn in the knowledge domain of time series analysis. Not covered here are the differences between continuous (green pill) and discrete (blue pill) treatment of fields and the plethora of date/time functions available when creating a calculated field. You can learn more about those topics here and here.
Hyndman, R., & Athanasopoulos, G. (2018). Forecasting Principles and Practice. Otexts. Retrieved July 13, 2022, from Otexts: https://otexts.com/fpp2/.
Wang, Rui, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. “StudentLife: Assessing Mental Health, Academic Performance and Behavioral Trends of College Students using Smartphones.” In Proceedings of the ACM Conference on Ubiquitous Computing. 2014.
Helpful Links Referenced in Post:
(Ordered by appearance in above text)
Dartmouth University StudentLife Project Website: https://studentlife.cs.dartmouth.edu/
StudentLife Dining Hall Consolidated Dataset (csv): https://tableau.egnyte.com/dl/U3NgOELW2h/Dartmouth_Student_Life_Dining_Dataset.csv_
Tableau Help: Change Date Levels: https://help.tableau.com/current/pro/desktop/en-us/dates_levels.htm
Tableau Help: How Forecasting Works in Tableau: https://help.tableau.com/current/pro/desktop/en-us/forecast_how_it_works.htm
Forecasting Principles and Practice: A Taxonomy of Exponential Smoothing Models: https://otexts.com/fpp2/taxonomy.html.
Tableau Help: Forecasting with Time: https://help.tableau.com/current/pro/desktop/en-us/forecast_how_it_works.htm#forecasting-with-time
Tableau Help: Convert Fields between Discrete and Continuous: https://help.tableau.com/current/pro/desktop/en-us/datafields_typesandroles_convertdisctocont.htm
Tableau Help: Forecasting when No Date is the View: https://help.tableau.com/current/pro/desktop/en-us/forecast_no_date.htm
Tableau Help: Forecast Summary Tables: https://help.tableau.com/current/pro/desktop/en-us/forecast_describe.htm#forecast-summary-tables
Tableau Help: Quality Metrics: https://help.tableau.com/current/pro/desktop/en-us/forecast_describe.htm#quality-metrics
Tableau Help: Forecast Field Results: https://help.tableau.com/current/pro/desktop/en-us/forecast_field_results.htm
Tableau Help: Predictive Modeling Functions in Time Series Visualizations: https://help.tableau.com/current/pro/desktop/en-us/predictions_future.htm
Tableau Help: Pass Expressions with Analytics Extensions: https://help.tableau.com/current/pro/desktop/en-us/r_connection_manage.htm
Tableau Help: Dimensions and Measures, Blue and Green: https://help.tableau.com/current/pro/desktop/en-us/datafields_typesandroles.htm#blue-versus-green-fields
Tableau Help: Date Functions in Tableau: https://help.tableau.com/current/pro/desktop/en-us/functions_functions_date.htm#date-functions-available-in-tableau