TIME SERIES PREDICTION

Time series forecasting is a specific branch of data science. It is different enough and interesting enough to deserve a page of its own.

The Basics

There are many practical applications, such as forecasting restaurant stock levels, demand for hospital beds and road traffic volumes. Compared to more mainstream regression problems, the nature of a time series problem introduces some unique constraints - 

  • Time series prediction is fundamentally autoregression: we use earlier values of the target variable itself to predict the latest entry

  • By default, autoregressive solutions tend to focus only on the autocorrelation in the target variable. But in many real-world applications there are other features that could improve accuracy. For example, predicting tomorrow's road traffic volumes could benefit significantly from knowing what the weather will be like. In historical data, we can correlate traffic patterns with corresponding rainfall, temperature and wind speed data, to see whether more commuters prefer to drive when it is raining. But introducing these other features comes with a few challenges - 

    • How to auto-regress on both the target variable and the other predictive features at the same time?

    • Choosing only predictive features that we can expect to have reasonable future predictions for. Weather forecasts could be used to improve predictive accuracy. Of course, road accidents will affect traffic volumes severely, and we can calculate the significance of their historical impact. But if we have no reliable way to predict future accidents then there is no use in building them into the model. Even predictions that use weather forecast data will depend on the accuracy of the weather forecast.

    • Formatting of the training data. It is simple enough to do auto-regression on a single target variable. But introducing more features requires using a history of these features in model training. This significantly increases the dimensionality of the training data.

  • Standard models are usually evaluated by splitting the raw data into a training set and a testing set. But with time series prediction the future data is not available yet, so the standard way to measure model accuracy is also different. We usually train the model on a rolling window of historical data, and then measure its accuracy on more recent, but still historical, data. This is called rolling window validation.
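This evaluation scheme is easy to sketch in plain Python. The helper below is a hypothetical illustration (the function name and the naive "persistence" forecaster are invented for the example), assuming only numpy:

```python
import numpy as np

def rolling_window_validation(series, window, horizon, fit_predict):
    """Score a forecaster over successive historical windows.

    fit_predict(train) must return a forecast of length `horizon`.
    Returns the mean absolute error across all windows.
    """
    errors = []
    for start in range(0, len(series) - window - horizon + 1):
        train = series[start : start + window]
        actual = series[start + window : start + window + horizon]
        forecast = fit_predict(train)
        errors.append(np.mean(np.abs(np.asarray(forecast) - actual)))
    return float(np.mean(errors))

# Toy example: a "persistence" forecaster that repeats the last observed value.
series = np.sin(np.linspace(0, 20, 200)) + np.linspace(0, 2, 200)
mae = rolling_window_validation(
    series, window=50, horizon=5,
    fit_predict=lambda train: np.repeat(train[-1], 5),
)
```

Any real model can be dropped in via `fit_predict`; the rolling loop itself stays the same.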


Packages

There are several statistical and AI packages to deal with time series autoregression. In the age of AI there is a temptation to promote the machine learning ones, although purely statistical ones might be just as effective depending on the circumstances - 

Separating trend, seasonal effects and residuals

  • seasonal_decompose, part of the statsmodels library in Python, offers a convenient interface and the option to handle components as additive or multiplicative

Statistical packages that cater only for the time series itself

  • SARIMA (Seasonal AutoRegressive Integrated Moving Average) models separate out the seasonal component, de-trend the series by differencing, and model the residual as a linear combination of its own previous values

  • Prophet, originally developed by Facebook, offers a hugely flexible, intuitive modelling interface for getting fairly reliable predictions with minimal effort. Prophet can incorporate predictive features other than the target variable as additional regressors, but it supports only linear relationships, ruling out effective modelling of more complex interactions.

Statistical packages that handle other features out of the box

  • Vector Autoregression (VAR), also from statsmodels in Python, is a multivariate statistical model used to capture the linear interdependencies among multiple time series. The advantage is that it handles the multi-dimensional nature of the regression problem under the hood and avoids complex re-formatting of the training data, but like Prophet, it is limited to considering linear interactions between lagged values of the target variable and other features.

Machine Learning out of the box

  • A Sequential model from Keras, part of TensorFlow, can capture the impact of previous values using Long Short-Term Memory (LSTM) layers or Gated Recurrent Units (GRUs)

  • TimeGPT from Nixtla is a generative pretrained transformer approach to forecasting, trained on vast volumes of time series data

Framing autoregression as normal regression

It turns out that autoregression data can be re-formatted to suit some of the powerful mainstream regression algorithms, such as the boosting family (XGBoost, LightGBM and CatBoost), and so benefit from their flexibility in handling non-linear interactions with multiple features. One way is to create lagged features: every row of data has columns for the current target variable and the current features, as well as columns for all the historical target and feature values going back n steps, with the modeller selecting the appropriate value of n.
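One hedged sketch of this lagged-feature re-formatting, using pandas (the helper name and toy data are invented for illustration):

```python
import numpy as np
import pandas as pd

def make_lagged_features(df, target, n_lags):
    """Turn a time series frame into a supervised-learning table.

    Each row gains `n_lags` shifted copies of every column, so a
    mainstream regressor can learn from the recent history directly.
    """
    frames = [df]
    for lag in range(1, n_lags + 1):
        frames.append(df.shift(lag).add_suffix(f"_lag{lag}"))
    out = pd.concat(frames, axis=1).dropna()
    X = out.drop(columns=[target])
    y = out[target]
    return X, y

# Hypothetical traffic data with one extra feature (rainfall).
df = pd.DataFrame({
    "traffic": np.arange(100, 110, dtype=float),
    "rain": np.linspace(0, 5, 10),
})
X, y = make_lagged_features(df, target="traffic", n_lags=3)
```

The resulting X and y can be fed straight into XGBoost, LightGBM or CatBoost like any ordinary regression dataset; note that the first n_lags rows are dropped because they have no complete history.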

It all depends on the lens! 

In summary, the biggest challenges in tackling time series regression are -

  1. Choosing the right lens (auto-regression, mainstream regression, which predictive features), and 

  2. Choosing how to separate trends and seasonality

What is time series modelling less good at?

In spite of the hype, time series modelling doesn't handle emerging trends well. For example, when analysing customer search patterns for Nachos recipes it would be useful to capture data for sub-trends such as 'Summer 2024 craze for Nachos with dill' (who knows if this will become a thing!). These trends are patterns in human behaviour. They are neither seasonal, nor is it possible to know in advance how many sub-components (sub-trends) there will be or how long each will persist.

 

Do not use traditional time series models in applications like these. 


Of course, many once-unsolved problems eventually get solved. It is quite possible that an analytical package will soon handle emerging-trend phenomena better.

Application: Prophet and Visual Crossing

The app below is trained on historical traffic volumes for a busy road in the Western Cape region of South Africa. The time series autoregression was fitted using Prophet from Facebook.


For comparison, three more mainstream, ensemble-based models were also used:

  • XGBoost

  • LightGBM from Microsoft and

  • CatBoost from Yandex

To perform a basic comparison, all of the models were fitted with out-of-the-box hyperparameters.


To improve the accuracy of the forecast, historical and predicted weather data is sourced from the Visual Crossing Weather API. Accuracy was calculated using rolling window validation.


Since direct autocorrelation and lagged-feature prediction are so different, it might be reasonable to expect a clear winner in predictive accuracy. In particular, we might expect Prophet and the ensemble models to perform differently. But as the chart below shows, at first glance the predictions were all reasonable, with accuracies better than 85%, and it is hard to pick a clear winner among them.

The model that follows is based on Prophet but any of the others could have been used.


[Image: validation_window_accuracy.png]

This app automatically sources weather predictions from the Visual Crossing API. It also allows for traffic differences on Normal Days, Weekends and Special Days (public holidays and other non-standard days such as school holidays).

 

Although special days and weekends normally have lower volumes, the drop compared to normal days is less pronounced for this particular traffic location which tends to be busy most of the time.


To get a new prediction, select which, if any, of the next seven days are Special Days and click Refresh.