Generating Autoregressive data for experiments

Jessica Yung | Data Science, Machine Learning

In this post, we will go through how to generate autoregressive data in Python, which is useful for debugging models for sequential prediction like recurrent neural networks.

When you’re building a machine learning model, it’s often helpful to check that it works on simple problems before moving on to complicated ones. I’ve found this is especially useful for debugging neural networks.

One example of a simple problem is fitting autoregressive data. That is, data of the form

x_t = a_0 + a_1x_{t-1} + a_2x_{t-2} + ... + a_nx_{t-n} + \epsilon_t,

where \epsilon_t is a noise term and the target y_t is simply x_t.

The above example is called an AR(n) process. Basically, each datapoint depends only on the previous n datapoints and the distribution of noise \epsilon_t.

The process, then, is defined by
1. The coefficients a_0, ..., a_n,
2. The initial values of x, x_1, ..., x_n, and
3. The distribution of the noise \epsilon_t.

Generating realistic-looking time series data: Stable and unstable poles


The catch is that for your data to look reasonable, these parameters have to satisfy certain conditions. The most important one is that the AR process has to be stable. That is, the poles (the roots of the equation z^n - a_1z^{n-1} - a_2z^{n-2} - ... - a_n = 0) all have to have magnitude less than one. If not, something like this might happen:



[Figure: So far so good, but…]

[Figure: …after fifty datapoints we’re well out of range! AR(4) process, poles are [-0.3 ± 1.7j, -0.4 ± 0.4j]. Noise variance = 1.]


We won’t go into detail as to why the poles have to have magnitude less than one here, but you can think of it like this: if you repeatedly multiply numbers with magnitude greater than one together, the magnitude of your end result will keep increasing. (If you want to learn more about poles, you can check out Brian Douglas’ videos on control theory.)
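The stability condition can be checked numerically before generating any data. The following is a minimal sketch using NumPy; the helper names `ar_poles`, `is_stable` and `coeffs_from_poles` are made up for this illustration, and the coefficients are assumed to be given as the list [a_1, ..., a_n] (the constant a_0 does not affect stability).

```python
import numpy as np

def ar_poles(lag_coeffs):
    """Return the poles of an AR(n) process with lag coefficients [a_1, ..., a_n]."""
    # Characteristic polynomial: z^n - a_1 z^(n-1) - ... - a_n
    poly = np.concatenate(([1.0], -np.asarray(lag_coeffs, dtype=float)))
    return np.roots(poly)

def is_stable(lag_coeffs):
    """True if every pole lies strictly inside the unit circle."""
    return bool(np.all(np.abs(ar_poles(lag_coeffs)) < 1.0))

def coeffs_from_poles(poles):
    """Build lag coefficients [a_1, ..., a_n] from a chosen set of poles."""
    # np.poly returns [1, c_1, ..., c_n] for prod(z - p_i), so a_k = -c_k.
    # Complex poles should come in conjugate pairs so the result is real.
    return -np.real(np.poly(poles)[1:])

# The unstable AR(4) example from the plots above
unstable = coeffs_from_poles([-0.3 + 1.7j, -0.3 - 1.7j, -0.4 + 0.4j, -0.4 - 0.4j])
print(is_stable(unstable))      # one pole pair has magnitude above one
print(is_stable([0.5, -0.2]))   # both poles have magnitude below one
```

Going from poles to coefficients is often the easier direction in practice: pick poles inside the unit circle first, then derive the coefficients from them.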

But if you have no noise and your poles have magnitude less than one, the data will converge to zero. Similar to the previous case, if you repeatedly multiply numbers with magnitude less than one together, your end result would go to zero.
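The simplest case makes the repeated-multiplication picture concrete. Assuming an AR(1) process with no noise and a_0 = 0, the single pole is a_1 itself, and unrolling the recursion from a starting value x_0 gives:

```latex
x_t = a_1 x_{t-1}
\;\Longrightarrow\;
x_t = a_1^{\,t}\, x_0,
\qquad
|x_t| = |a_1|^{t}\,|x_0| \to
\begin{cases}
0 & \text{if } |a_1| < 1,\\
\infty & \text{if } |a_1| > 1 \text{ and } x_0 \neq 0.
\end{cases}
```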



[Figure: AR(5) data with no noise, poles with magnitude less than one.]


This doesn’t look like typical time-series data (e.g. stock prices) at all. Fortunately, adding noise solves this problem. Think of it as bumping up the magnitude of your product each time before multiplying it by a new number with magnitude less than one:



[Figure: AR(5) data with Gaussian noise N(0,1), poles with magnitude less than one.]

The code to generate these plots is available here.

A tip to make your tests fairer


Here’s a tip: it really helps to generate 3n more datapoints than you need and then discard the first 3n. This is because the first n datapoints are likely to be distributed very differently from the others, since they were initialised randomly rather than produced by the AR process. These effects will continue till at least the 2n-th datapoint. So if you want to assess your model fairly and prevent it from trying to disproportionately fit to the first n datapoints, take out the first 3n datapoints.

Code to generate AR(n) data


Now, all we have to do is implement this in code. Fortunately for you, I’ve already done it! Here are the key portions of the code. The full code is on GitHub.
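As a rough illustration of what such a generator can look like, here is a minimal sketch written for this explanation. It is not the code from the GitHub repository. The function name `generate_ar_data` and all of its parameters are hypothetical. It assumes Gaussian noise, standard-normal initial values, and the 3n burn-in from the tip above.

```python
import numpy as np

def generate_ar_data(lag_coeffs, num_points, intercept=0.0, noise_std=1.0,
                     burn_in=None, seed=None):
    """Simulate x_t = a_0 + a_1 x_{t-1} + ... + a_n x_{t-n} + eps_t.

    lag_coeffs is [a_1, ..., a_n] and intercept is a_0.
    """
    a = np.asarray(lag_coeffs, dtype=float)
    n = len(a)
    if burn_in is None:
        burn_in = 3 * n                      # discard the first 3n datapoints
    total = burn_in + num_points
    if total < n:
        raise ValueError("need at least n datapoints in total")

    rng = np.random.default_rng(seed)
    x = np.zeros(total)
    x[:n] = rng.normal(size=n)               # random initial values x_1..x_n
    noise = rng.normal(scale=noise_std, size=total)
    for t in range(n, total):
        # x[t-n:t][::-1] is [x_{t-1}, ..., x_{t-n}], matching [a_1, ..., a_n]
        x[t] = intercept + a @ x[t - n:t][::-1] + noise[t]
    return x[burn_in:]

# Example: 500 points from a stable AR(2) process (poles have magnitude < 1)
series = generate_ar_data([0.5, -0.2], num_points=500, seed=0)
print(series.shape)
```

Passing `noise_std=0.0` reproduces the noise-free case, where a stable process decays towards zero.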

Using this to debug models


How does this help debug models? Firstly, the data is simple, so it doesn’t take long to train, and if the model can’t learn AR(5), it likely won’t be able to learn more complicated patterns. I’ve found this particularly useful for debugging recurrent neural networks.

Secondly, since you know the distribution of the data, you can compare model performance to the Bayes error. The Bayes error is the expected error that an oracle which knew the true distribution would make. In this case, it is the error you get by predicting with the true AR coefficients and setting the noise \epsilon_t to zero. This can serve as a baseline to compare your model performance to.
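Under squared-error loss, the oracle's prediction error at each step is exactly the noise term \epsilon_t, so its mean squared error should come out close to the noise variance. A small sketch of that baseline follows, assuming a hypothetical stable AR(2) process with a_0 = 0 and Gaussian noise; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
lag_coeffs = np.array([0.5, -0.2])   # hypothetical stable AR(2): [a_1, a_2]
noise_std = 1.0
n, num_points = len(lag_coeffs), 20000

# Simulate the AR process with a_0 = 0
x = np.zeros(num_points)
x[:n] = rng.normal(size=n)
for t in range(n, num_points):
    x[t] = lag_coeffs @ x[t - n:t][::-1] + rng.normal(scale=noise_std)

# Column k holds x_{t-k} for every target x_t with t >= n
lagged = np.column_stack([x[n - k:num_points - k] for k in range(1, n + 1)])
targets = x[n:]

# Oracle: predict with the true coefficients and the noise set to zero
oracle_mse = np.mean((targets - lagged @ lag_coeffs) ** 2)

# Naive baseline for contrast: always predict the previous value
naive_mse = np.mean((x[1:] - x[:-1]) ** 2)

print(f"oracle MSE: {oracle_mse:.3f} (noise variance is {noise_std ** 2})")
print(f"naive MSE:  {naive_mse:.3f}")
```

A model whose test error sits well above the oracle figure still has room to improve. One that lands well below it on held-out data is more likely to have an evaluation bug than to have beaten the oracle.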

Finally, you might be able to interpret the model parameters. For example, if you have a linear model, you can check how the model parameters compare with the AR coefficients.
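For a linear model this comparison can be done directly. The sketch below fits ordinary least squares on lagged values and sets the fitted weights next to the true coefficients. It assumes a hypothetical AR(2) process with a_0 = 0.3 and unit-variance Gaussian noise; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
true_coeffs = np.array([0.5, -0.2])  # hypothetical [a_1, a_2]
intercept = 0.3                      # hypothetical a_0
n, num_points = len(true_coeffs), 20000

# Simulate the AR process, then drop the first 3n datapoints
x = np.zeros(num_points)
x[:n] = rng.normal(size=n)
for t in range(n, num_points):
    x[t] = intercept + true_coeffs @ x[t - n:t][::-1] + rng.normal()
x = x[3 * n:]

# Design matrix rows: [1, x_{t-1}, ..., x_{t-n}]; targets: x_t
m = len(x)
lags = [x[n - k:m - k] for k in range(1, n + 1)]
design = np.column_stack([np.ones(m - n)] + lags)
fitted, *_ = np.linalg.lstsq(design, x[n:], rcond=None)
fitted_intercept, fitted_coeffs = fitted[0], fitted[1:]

print("true a_0, a_1, a_2:  ", intercept, true_coeffs)
print("fitted a_0, a_1, a_2:", fitted_intercept, fitted_coeffs)
```

With this much data the fitted values should sit close to the true ones. Large gaps usually point to a problem in how the lagged inputs were constructed.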

I hope this has helped – all the best with your machine learning endeavours!

