# R in Time Series: Autocorrelation in R

This tutorial presents basics of autocorrelation in R.

#### 1  Stationality vs Ergodicity

The mean function of a time series model is

$\mu(t)=E(x_t)$

which is, in general, a function of time t. Since $$x_t\,$$ can have different realization at t, above definition of expectation is an average taken across the ensemble of all the possible times series that might have been realized by the time series model.

The ensemble constitutes the entire population. If we have a time series model, we can simulate more than one time series. However, with historical data, we usually only have a single time series. So, all we can do, without assuming a mathematical structure for the trend, is to estimate the mean at each sample point by the corresponding observed values.

In practice, we make estimates of any apparent trend and seasonal effects in our data and remove them, by decompose() or stl() in R, to obtain time series of the random component. Then time series models with a constant mean will be appropriate.

If the mean function is constant,  i.e. $$\mu(t)=\mu$$, we say that the time series model is stationary in the mean. If we assume that a sufficiently long time series characterizes the hypothetical model, then the sample estimate of the population mean $$\mu\,$$ is the sample mean $$\bar{x}$$

$\bar{x}=\frac{1}{n}\sum_{t=1}^nx_t$

and such models are known as ergodic.

• A time series model that is stationary in mean is ergodic in the mean if the time average for a single time series tends to the ensemble mean as the length of the time (temporal) series increases, i.e.
$\lim_{n\rightarrow\infty}\frac{\sum_{t=1}^{n}x_t}{n}=\mu$
It is implied here that the time average is independent of the starting point.
• Environmental or economical time series are single realizations of a hypothetical time series model, and we simply define the underlying model as ergodic. However, there are cases in which we can have many time series arising from the same time series model. In this case, it is straightforward to adapt a stationary time series to be non-ergodic by defining the means for the individual time series to be from some probability distribution.

#### 2  Variance function

The variance function of a time series model that is stationary in the mean is

$\sigma_t^2=E[(x_t-\mu)^2]$

In principal, $$\sigma_t$$ can take a different value at every time t. In other words, being stationary in mean does not necessarily mean being stationary in variance. Unfortunately, we cannot estimate a different variance at each time point from a single time series. To progress, we must make some simplifying assumption.

If we assume that the time series model is stationary in variance, then this constant population variance $$\sigma^2$$ can be estimated from the sample variance

$\text{Var}(x)=\frac{1}{n-1}\sum_{t=1}^n(x_t-\bar{x})^2$

Effect of serial correlation on variance

In a time series analysis, sequential observations may be correlated. If the correlation is positive, Var(x) will tend to underestimate the population variance in a short series because successive observations tend to be relatively similar. In most cases, this does not present a problem since the bias decreases rapidly as the length n of the series increases.

### 3  Autocorrelation

Consider a time series that is stationary in both the mean and the variance. The variables may be serially correlated, and the model is second-order stationary if the serial correlation between variables depends only on the lag, i.e., the number of the steps between them.

For a second-order stationary time series, we can define the autocovariance and autocorrelation functions of lag k as

$\text{acvf:}\quad\gamma_k=E[(x_t-\mu)(x_{t+k}-\mu)]$

$\text{acf:}\quad\rho_k=\frac{\gamma_k}{\sigma^2}$

Notes:

• The functions $$\gamma_k$$ and $$\rho_k$$ do not depend on because $$\mu\,$$ and $$\sigma$$, which are across the ensemble, are the same at all times t.
• $$\rho_0=1$$
• It is possible that a second-order stationary time series model has skewness, i.e. having serial correlations depending on time t. However, applications for such models are rare. Therefore, it is customary to drop the term ‘second-order’ and use ‘stationary’ on its own for a time series that is at least second-order stationary. Note that the term strictly stationary is reserved for more rigorous conditions.

The autocovariance and autocorrelation functions can be estimated from a time series by their sample equivalents

$\text{acvf:}\quad c_k=\frac{1}{n}\sum_{t=1}^{n-k}(x_t-\bar{x})(x_{t+k}-\bar{x})$

$\text{acf:}\quad r_k=\frac{c_k}{c_0}$

Notes: The autocovariance at lag 0, $$c_0$$, is the variance calculated with a denominator n. Also, a denominator n is used when calculating $$c_k$$, although only $$n-k\,$$ terms are added to form the numerator. Adopting this definition constrains all sample autocorrelations to lie between -1 and 1.

### 4  ACF() function in R

The function acf computes (and by default plots) estimates of the autocovariance or autocorrelation function.

Arguments:

•  x, y — a univariate or multivariate numeric time series object or a numeric vector or matrix, or an “acf” object.
• type — character string giving the type of acf to be computed. Allowed values are “correlation” (the default), “covariance” or “partial”.
• plot — logical. If TRUE (the default) the acf is plotted.

Notes:

• For type = “correlation” and “covariance”, the estimates are based on the sample covariance. (The lag 0 autocorrelation is fixed at 1 by convention.)
• The autocorrelations of x are stored in the vector acf(x)$acf, with the lag k autocorrelation located in acf(x)$acf[k+1].

We can draw a scatter plot corresponding to autocorrelation of any lag k by

#### 5  Correlogram

By default, the acf() function produces a plot of $$r_k\sim k$$, which is called the correlogram.

In general, correlograms have some following features:

• The unit of the x-axis, the lags,  is the sampling interval. The y-axix is dimensionless.
• If $$\rho_k$$=0, the sampling distribution of $$\rho_k$$’s estimate, $$r_k$$, is approximately $$\sim N(-\frac{1}{n}, \frac{1}{n})$$. Therefore, on the correlogram, the dotted lines are drawn at $$\frac{1}{n}\pm\frac{2}{\sqrt{n}}$$ such that if $$r_k$$ falls outside these lines, we have evidence at the 5% level to reject the null hypothesis that $$\rho_k=0$$.
• If $$\rho_k$$ does equal 0 at all lags k, we expect 5% of the estimates, $$r_k$$, to fall outside the $$-\frac{1}{n}\pm\frac{2}{\sqrt{n}}$$lines.
• The $$r_k$$ are correlated, so if one falls outside the lines, the neighboring ones are more likely to be statistically significant.
• it is worth looking for statistically significant values at specific lags that have some practical meaning, for example, the lag that corresponds to the seasonal period, if exists. For monthly series, a significant autocorrelation at lag 12 might indicate that the seasonal adjustment is not adequate.
• The lag 0 autocorrelation is always 1 and is shown in the correlogram plot. Its inclusion helps us compare values of the other autocorrelations relative to the theoretical maximum of 1.
• Suppose we have a long time series, i.e. big n, the significant region marked by the two dotted lines will be very shallow. Therefore, small values of $$r_k$$ of no practical consequence may be statistically significant.
• Squaring the autocorrelation can help to give the percentage of variability explained by a linear relationship between the variables. For example, a lag 1 autocorrelation of 0.1 implies that a linear dependency of $$x_t\,$$ on $$x_{t-1}$$ would only explain 1% of the variability of $$x_t$$. So even though this autocorrelation of 0.1 might fall outside the lines, if is of no practical consequence.
• The correlogram above has a well-defined shape that appears like a sampled damped cosine function. This is typical of correlograms of time series generated by an autoregressive model of order 2.

Trend and seasonality exhibited on correlogram

• Usually a trend in the data will show in the correlogram as a slow decay in the autocorrelations, which are large and positive due to similar values in the series occuring close together in time.
• If there are seasonal variation, season spikes will be superimposed on the pattern.
• Some examples
• Example #1: construct a time series consisting of a trend only.

As can be seen, the acf decreases slowly and almost linearly from 1.
• Example #2: If taking a large number of cycles of a discrete sinusoidal wave of any amplitude and phase, the acf is a discrete ocsine function of the same period

• Example #3: If constructing a time series that consists of an arbitrary sequence of p numbers repeated many times, the correlogram has a dominant spike of almost 1 at lag p.

• Example #4:  For the AIr Passenger data with increasing trend

As can be seen:

• The annual cycle appears in the air passenger correlogram as a cycle of the same period superimposed on the gradually decaying ordinates of the acf. This gives a maximum at a lag of 1 year, reflecting a position linear relationship between pairs of variables $$(x_t,x_{t+12})$$ separated by 12-month periods.
• Conversely, because the seasonal trend is approximately sinusoidal, values separated by a period of 6 months will tend to have a negative relationship. For example, higher values tend to occur in the summer months followed by lower values in the winter months. A dip in the acf therefore occurs at lag 6 months (or 0.5 years). Although this is typical for seasonal variation that is approximated by a sinusoidal curve, other series may have patterns, such as high sales at Christmas, that contribute a single spike to the correlogram.

Although correlogram does show some information about trend and seasonal pattermns, its main goal is to detect autocorrelations in the time series after we have removed an estimate of the trend and seasonal variation. Now we remove the trend and seasonality in the Air Passengers data, then plot the correlogram on the remaining random component.

Note that, by using a centered moving average of 12 months to smooth the time series and thereby to estimate the trend, the first 6 and last 6 terms in the random component are not available and stored in R as NA, therefore subset of data [7:138] is used here.
The pattern in above correlogram suggests two possibilities: (1) a damped cosine shape that is characteristic of an autoregressive model of order 2; or (2) the seasonal adjustment has not been entirely effective. We can validate seasonal decomposition by comparing standard deviations of before and after seasonal adjustment., as following:
– SD for original data

– SD for data after trend removed

-SD for random component (i.e. after seasonal adjustment)

Therefore, the reduction in the standard deviation shows that the seasonal adjustment has been very effective.