Introduction to Time Series III

Author: Edwin Bedolla

Date: original version, May 6th, 2020. This version, February 7th, 2021.

This document closes the introduction to Time Series by exploring the remaining concepts, namely the estimation of the autocovariance and autocorrelation functions, together with a simple example.

As always, we import the needed packages.
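The package cells were not preserved in this export, so the exact import list is an assumption; a minimal sketch of what the rest of the document relies on could be:

```julia
using Statistics  # mean, cov, cor
using Random      # reproducible random draws
# The plotting and table cells later in the document also suggest:
# using DataFrames, TimeSeries, Plots
```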


Estimation of correlation

In the last part we studied the theoretical definitions of the autocorrelation function (ACF) and the autocovariance function (AVF), and worked through some extensive examples of how to compute their theoretical values.

In reality, however, we usually only have a subset or sample of the original population at hand. When that is the case, we can only estimate the respective statistics by means of estimators. In this document we explore the definitions and algorithms used to estimate the autocorrelation and autocovariance functions, and we close by exploring the lag operator and its importance.

We start with a simple example.


Autoregressive time series

In this section we will build an autoregressive time series; this is a big topic within Time Series analysis, so it will get a special chapter of its own, but here we want to introduce the concept.

An autoregressive time series uses information from past intervals to build future values. We will now build such a time series and plot it.
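The generating cell is not preserved in this export; a minimal sketch, assuming the model $y_t = 5 + x_t - 0.7 x_{t-1}$ that appears later in this document, with standard Gaussian white noise $x_t$ and an arbitrary seed, might look like this:

```julia
using Random, Statistics

Random.seed!(42)            # seed chosen here only for reproducibility (assumption)
n = 10_000
x = randn(n + 1)            # white noise, one extra value for the first lag
# yₜ = 5 + xₜ − 0.7 xₜ₋₁, the model written out later in this document
y = 5 .+ x[2:end] .- 0.7 .* x[1:end-1]
```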


As we can see, this time series is not really different from the ones we have studied so far, but one of the fundamental properties of the autoregressive model is that it shows periodicity; we will explore all of this later.


Correlation of an autoregressive time series

As we said before, we are concerned with estimating the ACF and AVF of a time series when we only have sampled data at hand. For the autoregressive time series we are studying, the true value of the AVF, γ, can be obtained analytically.
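The notebook output holding the exact number is not preserved here, but the true value can be derived by hand, assuming the model $y_t = 5 + x_t - 0.7 x_{t-1}$ used in this document with unit-variance white noise $x_t$:

$$\gamma(0) = \operatorname{Var}(y_t) = \operatorname{Var}(x_t) + 0.7^2 \operatorname{Var}(x_{t-1}) = 1 + 0.49 = 1.49$$

which agrees with the sample values of roughly 1.491 obtained below.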


We compare it to Julia's built-in function to compute the covariance value of the time series.
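The comparison cell is not preserved; a sketch of it, assuming the series y was built as above with unit-variance white noise, could be as simple as:

```julia
using Random, Statistics

Random.seed!(42)
x = randn(10_001)
y = 5 .+ x[2:end] .- 0.7 .* x[1:end-1]

# cov with a single vector argument returns the sample variance,
# i.e. the autocovariance at lag h = 0
julia_γ = cov(y)
```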

julia_γ = 1.4910680073384028

Autocovariance function

We now turn our attention to the estimated (sample) AVF, which is defined as follows

$$\hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-h} \left( x_{t+h} - \bar{x} \right) \left( x_t - \bar{x} \right)$$

where the sample mean, $\bar{x}$, is estimated as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

We will now implement the estimated AVF in Julia.
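The implementation cell is not preserved in this export; a direct sketch of the definition above (function name and signature are my own choice) could be:

```julia
using Statistics

# Sample autocovariance:
# γ̂(h) = (1/n) Σₜ₌₁ⁿ⁻ʰ (xₜ₊ₕ − x̄)(xₜ − x̄)
function avf(x::AbstractVector, h::Integer=0)
    n = length(x)
    x̄ = mean(x)
    return sum((x[t + h] - x̄) * (x[t] - x̄) for t in 1:(n - h)) / n
end
```

Note the 1/n normalization, which matches the definition above even though the sum only has n − h terms.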


Let us now use this new function and compute the estimated AVF.

γ_by_hand = 1.4910651701271396

We can see that the results are very similar to those obtained with the built-in cov, and very close to the true theoretical value. We can approach the true theoretical value by increasing the number of samples from the time series.
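As a sketch of that convergence (again assuming the model $y_t = 5 + x_t - 0.7 x_{t-1}$ with unit-variance noise, for which the variance works out to $1 + 0.7^2 = 1.49$), one might sample increasingly long realizations and watch the lag-0 estimate stabilize:

```julia
using Random, Statistics

# Sample autocovariance at lag 0 (the sample variance with 1/n)
avf0(v) = sum(abs2, v .- mean(v)) / length(v)

# One realization of the assumed model yₜ = 5 + xₜ − 0.7 xₜ₋₁
make_series(n) = (x = randn(n + 1); 5 .+ x[2:end] .- 0.7 .* x[1:end-1])

Random.seed!(0)
estimates = [avf0(make_series(n)) for n in (10^2, 10^4, 10^6)]
# the estimates settle near 1.49 as n grows
```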


Autocorrelation function

A similar estimator for the ACF can be used, and in particular we have

$$\hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)} = \frac{\sum_{t=1}^{n-h} \left( x_{t+h} - \bar{x} \right) \left( x_t - \bar{x} \right)}{\sum_{t=1}^{n} \left( x_t - \bar{x} \right)^2}$$

so we can use this to our advantage and employ the AVF to compute the ACF.

Our implementation below exploits this property.
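The implementation cell itself is not preserved; a sketch that normalizes the sample AVF by its lag-0 value (reusing the avf function sketched earlier, both names being my own choice) could be:

```julia
using Statistics

# Sample autocovariance, as before
function avf(x::AbstractVector, h::Integer=0)
    n = length(x)
    x̄ = mean(x)
    return sum((x[t + h] - x̄) * (x[t] - x̄) for t in 1:(n - h)) / n
end

# ρ̂(h) = γ̂(h) / γ̂(0): normalize the AVF by its lag-0 value
acf(x::AbstractVector, h::Integer) = avf(x, h) / avf(x, 0)
```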


We now compute the estimated ACF. Notice that we need to specify the value $h = 1$, which adds a lag of one to the time series.

ρ_by_hand = -0.4700826372561362

We now compare it to the Pearson correlation obtained with the built-in cor function in Julia.
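That comparison is a one-liner: correlate the series against its own lag-1 shifted copy. A sketch, assuming the same simulated series as before:

```julia
using Random, Statistics

Random.seed!(42)
x = randn(10_001)
y = 5 .+ x[2:end] .- 0.7 .* x[1:end-1]

# Pearson correlation between the series and its lag-1 shifted copy
ρ_julia = cor(y[1:end-1], y[2:end])
```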

ρ_julia = -0.4700833786660821

The results are quite similar, just as we expected.


Lag in Time Series

Up until now we have mentioned the word lag several times in different scenarios. It is now time to address its meaning in the context of Time Series. A Time Series is a process over different time intervals, and we are sometimes interested in what happens between two time intervals $s$ and $t$ within the same Time Series.

A special type of operation, called the lag operator, produces the value of a previous time interval with respect to the current one. The lag operator is denoted by $L$, and when applied to a Time Series $X = \{x_1, x_2, x_3, \dots\}$ we get the following

$$L x_k = x_{k-1}$$

When doing something like df_ar[1:(end-1), :A] we are essentially applying the lag operator by hand, where the lag is one in this case. The TimeSeries package has support for this. Let us do a simple example, computing the ACF as before.
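The original cells are not preserved; a small sketch with plain arrays might look like the following (TimeSeries.jl also provides a lag function for TimeArray objects that performs the same shifting, and df_ar above refers to a DataFrame from the lost cells):

```julia
using Statistics

x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]   # toy data, for illustration only

# Applying the lag operator by hand: the lag-1 copy drops the last
# element, the current copy drops the first, so the two align
lagged  = x[1:end-1]
current = x[2:end]

# Lag-1 Pearson correlation, exactly as in the ACF comparison above
ρ = cor(current, lagged)
```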

10.5 μs
2.7 ms
123 ms
5.6 ms
5.2 ms
true
54.0 ns

We can go back to an arbitrary time interval if we raise the lag operator to a given power, like so

$$L^n x_k = x_{k-n}$$

for example, $L^2 x_k = x_{k-2}$, so we obtain the previous-to-last value from the time series.

Furthermore, because the lag operator can be raised to arbitrary powers, we can build polynomials out of it, like so

$$a(L) = a_0 + a_1 L + a_2 L^2 + \cdots$$

$$a(L) x_t = a_0 x_t + a_1 x_{t-1} + a_2 x_{t-2} + a_3 x_{t-3} + \cdots$$

which resembles an autoregressive time series. In fact, it is in autoregressive models and moving averages that the lag operator is of the utmost importance.

For example, in this document we used the autoregressive model

$$y_t = 5 + x_t - 0.7 x_{t-1}$$

which could be written in terms of the lag operator as

$$y_t = 5 + x_t - 0.7 L x_t = 5 + (1 - 0.7 L) x_t.$$

We will come back to this when we study the ARIMA processes later on.
