Introduction to Time Series II
Author: Edwin Bedolla
Date: Original, 6th April 2020. This version, 7th February 2021.
In this document, the main statistics such as the mean function, autocovariance function and autocorrelation function will be described, along with some examples.
We will import all of the necessary modules first.
begin
using StatsPlots
using Random
using TimeSeries
using Dates
using Statistics
using DataFrames
end
gr();
# Ensure reproducibility of the results
rng = MersenneTwister(8092);
Descriptive statistics and measures
A full description of a given time series is always given by the joint distribution function of the time series, which is a multi-dimensional function that is very difficult to track for most of the time series encountered in practice.
Instead, we usually work with what's known as the marginal distribution function, defined as

$$F_t(x) = P\{x_t \leq x\},$$

where the associated marginal density function is

$$f_t(x) = \frac{\partial F_t(x)}{\partial x},$$

and when both functions exist they can provide all the information needed to do meaningful analysis of the time series.
Mean function
With these functions we can now define one of the most important descriptive measures, the mean function, which is defined as

$$\mu_t = E(x_t) = \int_{-\infty}^{\infty} x \, f_t(x) \, dx,$$

where $E$ denotes the expected value operator.
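As a quick numerical illustration (a minimal sketch, not part of the original analysis): Gaussian white noise has a constant mean function $\mu_t = 0$, so the sample average over many draws should land close to zero.

```julia
using Random
using Statistics

# White noise has constant mean function μ_t = 0;
# the sample mean over many independent draws should be close to it.
rng = MersenneTwister(0)
w = randn(rng, 10^6)
println(mean(w))  # close to 0
```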
Autocovariance and autocorrelation
We are also interested in analyzing the dependence, or lack thereof, between realization values at different time periods, say $x_s$ and $x_t$.
The first such measure is known as the autocovariance function, and it's defined as

$$\gamma(s, t) = \mathrm{cov}(x_s, x_t) = E\left[(x_s - \mu_s)(x_t - \mu_t)\right],$$

where $s$ and $t$ denote two different points in time.
The autocovariance tells us about the linear dependence between two points on the same time series observed at different times.
Normally, we know from classical statistics that if $\gamma(s, t) = 0$ for a given time series, then $x_s$ and $x_t$ are not linearly related (though they may still share a nonlinear dependence).
We now introduce the autocorrelation function (ACF), defined as

$$\rho(s, t) = \frac{\gamma(s, t)}{\sqrt{\gamma(s, s)\,\gamma(t, t)}},$$

which is a measure of predictability, and we can define it in words as follows:

The autocorrelation measures the linear predictability of a given time series $x_t$ at time $t$, using values $x_s$ from the same time series at time $s$.

This measure is very much related to Pearson's correlation coefficient from classical statistics, which is a way to measure the linear relationship between two sets of values.
By the Cauchy–Schwarz inequality we have $|\gamma(s, t)| \leq \sqrt{\gamma(s, s)\,\gamma(t, t)}$, so the range of values for the ACF is

$$-1 \leq \rho(s, t) \leq 1.$$
Example
Let's look at an example for the particular case of the moving average. We will work out the analytic form of the autocovariance function and the ACF for the moving average, while also estimating the same results numerically using Julia.
Recall the 3-valued moving average, defined as

$$v_t = \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right),$$

where $w_t$ is a white noise time series.
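As a small sketch of what this definition computes, here is a hand-rolled version. (Note that `moving(mean, ts, 3)` from TimeSeries, used below, applies a trailing window rather than a centered one; this does not affect the covariance calculations that follow.)

```julia
using Statistics

# 3-point trailing moving average: each output value is the mean of
# the current observation and the two preceding ones.
function moving_average_3(x::AbstractVector)
    return [mean(@view x[(i - 2):i]) for i in 3:length(x)]
end

moving_average_3([1.0, 2.0, 3.0, 4.0])  # -> [2.0, 3.0]
```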
Let's plot the moving average again. We will create a very big time series for the sake of numerical approximation below.
First, we create the white noise time series.
# Create a range of time for a year, spaced evenly every 1 minute
dates = DateTime(2018, 1, 1, 1):Dates.Minute(1):DateTime(2018, 12, 31, 23, 59);
# Build a TimeSeries object with the specified time range and white noise
ts = TimeArray(dates, randn(rng, length(dates)));
# Create a DataFrame of the TimeSeries for easier handling
df_ts = DataFrame(ts);
Then, as before, we compute the 3-valued moving average.
# Compute the 3-valued moving average
moving_average = moving(mean, ts, 3);
# Create a DataFrame of the TimeSeries for easier handling
df_average = DataFrame(moving_average);
Recall what these look like in a plot. We just plot the first 100 elements in the time series to avoid having a very cluttered plot.
# Indices to plot
idxs = 1:100;
begin
@df df_ts plot(:timestamp[idxs], :A[idxs], label = "White noise")
@df df_average plot!(:timestamp[idxs], :A[idxs], label = "Moving average")
end
We are now ready to do some calculations. First, we invoke the definition of the autocovariance function and apply it to the moving average

$$\gamma(s, t) = \mathrm{cov}(v_s, v_t) = \mathrm{cov}\left\{\frac{1}{3}\left(w_{s-1} + w_s + w_{s+1}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\},$$

and now we need to look at some special cases.
When $s = t$, we now have the following

$$\gamma(t, t) = \mathrm{cov}\left\{\frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\};$$

then, by the property of covariance of linear combinations, we have the following simplification

$$\gamma(t, t) = \frac{1}{9}\left[\mathrm{cov}(w_{t-1}, w_{t-1}) + \mathrm{cov}(w_t, w_t) + \mathrm{cov}(w_{t+1}, w_{t+1})\right],$$

and because white noise values at different times are uncorrelated, all the cross terms vanish. In this case, recall that our white noise is normally distributed with variance $\sigma_w^2 = 1$, so the true value is

$$\gamma(t, t) = \frac{3}{9}\sigma_w^2 = \frac{3}{9}.$$
true_γ = 3 / 9
# => 0.3333333333333333
We will try to estimate the autocovariance function using classical statistics by means of the cov function in Julia. We need to pass it the time series like so:
γ_jl = cov(df_average[:, :A], df_average[:, :A])
# => 0.33351433375298867
And we can see that the value is quite similar. The remaining error comes from working with a finite sample; a bigger ensemble of values would reduce it further, but this should suffice.
When $s = t + 1$, we now have the following

$$\gamma(t + 1, t) = \mathrm{cov}\left\{\frac{1}{3}\left(w_t + w_{t+1} + w_{t+2}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\} = \frac{1}{9}\left[\mathrm{cov}(w_t, w_t) + \mathrm{cov}(w_{t+1}, w_{t+1})\right] = \frac{2}{9}\sigma_w^2.$$

So the true value is now
true_γ1 = 2 / 9
# => 0.2222222222222222
To check this, we perform the same operations as before, but this time, we need to move the time series one time step with respect to itself.
# Remove the last element from the first and start with the second element
γ_jl1 = cov(df_average[1:(end-1), :A], df_average[2:end, :A])
# => 0.22247427287331825
Great! Within a tolerance value, this is quite a nice estimate. It turns out that for the cases $|s - t| \geq 3$ the autocovariance is exactly zero; let's check numerically with a lag of three.
# Shift the series three time steps with respect to itself
γ_jl_zero = cov(df_average[1:(end-3), :A], df_average[4:end, :A])
# => 0.00043227901205739896
It's actually true: a value very close to zero. But why? It's easy to see if one applies the autocovariance function definition and checks the case $|s - t| \geq 3$: the two moving-average windows share no white noise terms, so every covariance term vanishes,

$$\gamma(t + 3, t) = \mathrm{cov}\left\{\frac{1}{3}\left(w_{t+2} + w_{t+3} + w_{t+4}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\} = 0.$$
Let's now focus on the ACF for a 3-valued moving average. We have several cases, like before.
When $s = t$, we now have the following

$$\rho(t, t) = \frac{\gamma(t, t)}{\sqrt{\gamma(t, t)\,\gamma(t, t)}} = 1,$$

so it turns out that the true value is exactly 1. We can use the cor function to compute the correlation coefficient in Julia as an estimate for the ACF.
ρ_est = cor(df_average[:, :A], df_average[:, :A])
# => 1.0
When $s = t + 1$, we now have the following

$$\rho(t + 1, t) = \frac{\gamma(t + 1, t)}{\sqrt{\gamma(t + 1, t + 1)\,\gamma(t, t)}};$$

recall from before that $\gamma(t + 1, t) = \frac{2}{9}$ and $\gamma(t, t) = \frac{3}{9}$, so

$$\rho(t + 1, t) = \frac{2/9}{3/9} = \frac{2}{3},$$

which is the true value
true_ρ2 = 2 / 3
# => 0.6666666666666666
and again, we can check this value numerically
ρ_est2 = cor(df_average[1:(end-1), :A], df_average[2:end, :A])
# => 0.667060850012139
Lastly, just like with the autocovariance, the ACF for the cases $|s - t| \geq 3$ is zero, which we can confirm with a lag of three.
ρ_est_zero2 = cor(df_average[1:(end-3), :A], df_average[4:end, :A])
# => 0.0012961321840730129
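To tie everything together, here is a sketch (assuming the StatsBase package is available; it provides an autocor function) that compares the empirical ACF of a simulated 3-valued moving average against the theoretical values derived above: $\rho = 1$ at lag 0, $\frac{2}{3}$ at lag 1, and 0 from lag 3 onward. At lag 2 the windows share a single white noise term, which by the same argument gives $\gamma = \frac{1}{9}$ and hence $\rho = \frac{1}{3}$.

```julia
using Random
using Statistics
using StatsBase  # assumed available; provides autocor

rng = MersenneTwister(8092)
w = randn(rng, 10^6)
# 3-point moving average of the white noise series
v = [(w[i - 2] + w[i - 1] + w[i]) / 3 for i in 3:length(w)]
# Empirical ACF at lags 0 through 3; theory predicts 1, 2/3, 1/3, 0
autocor(v, 0:3)
```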