Introduction to Time Series II
Author: Edwin Bedolla
Date: Original, 6th April 2020. This version, 7th February 2021.
In this document, we describe the main statistics of a time series, namely the mean function, the autocovariance function and the autocorrelation function, along with some examples.
We will import all of the necessary modules first.
```julia
begin
    using StatsPlots
    using Random
    using TimeSeries
    using Dates
    using Statistics
    using DataFrames
end
```

```julia
gr();
```

```julia
# Ensure reproducibility of the results
rng = MersenneTwister(8092);
```

Descriptive statistics and measures
A full description of a given time series is provided by its joint distribution function, a multi-dimensional function that is very difficult to work with for most time series encountered in practice.
Instead, we usually work with what's known as the marginal distribution function, defined as

$$F_t(x) = P(x_t \leq x),$$

where the corresponding marginal density function is

$$f_t(x) = \frac{\partial F_t(x)}{\partial x},$$

and when both functions exist they can provide all the information needed to do meaningful analysis of the time series.
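As a quick illustration (not part of the original analysis), the following minimal sketch estimates the marginal distribution function of a white noise series at a single point by counting the proportion of observations at or below that point; the seed and variable names here are arbitrary choices for this sketch.

```julia
let
    # Relies on the Random package loaded at the top of this document
    rng_sketch = MersenneTwister(1234)   # arbitrary seed for this sketch only
    w = randn(rng_sketch, 100_000)       # a white noise realization

    # Empirical estimate of F_t(0) = P(x_t ≤ 0); for standard normal noise
    # the true value is 0.5, so the estimate should land close to that
    count(w .<= 0) / length(w)
end
```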
Mean function
With these functions we can now define one of the most important descriptive measures, the mean function, which is defined as

$$\mu_{xt} = \mathrm{E}\left[x_t\right] = \int_{-\infty}^{\infty} x \, f_t(x) \, dx,$$

where $\mathrm{E}$ denotes the expected value operator.
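For instance, a white noise series has a constant mean function equal to zero, so the sample mean over a long realization should land close to it. The snippet below is a minimal sketch of that idea; the seed and names are arbitrary and not from the original document.

```julia
let
    # Relies on Random and Statistics, loaded at the top of this document
    rng_mean = MersenneTwister(42)       # arbitrary seed for this sketch only
    w = randn(rng_mean, 100_000)         # white noise with true mean 0

    # Sample mean as an estimate of the mean function; should be ≈ 0
    mean(w)
end
```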
Autocovariance and autocorrelation
We are also interested in analyzing the dependence, or lack thereof, between realization values at different time periods, i.e. between $x_s$ and $x_t$ for two different times $s$ and $t$. Two related measures quantify this.
The first one is known as the autocovariance function, and it's defined as

$$\gamma_x(s, t) = \mathrm{cov}\left(x_s, x_t\right) = \mathrm{E}\left[(x_s - \mu_s)(x_t - \mu_t)\right],$$

where $\mu_s$ and $\mu_t$ are the values of the mean function at times $s$ and $t$, for all time points $s$ and $t$.
The autocovariance tells us about the linear dependence between two points on the same time series observed at different times.
Normally, we know from classical statistics that if $\gamma_x(s, t) = 0$ for a given time series, then $x_s$ and $x_t$ are not linearly related; there may, however, still be some non-linear dependence between them.
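To make the definition concrete, a simple way to estimate the autocovariance at a given lag is to align the series with a shifted copy of itself and use cov, just as we will do later for the moving average. The following is a minimal sketch for white noise, where the true autocovariance at any non-zero lag is zero; the seed, lag and names are arbitrary choices for this sketch.

```julia
let
    # Relies on Random and Statistics, loaded at the top of this document
    rng_acov = MersenneTwister(7)        # arbitrary seed for this sketch only
    w = randn(rng_acov, 100_000)         # white noise realization
    h = 1                                # lag to inspect

    # Sample autocovariance at lag h; for white noise the true value is 0
    cov(w[1:(end - h)], w[(1 + h):end])
end
```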
We now introduce the autocorrelation function (ACF), defined as

$$\rho(s, t) = \frac{\gamma(s, t)}{\sqrt{\gamma(s, s)\, \gamma(t, t)}},$$

which is a measure of predictability, and we can describe it in words as follows: the autocorrelation measures the linear predictability of a given time series at time $t$, using values from the same time series but at time $s$.
This measure is closely related to Pearson's correlation coefficient from classical statistics, which is a way to measure the linear relationship between two sets of values.
The range of values for the ACF is $-1 \leq \rho(s, t) \leq 1$, which follows from the Cauchy-Schwarz inequality.
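As a small check of that range (again, a sketch that is not part of the original analysis), we can estimate the ACF of a white noise series at a few lags with cor; every estimate falls inside $[-1, 1]$, and for white noise they are all close to zero. The seed and lags below are arbitrary.

```julia
let
    # Relies on Random and Statistics, loaded at the top of this document
    rng_acf = MersenneTwister(11)        # arbitrary seed for this sketch only
    w = randn(rng_acf, 100_000)          # white noise realization

    # Sample ACF estimates at lags 1 through 5; all lie within [-1, 1]
    [cor(w[1:(end - h)], w[(1 + h):end]) for h in 1:5]
end
```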
Example
Let's look at an example for the particular case of the moving average. We will work out the analytic form of the autocovariance function and the ACF for the moving average, while also checking the same results numerically with Julia.
Recall the 3-valued moving average, defined as

$$v_t = \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right),$$

where $w_t$ is a Gaussian white noise series.
Let's plot the moving average again. We will create a fairly long time series so that the numerical approximations below are accurate.
First, we create the white noise time series.
```julia
# Create a range of time for a year, spaced evenly every 1 minute
dates = DateTime(2018, 1, 1, 1):Dates.Minute(1):DateTime(2018, 12, 31, 24);
```

```julia
# Build a TimeSeries object with the specified time range and white noise
ts = TimeArray(dates, randn(rng, length(dates)));
```

```julia
# Create a DataFrame of the TimeSeries for easier handling
df_ts = DataFrame(ts);
```

Then, as before, we compute the 3-valued moving average.
```julia
# Compute the 3-valued moving average
moving_average = moving(mean, ts, 3);
```

```julia
# Create a DataFrame of the TimeSeries for easier handling
df_average = DataFrame(moving_average);
```

Recall what these look like in a plot. We just plot the first 100 elements of the time series to avoid having a very cluttered plot.
```julia
# Indices to plot
idxs = 1:100;
```
```julia
begin
    @df df_ts plot(:timestamp[idxs], :A[idxs], label = "White noise")
    @df df_average plot!(:timestamp[idxs], :A[idxs], label = "Moving average")
end
```

We are now ready to do some calculations. First, we invoke the definition of the autocovariance function and apply it to the moving average

$$\gamma_v(s, t) = \mathrm{cov}\left(v_s, v_t\right) = \mathrm{cov}\left\{\frac{1}{3}\left(w_{s-1} + w_s + w_{s+1}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\},$$
and now we need to look at some special cases.
When $s = t$ we now have the following

$$\gamma_v(t, t) = \mathrm{cov}\left(v_t, v_t\right) = \mathrm{cov}\left\{\frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\},$$

then, by the property of covariance of linear combinations, we have the following simplification

$$\gamma_v(t, t) = \frac{1}{9}\left[\mathrm{cov}(w_{t-1}, w_{t-1}) + \mathrm{cov}(w_t, w_t) + \mathrm{cov}(w_{t+1}, w_{t+1})\right],$$

and because the white noise values at different times are uncorrelated, all the cross terms $\mathrm{cov}(w_s, w_t)$ with $s \neq t$ vanish. In this case, recall that our white noise is normally distributed with unit variance, $\sigma_w^2 = 1$, so $\mathrm{cov}(w_t, w_t) = \sigma_w^2 = 1$ and therefore

$$\gamma_v(t, t) = \frac{3}{9}\sigma_w^2 = \frac{1}{3}.$$

```julia
true_γ = 3 / 9
# output: 0.3333333333333333
```

We will try to compute the autocovariance function using classical statistics by means of the cov function in Julia. We need to pass it the time series like so
```julia
γ_jl = cov(df_average[:, :A], df_average[:, :A])
# output: 0.33351433375298867
```

And we can see that the value is quite similar. The remaining error comes from the finite sample size; an even bigger ensemble of values would reduce it further, but this should suffice.
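To support the claim about the ensemble size, here is a quick sketch (not from the original document) that recomputes the estimate of $\gamma_v(t, t) = 1/3$ on fresh 3-valued moving averages of increasing length; the estimates should typically get closer to $3/9$ as the sample grows. The sizes, seed and names below are arbitrary choices for this sketch.

```julia
let
    # Relies on Random and Statistics, loaded at the top of this document
    rng_size = MersenneTwister(2718)     # arbitrary seed for this sketch only
    sizes = [1_000, 10_000, 100_000]     # increasing sample sizes

    map(sizes) do n
        w = randn(rng_size, n)
        # Centered 3-valued moving average, built directly from the definition
        v = [(w[i - 1] + w[i] + w[i + 1]) / 3 for i in 2:(n - 1)]
        cov(v, v)                        # sample estimate of γ_v(t, t); true value is 3/9
    end
end
```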
When $s = t + 1$ we now have the following

$$\gamma_v(t + 1, t) = \mathrm{cov}\left\{\frac{1}{3}\left(w_t + w_{t+1} + w_{t+2}\right), \frac{1}{3}\left(w_{t-1} + w_t + w_{t+1}\right)\right\} = \frac{1}{9}\left[\mathrm{cov}(w_t, w_t) + \mathrm{cov}(w_{t+1}, w_{t+1})\right] = \frac{2}{9}\sigma_w^2.$$

So the true value is now

```julia
true_γ1 = 2 / 9
# output: 0.2222222222222222
```

To check this, we perform the same operations as before, but this time we need to shift the time series one time step with respect to itself.

```julia
# Remove the last element from the first series and start the second at its second element
γ_jl1 = cov(df_average[1:(end-1), :A], df_average[2:end, :A])
# output: 0.22247427287331825
```

Great! Within a tolerance value, this is quite a nice estimate. It turns out that for the cases where the two averaging windows no longer overlap, namely $|s - t| \geq 3$, the autocovariance should be exactly zero (the remaining overlapping case, $|s - t| = 2$, shares a single white noise term and gives $\sigma_w^2 / 9$). Let's check this by shifting the series three time steps:
```julia
# Remove the last three elements from the first series and start the second at its fourth element
γ_jl_zero = cov(df_average[1:(end-3), :A], df_average[4:end, :A])
# output: 0.00043227901205739896
```

It's actually true, a value very close to zero, but why? It's easy to see if one applies the autocovariance function definition and checks the case $|s - t| \geq 3$: the two moving averages no longer share any white noise terms, so every covariance term vanishes.
Let's now focus on the ACF for a 3-valued moving average. We have several cases, like before.
When $s = t$ we now have the following

$$\rho_v(t, t) = \frac{\gamma_v(t, t)}{\sqrt{\gamma_v(t, t)\, \gamma_v(t, t)}} = 1,$$

so it turns out that the true value is 1. We use the cor function to compute the correlation coefficient in Julia as an estimate for the ACF

```julia
ρ_est = cor(df_average[:, :A], df_average[:, :A])
# output: 1.0
```

When $s = t + 1$
we now have the following

$$\rho_v(t + 1, t) = \frac{\gamma_v(t + 1, t)}{\sqrt{\gamma_v(t + 1, t + 1)\, \gamma_v(t, t)}},$$

and recall from before that $\gamma_v(t + 1, t) = 2/9$ and $\gamma_v(t + 1, t + 1) = \gamma_v(t, t) = 3/9$, so

$$\rho_v(t + 1, t) = \frac{2/9}{3/9} = \frac{2}{3},$$

which is the true value

```julia
true_ρ2 = 2 / 3
# output: 0.6666666666666666
```

and again, we can check this value numerically
```julia
ρ_est2 = cor(df_average[1:(end-1), :A], df_average[2:end, :A])
# output: 0.667060850012139
```

Lastly, just like with the autocovariance, the ACF for the cases $|s - t| \geq 3$ should be zero, which we check with a three-step shift:

```julia
ρ_est_zero2 = cor(df_average[1:(end-3), :A], df_average[4:end, :A])
# output: 0.0012961321840730129
```

Indeed, we obtain a value very close to zero, as expected.