Answer to Question #242489 in Statistics and Probability for tsa

Question

Answer to Question #242489 in Statistics and Probability for tsa

Question #242489

Define a random walk process and show that it is not stationary. Describe how you can make it stationary. Give an example of a time series that behaves like a random walk

Expert's answer

Random Walk

A random walk is different from a list of random numbers because the next value in the sequence is a modification of the previous value in the sequence.

The process is used to generate the series forces dependence from one-time step to the next. This dependence provides some consistency from step to step rather than the large jumps that a series of independent, random numbers provide.

It is this dependency that gives the process its name as a “random walk” or a “drunkard’s walk”.

A simple model of a random walk is as follows:

Start with a random number of either -1 or 1.
Randomly select a -1 or 1 and add it to the observation from the previous time step.
Repeat step 2 for as long as you like.

More succinctly, we can describe this process as:

y(t) = B0 + B1*X(t-1) + e(t)

Where y(t) is the next value in the series. B0 is a coefficient that if set to a value other than zero adds a constant drift to the random walk. B1 is a coefficient to weight the previous time step and is set to 1.0. X(t-1) is the observation at the previous time step. e(t) is the white noise or random fluctuation at that time.

We can implement this in Python by looping over this process and building up a list of 1,000 time steps for the random walk. The complete example is listed below.

from random import random
from matplotlib import pyplot
seed(1)
random_walk = list()
random_walk.append(-1 if random() < 0.5 else 1)
for i in range(1, 1000):
	movement = -1 if random() < 0.5 else 1
	value = random_walk[i-1] + movement
	random_walk.append(value)
pyplot.plot(random_walk)

pyplot.show()

Random Walk and Stationarity

A stationary time series is one where the values are not a function of time.

Given the way that the random walk is constructed and the results of reviewing the autocorrelation, we know that the observations in a random walk are dependent on time.

The current observation is a random step from the previous observation.

Therefore we can expect a random walk to be non-stationary. In fact, all random walk processes are non-stationary. Note that not all non-stationary time series are random walks.

Additionally, a non-stationary time series does not have a consistent mean and/or variance over time. A review of the random walk line plot might suggest this to be the case.

We can confirm this using a statistical significance test, specifically the Augmented Dickey-Fuller test.

We can perform this test using the adfuller() function in the statsmodels library. The complete example is listed below.

from random import random
from statsmodels.tsa.stattools import adfuller
# generate random walk
seed(1)
random_walk = list()
random_walk.append(-1 if random() < 0.5 else 1)
for i in range(1, 1000):
	movement = -1 if random() < 0.5 else 1
	value = random_walk[i-1] + movement
	random_walk.append(value)
# statistical test
result = adfuller(random_walk)
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')
for key, value in result[4].items():

print('\t%s: %.3f' % (key, value))

The null hypothesis of the test is that the time series is non-stationary.

Running the example, we can see that the test statistic value was 0.341605. This is larger than all of the critical values at the 1%, 5%, and 10% confidence levels. Therefore, we can say that the time series does appear to be non-stationary with a low likelihood of the result being a statistical fluke.

ADF Statistic: 0.341605
p-value: 0.979175
Critical Values:
	5%: -2.864
	1%: -3.437

10%: -2.568

We can make the random walk stationary by taking the first difference.

That is replacing each observation as the difference between it and the previous value.

Given the way that this random walk was constructed, we would expect this to result in a time series of -1 and 1 values. This is exactly what we see.

The complete example is listed below.

from random import random
from matplotlib import pyplot
# create random walk
seed(1)
random_walk = list()
random_walk.append(-1 if random() < 0.5 else 1)
for i in range(1, 1000):
	movement = -1 if random() < 0.5 else 1
	value = random_walk[i-1] + movement
	random_walk.append(value)
# take difference
diff = list()
for i in range(1, len(random_walk)):
	value = random_walk[i] - random_walk[i - 1]
	diff.append(value)
# line plot
pyplot.plot(diff)
pyplot.show()

Your time series may be a random walk.

Some ways to check if your time series is a random walk are as follows:

The time series shows a strong temporal dependence that decays linearly or in a similar pattern.
The time series is non-stationary and making it stationary shows no obviously learnable structure in the data.
The persistence model provides the best source of reliable predictions.

This last point is key for time series forecasting. Baseline forecasts with the persistence model quickly flesh out whether you can do significantly better. If you can’t, you’re probably working with a random walk.

Many time series are random walks, particularly those of security prices over time.

The random walk hypothesis is a theory that stock market prices are a random walk and cannot be predicted.

Learn more about our help with Assignments: Statistics and Probability

Comments

No comments. Be the first!

Answer to Question #242489 in Statistics and Probability for tsa

Random Walk

Random Walk and Stationarity

Comments

Leave a comment

Related Questions