by Loren Cobb
This essay is written for anyone in the Mechanical Investing community who would like to study the mathematics of investment volatility, in some depth, and for any other quantitatively-inclined investors who may stumble across it while browsing the web. Some familiarity with stochastic processes is required. All comments, corrections, and criticisms will be welcome! (Option investors please note: this essay does not discuss implied volatility at all. That is a different topic, for a different day.)
First, some preliminaries. Let Xt stand for the price of an investment at time t, with time measured in trading days from the initial purchase. If the investment is sold on day t > 0, then the return of the investment is Rt = Xt / X0. Investment returns are difficult to compare unless they have been adjusted to take into account the length of time over which they are held. In this essay we will use three distinct ways of performing this adjustment, which if not carefully distinguished and understood can lead to great confusion. Here are the three:
- CAGR, the compound annual growth rate: the constant factor by which the investment would have to grow each year to produce the observed return, i.e. CAGR = Rt^(1/t) with t measured in years.
- CCGR, the continuously compounded (or instantaneous) growth rate: CCGR = ln( CAGR ).
- DGR, the daily growth rate: the equivalent constant factor per trading day, DGR = CAGR^(1/252).
CAGR is the customary measure of return when comparing the growth rates of investments. When two potential investments are compared using CAGR, we are implicitly assuming that each CAGR can serve as a prediction of the investment's rate of return. Volatility, in contrast, is a measure of the uncertainty of the investment's rate of return. In principle, every investment return can be broken down into two parts: one part is the return we expected when we made the purchase, and the other is the difference between the observed and expected return. The observed return will differ from what we expected whenever random influences create volatility.
To be rigorous about volatility we need to specify the probabilistic structure of the process that causes fluctuations in investment prices. One way to do this is to assume that prices vary continuously in time. A single investment followed through time has a price graph that resembles Figure 1 below, a continuous trajectory so irregular that its first derivative almost never exists. In other words, it is constantly changing direction in response to tiny random fluctuations. The histogram of final prices is shown in color in the right-hand margin of the graph; the colors run from yellow for the bin with the highest count, through red and magenta, to blue for bins with counts of zero.
Despite the fact that the first derivative almost never exists, it is still possible to state how these tiny random fluctuations cause the price of the investment to change. For example, if the random fluctuations resemble Brownian motion, the microscopic motion of a small particle floating in a liquid, then the random increments will be normally distributed with an infinitesimal variance. This model was first advanced in 1900 by a young French doctoral student of economics named Louis Bachelier. His idea was so radical (and his reputation so small) that it was ignored completely until Albert Einstein independently rediscovered it a few years later.
Expressed in modern notation and brought up to date, here is an improved version of Louis Bachelier's model. Let δXt refer to the change in price Xt over a very small but finite interval δt. As noted above, this change can be broken conceptually into two parts, one predicted and the other unexpected. If the predicted rate of change over δt is ρ, then the size of the predicted change will be ρXtδt. Let us now suppose that the variance of the unexpected part of the price change is proportional to Xt²δt. In other words, the increments of Xt are mutually independent and normally distributed:
- δXt | (Xt = x) ~ N[ ρxδt, (σx)²δt ].
Pure Brownian motion is a similar but simpler process, with mean zero and variance δt. This simpler idea is closer to what Louis Bachelier actually proposed for the movement of stock prices. For any finite δt, the increments of Brownian motion are distributed normally:
- δBt ~ N[ 0, δt ].
In the limit as δt goes to zero, the relationship between the investment price process and Brownian motion can be neatly and conveniently expressed as a stochastic differential equation (SDE):
dXt = ρXt dt + σXt dBt        (Geometric Brownian Motion)
Note how the increment in price in this formula has been expressed as the sum of two parts. The first term is the expected change, and the second term is the random change. The mean of the random change is zero, and its standard deviation is the infinitesimal quantity σXt√dt. The parameters ρ and σ in this model are the fundamental measures of growth and volatility for the investment price process.
If there is no volatility whatsoever, i.e. σ = 0, then the above model simplifies to the ordinary differential equation dX/dt = ρX, which has the solution Xt = X0 exp( ρt ). From this it is clear that the parameter ρ in the SDE is the continuously compounded growth rate (CCGR), also known as the instantaneous growth rate. If the unit of time is the year, then we can calculate the CCGR from the CAGR (compound annual growth rate) as CCGR = ln( CAGR ). If the unit of time is the day, and we assume 252 trading days in each year, then the relationship is CCGR = ln( CAGR ) / 252.
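Here is a minimal sketch of these conversions in Python; the function name and the example CAGR of 1.42 are mine, chosen only for illustration.

```python
import math

def ccgr_from_cagr(cagr, periods_per_year=1):
    """Continuously compounded growth rate per period, from a CAGR given as a ratio (e.g. 1.42)."""
    return math.log(cagr) / periods_per_year

print(ccgr_from_cagr(1.42))        # 0.3507 per year
print(ccgr_from_cagr(1.42, 252))   # 0.00139 per trading day
```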
The volatility displayed in Figure 1 is highly asymmetrical. The pale red envelope encloses the region in which the price trajectory is most likely to move; this region clearly expands much more rapidly in the direction of higher prices than it does toward lower prices. This region is known as the "2-sigma" envelope.
If we transform all prices by taking their logarithms and then replot the graph, then three nice things happen: the predicted trajectory becomes a straight line, the envelope of likely fluctuations becomes symmetrical about that line, and the distribution of the log price at any fixed time becomes normal. This is shown in Figure 2. Notice that the 2-sigma envelope now looks like a quadratic curve laid on its side and skewed linearly upwards, so that its central axis is the straight line of predicted log prices for the investment.
In terms of the Geometric Brownian Motion model for investment volatility, we need to apply a change-of-variable formula to the model, using the transformation Yt = ln Xt. Changing variables in an SDE is not quite as easy as it is with ordinary differential equations, because any nonlinear transformation alters not only the variable but also the shape and moments of the probability distribution. Itô's change-of-variable formula, when applied to the logarithmic transformation of price, yields this result:

dYt = ( ρ – ½σ² ) dt + σ dBt
Notice that this SDE implies that Yt – Y0 ~ N[ (ρ – ½σ²)t, σ²t ]. In other words, the log return at time t, i.e. ln( Rt ), is distributed normally with mean (ρ – ½σ²)t and variance σ²t. This is clearly visible in Figure 2: the central red line is the graph of Y0 + (ρ – ½σ²)t, the histogram on the right is the normal distribution at time t = 252, and the standard deviation of Yt increases like σ√t.
Because log returns are normally distributed, we know immediately that the ordinary return Rt has the log-normal distribution. This implies that the median of Rt is the exponential of the mean of ln( Rt ), and therefore the central pale red line in Figure 1 traces out the trajectory of the median return for the investment. In particular,

median[ Rt ] = exp[ (ρ – ½σ²)t ].
So the effect of increasing volatility is to lower the median rate of return that a population of independent investors will receive. When comparing investment vehicles with differing volatilities, it would be preferable to compare median annualized growth rates, not raw CAGRs.
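As a quick illustration of this formula, here is a sketch in Python (the helper function is mine) for an investment with CAGR% = 100 and GSD% = 100, the parameters used later for Figure 1:

```python
import math

def median_annual_return(cagr, gsd):
    """Median one-year return under Geometric Brownian Motion,
    with CAGR and GSD given as ratios (e.g. 2.0 for 100%)."""
    rho   = math.log(cagr)    # continuously compounded growth rate
    sigma = math.log(gsd)     # annualized volatility
    return math.exp(rho - 0.5 * sigma**2)

# An investment with CAGR% = 100 and GSD% = 100:
print(median_annual_return(2.0, 2.0))   # about 1.57: the median investor gains 57%, not 100%
```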
Just as with measures of growth, there are several different ways of measuring volatility which, if not carefully distinguished and understood, can lead to great confusion. Suppose we start with a large number of daily observations of the price of the investment, with time measured in years so that δt = 1/252. To help with the notation, let D[ X ] stand for the standard deviation of a random variable X, i.e.

D[ X ] = √( E[ ( X – E[X] )² ] ) = √Var[ X ].
Then the daily volatility is

D[ δYt ] = σ √δt = σ / √252,

where δYt = ln( Xt+δt ) – ln( Xt ) is the daily log return. This relationship suggests that we can empirically estimate the volatility σ as the standard deviation of a series of daily observations of δYt, multiplied by √252.
If data are available only once per month, i.e. 12 times per year, then the standard deviation of the monthly observations will be σ/√12. Similarly, if the data are only annual, then the standard deviation of the annual observations will be simply σ.
If time is measured in trading days, rather than years, and observations are still made daily, then δt = 1 day, and the daily volatility is

D[ δYt ] = σ √δt = σ,

where this σ is now expressed in per-day units, and is smaller than the annual figure by a factor of √252.
This demonstrates why it is so important to pay close attention to the units in which time is measured, as well as the number of observations per unit of time.
It has become customary in the Mechanical Investing community to measure volatility with a statistic known as the Geometric Standard Deviation (GSD), which is defined as the exponential of the annual volatility:

GSD = exp[ σ ], with time measured in years.
By convention, CAGR and GSD figures are reported in "percentage" terms, where the following relationships apply:

- CAGR% = 100 ( CAGR – 1 ),
- GSD% = 100 ( GSD – 1 ).
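A short sketch of this estimation recipe in Python, assuming NumPy is available; the function names are mine, and the commented-out usage shows where actual price data would be supplied.

```python
import numpy as np

def annualized_volatility(prices, obs_per_year=252):
    """Annualized volatility: standard deviation of the log returns of the
    observed prices, scaled up by the square root of the observation frequency."""
    log_returns = np.diff(np.log(prices))
    return log_returns.std(ddof=1) * np.sqrt(obs_per_year)

def gsd_percent(sigma):
    """GSD% = 100 * (exp(sigma) - 1), where sigma is the annualized volatility."""
    return 100.0 * (np.exp(sigma) - 1.0)

# Usage (hypothetical):
# daily_prices = np.array([...])      # one price per trading day
# print(gsd_percent(annualized_volatility(daily_prices, obs_per_year=252)))
```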
To summarize, when setting out to measure volatility or growth, three decisions need to be made in advance: (a) the units in which time is measured, (b) the number of observations per time unit, and (c) whether the result is to be given in instantaneous or annualized form. Confusion can be avoided only when all three decisions are made with total clarity.
The colored histogram shown in Figure 1 is based on 25,000 independent runs of a simulated investment whose growth rate and volatility are both very high (CAGR% = 100, GSD% = 100). The effect of the high volatility is very clear in this figure: some investors may easily see a return in excess of 600%, while others may lose over 60%. Consequently, their initial uncertainty as to the outcome of their investments is very large indeed.
The simulation itself was based on the stochastic difference equation

Xt+δt = Xt + ρ Xt δt + σ Xt √δt Zt,

where the Zt are independent standard normal random variables, ρ = ln( 2 ) / 252 = 0.002751, σ = ln( 2 ) / √252 = 0.043664, and δt = 1 day. Each simulation was run for 252 days from a starting value of $10. The true SDE cannot be simulated directly, because its time step dt is infinitesimal; the stochastic difference equation above works very well in its place, but only because the chosen time step δt = 1 day is small compared to the duration of the simulation, 252 trading days.
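Here is a minimal sketch of that daily simulation, assuming NumPy; the seed and variable names are mine. It follows the stochastic difference equation above with the same ρ, σ, starting price, and horizon.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

rho   = np.log(2) / 252              # CCGR per trading day, 0.002751
sigma = np.log(2) / np.sqrt(252)     # volatility per trading day, 0.043664
dt    = 1.0                          # time step: one trading day
days, runs = 252, 25_000

X = np.full(runs, 10.0)              # every run starts at $10
for _ in range(days):
    Z = rng.standard_normal(runs)
    X = X + rho * X * dt + sigma * X * np.sqrt(dt) * Z   # the stochastic difference equation

print(np.median(X))   # near 10 * exp((rho - sigma**2 / 2) * 252), roughly 15.7
```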
When the time step is a significant fraction of the total duration, then another simulation method must be used. For example, suppose that we need to simulate an investment growth process over 10 years, with a time step of δt = 1 year. In this case we simulate the logarithm of the process, Yt = ln Xt, and use

Yt+δt = Yt + ( ρ – ½σ² ) δt + σ √δt Zt,

where the Zt are again independent standard normal random variables, and the price is recovered as Xt = exp( Yt ).
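A sketch of that log-space recursion, again assuming NumPy; the numerical values of ρ and σ below are placeholders of my own, not taken from the essay. Because the increments of Yt are exactly normal, this recursion is exact at any step size, unlike the price-level difference equation above.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

rho, sigma = 0.10, 0.25      # placeholder annual CCGR and volatility (assumed values)
dt, years  = 1.0, 10         # one step per year, for ten years

Y = np.log(10.0)             # log of the starting price
for _ in range(years):
    Z = rng.standard_normal()
    Y += (rho - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z

print(np.exp(Y))             # one simulated 10-year ending price
```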
In either case, the simulation requires the generation of a series of high-quality, independent, normally-distributed random numbers. The typical built-in random number generator supplied with most computer languages is not of high enough quality to support the simulation of stochastic differential equations. Experience with the simulations performed to test the results in this essay suggests that the observed volatility will be substantially less than theory predicts when a deficient random number generator is used. One solution for this problem is to hand-code a custom random number generator using, for example, the portable Ran1 algorithm described in Numerical Recipes in C by William Press et al., published in 1988 by Cambridge University Press. The Ran1 algorithm uses one linear congruential generator to produce the high-order bits of the result, a second to produce the low-order bits, and a third to shuffle the sequence of output numbers to remove periodicities.
When an empirical statistic such as the CAGR is measured on a series of independent samples of data, we observe a distribution of values for the statistic. To treat this sampling variation mathematically, we consider each datum to be a random variable, and the estimator to be a transformation of those random variables. Thus the estimator itself is a random variable, and its theoretical variance can be used to estimate the sampling variation of the statistic in the real world.
In the case of the growth rate, the estimator is constructed from an average of normally-distributed random variables. As one might expect, the sampling distribution of a growth rate ρ depends strongly on the volatility σ of the growth process. Here is a formula for the limits of the 95% confidence interval for the estimated CCGR, denoted here by r:

r – tα/2,ν σ / √ν  ≤  ρ  ≤  r + tα/2,ν σ / √ν,  with α = 0.05.
The notation tα,ν refers to the point x on Student's t distribution with ν degrees of freedom, such that P{ t > x } = α. When calculating r from N observations, the degrees of freedom are just ν = N – 1. When ν > 30, the approximation tα,ν ≈ Zα can be used, where Zα refers to the point x on the standard normal distribution such that P{ Z > x } = α. For the 95% confidence interval, Z0.025 = 1.96.
Example:
Over the 17 year period 1986 through 2002, the RRS189 monthly 5-stock screen had an observed CAGR% of 42 and GSD%(D) of 55. What is the 95% confidence interval for the CAGR%?
First we calculate an estimate r for the CCGR, namely r = ln( 1.42 ) = 0.3507. The degrees of freedom are ν = (17)(12) – 1 = 203. This is comfortably higher than 30, allowing us to use the normal approximation Z0.025 = 1.96 in place of the t-statistic. Our estimate for the instantaneous volatility is σ = ln( 1.55 ) = 0.4383, with time measured in years. Thus the limits of the 95% confidence interval are 0.3507 ± (1.96)(0.4383)/√203 = 0.3507 ± 0.0603. Converting each limit to a CAGR, using the relationship CAGR = exp[ CCGR ], we get:
- Lower limit for CAGR = exp[ 0.3507 – 0.0603 ] = 1.34,
- Upper limit for CAGR = exp[ 0.3507 + 0.0603 ] = 1.51.
We conclude that the 95% confidence interval for the CAGR% of RRS189 runs between 34 and 51, with 42 as the best point estimate.
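The arithmetic of this example, coded as a short sketch in Python (standard library only); the variable names are mine.

```python
import math

cagr, gsd = 1.42, 1.55                  # observed CAGR and GSD for the screen, as ratios
n_obs = 17 * 12                         # monthly observations over 17 years
nu    = n_obs - 1                       # degrees of freedom, 203
z     = 1.96                            # normal approximation to the t-point, since nu > 30

r     = math.log(cagr)                  # estimated CCGR, 0.3507
sigma = math.log(gsd)                   # estimated annual volatility, 0.4383
half  = z * sigma / math.sqrt(nu)       # half-width of the interval, 0.0603

print(math.exp(r - half), math.exp(r + half))   # about 1.34 and 1.51
```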
It may be argued by some that an observed CAGR is not random at all, because it represents reality, and therefore it is inappropriate to calculate a confidence interval around the observed CAGR. While this is superficially true, it misses the point of carrying out this calculation. We are not actually interested in what the true rate of return might have been for that particular investment, because we already know exactly what it was. Instead, we want the answer to a different question: "How much volatility-induced variation should we expect in point estimates of the CAGR, across an ensemble of independent similar investments with the same CAGR and GSD?" The answer to that question is what the confidence interval provides.
It is important to remember that sampling variation is just one source of the variation that can occur in the estimation of a statistic. Systematic errors such as might be caused by changes in the way the variable is measured, or structural changes in the process itself, can contribute large amounts of additional variability. Even typographical errors in data transcription, whenever they occur, add to the observed variability. Therefore, sampling variation must be seen as a lower bound for the total variability of any statistic.
In the case of volatility, the estimator is constructed from a sum of squared normally-distributed random variables. For this reason, the sampling distribution of σ² is closely related to the Chi-square distribution. Here is a formula for the limits of the 95% confidence interval for the estimated variance s² around the unknown true variance σ²:

( ν / A ) s²  ≤  σ²  ≤  ( ν / B ) s²,
where s² is the estimated variance, σ² is the unknown true variance, A = χ²( α/2, ν ), and B = χ²( 1 – α/2, ν ). The notation χ²( α, ν ) refers to the point x on the Chi-square distribution with ν degrees of freedom, such that P{ χ² > x } = α. When calculating s² from T observations, the degrees of freedom are just ν = T – 1. Here is a summary table of some useful Chi-square multipliers for various lengths of backtest, with α = 5%:
| Years of Backtest | (ν/A) for Daily Data | (ν/B) for Daily Data | (ν/A) for Monthly Data | (ν/B) for Monthly Data | (ν/A) for Annual Data | (ν/B) for Annual Data |
|---|---|---|---|---|---|---|
| 5 | 0.926 | 1.083 | 0.72 | 1.49 | 0.36 | 8.26 |
| 10 | 0.947 | 1.058 | 0.79 | 1.31 | 0.47 | 3.33 |
| 15 | 0.956 | 1.047 | 0.82 | 1.24 | 0.54 | 2.49 |
| 20 | 0.962 | 1.040 | 0.84 | 1.21 | 0.58 | 2.13 |
| 25 | 0.966 | 1.036 | 0.86 | 1.18 | 0.61 | 1.94 |
| 30 | 0.969 | 1.033 | 0.87 | 1.16 | 0.63 | 1.81 |
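The multipliers in the table can be reproduced with a few lines of Python, assuming SciPy is available for the Chi-square quantile function:

```python
from scipy.stats import chi2

alpha = 0.05
for years in (5, 10, 15, 20, 25, 30):
    for obs_per_year in (252, 12, 1):          # daily, monthly, annual data
        nu = years * obs_per_year - 1          # degrees of freedom
        A  = chi2.ppf(1 - alpha / 2, nu)       # upper point: P{ chi2 > A } = alpha/2
        B  = chi2.ppf(alpha / 2, nu)           # lower point: P{ chi2 > B } = 1 - alpha/2
        print(years, obs_per_year, round(nu / A, 3), round(nu / B, 3))
```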
Example 1:
Suppose that a 10-year backtest of an annual investment screen has been performed, with time measured in years and one observation per year. The observed standard deviation of the logarithms of the ten annual returns is s = 0.262. The GSD is exp( 0.262 ) = 1.30, so the observed GSD% is 30. What are the limits of the 95% confidence interval around this estimate of 30?
First, we calculate s² = (0.262)² = 0.06864. The lower limit for the 95% confidence interval around s² is (0.47)(0.06864) = 0.03226, and the upper limit is (3.33)(0.06864) = 0.2286. Now convert each limit to a GSD, using the relationship GSD = exp[ σ ], as follows:
- Lower limit for GSD = exp[ √0.03226 ] = 1.20,
- Upper limit for GSD = exp[ √0.2286 ] = 1.61.
We conclude that the 95% confidence interval for GSD% runs between 20 and 61, with 30 as the best point estimate.
Example 2:
Suppose that a 5-year backtest of a monthly investment screen has been performed, with time measured in years and 12 observations per year. The observed standard deviation of the logarithms of the 60 monthly returns is 0.1265. The annualized volatility is s = (0.1265)√12 = 0.4382. The GSD is exp( 0.4382 ) = 1.55, so the observed GSD% is 55. What are the limits of the 95% confidence interval around this estimate of 55?
First, we calculate s² = (0.4382)² = 0.1920. The lower limit for the 95% confidence interval around s² is (0.72)(0.1920) = 0.138, and the upper limit is (1.49)(0.1920) = 0.286. Now convert each limit to a GSD, using the relationship GSD = exp[ σ ], as follows:
- Lower limit for GSD = exp[ √0.138 ] = 1.45,
- Upper limit for GSD = exp[ √0.286 ] = 1.71.
We conclude that the 95% confidence interval for GSD% runs between 45 and 71, with 55 as the best point estimate.
Example 3:
As of mid-May 2002, General Electric had a 30-year total return of 23.89, including dividends and adjusting for splits. Its CAGR was 1.11158, its DGR was 1.0004194, and its CCGR was 0.10578 (time measured in years). The observed standard deviation of the logarithms of the 7569 daily returns was 0.016238. With time measured in years, its estimated instantaneous volatility is s = (0.016238)√252 = 0.25777. What are the limits of the 95% confidence interval around this estimate for σ?
First, we calculate s² = (0.25777)² = 0.066446. The lower limit for the 95% confidence interval around s² is (0.969)(0.066446) = 0.0644, and the upper limit is (1.033)(0.066446) = 0.0686. Taking square roots, we find:
- Lower limit for σ = √0.0644 = 0.254,
- Upper limit for σ = √0.0686 = 0.262.
We conclude that the 95% confidence interval for σ runs between 0.254 and 0.262, with 0.258 as the best point estimate.
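The same interval can be computed directly from the Chi-square quantiles, without the table, in a short sketch (assuming SciPy):

```python
from scipy.stats import chi2
import math

s_daily = 0.016238                        # standard deviation of the 7569 daily log returns
nu      = 7569 - 1                        # degrees of freedom
s2      = (s_daily * math.sqrt(252))**2   # annualized variance estimate, about 0.0664

A = chi2.ppf(0.975, nu)                   # P{ chi2 > A } = 0.025
B = chi2.ppf(0.025, nu)                   # P{ chi2 > B } = 0.975

print(math.sqrt(nu * s2 / A), math.sqrt(nu * s2 / B))   # roughly 0.254 and 0.262
```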
The first draft of this essay was completed on 20 May 2002. It was last revised on 9 July 2006.