From time to time, mechanical investors have debated the question of the size and accuracy of the observed maximum drawdown of a screen. For example, Len Kogan commented in 2002 that *"Drawdown, along with -2 and -3sigma are better indicators [of risk] than the usually observed standard deviation and Sharpe Ratio volatility indicators. Drawdown, not volatility, is what we fear the most, second only to inadequate returns."*

In the ensuing discussion, Eric Mintz replied, *"The data we've seen so far supports the idea that the size of past drawdowns is a very poor predictor of future drawdowns. GSD is a vastly superior predictor of drawdowns."*

Elan Caspi illustrated his doubts about drawdowns by repeating an experiment 10 times. Each time, he simulated 150 monthly returns of a screen with CAGR = 42% and GSD = 35%, and calculated the maximum drawdown. He found considerable variation in the size of the maximum drawdown, and concluded from this that Eric Mintz is right: the maximum drawdown is too unreliable.

I would like to provide some theoretical support for this position, via an examination of a statistic that is closely related but much easier to analyze: the maximum monthly drop. I will give formulas that show exactly how unreliable an observed maximum monthly drop statistic really is. Along the way I will show how to estimate the most likely size of the maximum monthly drop, and its sampling variability.

There is an obscure corner of the theory of probability and statistics which describes the behavior of statistics such as the maximum monthly drop. This little niche is known as "Extreme Value Theory." To give you an idea of just how small it is, it occupies a mere 5 pages within the 2100 pages of Kendall's Advanced Theory of Statistics.

The size of the maximum drop that may be observed in a series of monthly returns of a stock screen is a random variable that has a probability distribution. Extreme Value Theory tells us how to find an approximation for this distribution, and how it depends on the number of months over which the screen is to be observed.

If we make the traditional (but frequently questioned) assumption that the logarithms of the monthly returns are independent and normally distributed, then the probability distribution of the smallest log return converges to the Gumbel distribution as the number of months goes to infinity. We shall use this asymptotic distribution to characterize the size of the maximum monthly drop. As an approximation, it works quite well.

Within a series of N monthly returns, the estimated size of the smallest standardized log return is:

(log( log( N ) ) + log( 4 pi )) / ( 2 sqrt( 2 log( N ) ) ) - sqrt( 2 log( N ) ).

To destandardize this value, multiply by log( GSD )/sqrt(12) and add log( CAGR )/12, where CAGR and GSD are expressed in "multiplier" form and all logarithms are natural (i.e. not base 10). Then exponentiate. The result is the estimated size of the smallest return.

For example, Elan simulated 150 monthly returns of a screen with CAGR = 1.42 and GSD = 1.35. Thus N = 150 and the estimated smallest standardized log return is -2.511. Destandardizing this value and exponentiating, we obtain a smallest return of 83%, i.e. a predicted maximum monthly drop of 17%. This prediction is the *most likely* value for the maximum monthly drop. In other words, it is the mode of the extreme value distribution, which is skewed, rather than its mean or median.
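The arithmetic of this example can be retraced with a short Python sketch. The function names `standardized_mode` and `destandardize` are illustrative only; the formulas are the ones given in the text, with natural logarithms throughout.

```python
import math

def standardized_mode(n_months):
    """Most likely smallest standardized log return among n_months returns."""
    root = math.sqrt(2.0 * math.log(n_months))
    correction = (math.log(math.log(n_months)) + math.log(4.0 * math.pi)) / (2.0 * root)
    return correction - root

def destandardize(z, cagr, gsd):
    """Turn a standardized log return into a monthly return multiplier.

    cagr and gsd are in multiplier form (e.g. 1.42 and 1.35).
    """
    monthly_mean = math.log(cagr) / 12.0
    monthly_sd = math.log(gsd) / math.sqrt(12.0)
    return math.exp(z * monthly_sd + monthly_mean)

z = standardized_mode(150)
smallest = destandardize(z, cagr=1.42, gsd=1.35)
print(f"standardized value: {z:.3f}")
print(f"smallest return:    {smallest:.3f}")
print(f"max monthly drop:   {1.0 - smallest:.1%}")
```

The standardized value comes out negative, as it must for the smallest of 150 returns, and the resulting multiplier is about 0.83.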

I have prepared an Excel workbook with which to calculate the most likely maximum monthly drop for any monthly screen. This workbook contains two sheets, one for the theoretical calculation, and another for a simulation which demonstrates the accuracy of the theory.
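The workbook itself is not reproduced here. As a rough stand-in for its simulation sheet, the following Python sketch draws independent, normally distributed monthly log returns and records the smallest return in each trial. The function name, the number of trials and the seed are my own choices for illustration, not taken from the workbook.

```python
import math
import random
import statistics

def simulate_smallest_returns(n_months, cagr, gsd, n_trials, seed=0):
    """Simulate n_trials histories and return the smallest monthly return of each.

    Assumes independent, normally distributed monthly log returns,
    with cagr and gsd given in multiplier form.
    """
    rng = random.Random(seed)
    monthly_mean = math.log(cagr) / 12.0
    monthly_sd = math.log(gsd) / math.sqrt(12.0)
    smallest = []
    for _ in range(n_trials):
        worst_log = min(rng.gauss(monthly_mean, monthly_sd) for _ in range(n_months))
        smallest.append(math.exp(worst_log))
    return smallest

results = simulate_smallest_returns(150, cagr=1.42, gsd=1.35, n_trials=2000, seed=1)
print(f"median smallest return: {statistics.median(results):.3f}")
print(f"best and worst trials:  {max(results):.3f}, {min(results):.3f}")
```

The spread between the best and worst trials gives a direct impression of how much an observed maximum monthly drop can vary from one 150-month history to the next.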

The asymptotic distribution of extreme values conveys a lot of information about the accuracy of an empirical maximum monthly drop. As the number of monthly observations increases, the variability of the size of the maximum drop decreases. If we measure this sampling variability with the standard deviation (SD), then its dependency upon N is given by the following formula. The sampling variability of the smallest standardized log return is

SD(N) = pi / sqrt( 12 log( N ) ).

In Elan's example, this variability is 0.405. As before, we need to destandardize and exponentiate to convert this result into useful units. The spreadsheet shows how to use this variability to calculate 1- and 2-sigma ranges around the predicted maximum monthly drop.
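The spreadsheet's exact construction of these ranges is not spelled out in this essay. One plausible reading, sketched below in Python, is to shift the most likely standardized value by plus or minus k standard deviations and then destandardize each end. The helper names are hypothetical, and because the extreme value distribution is skewed, such symmetric ranges are only a rough guide.

```python
import math

def standardized_mode(n_months):
    """Most likely smallest standardized log return among n_months returns."""
    root = math.sqrt(2.0 * math.log(n_months))
    return (math.log(math.log(n_months)) + math.log(4.0 * math.pi)) / (2.0 * root) - root

def standardized_sd(n_months):
    """Sampling SD of the smallest standardized log return."""
    return math.pi / math.sqrt(12.0 * math.log(n_months))

def drop_range(n_months, cagr, gsd, k):
    """Assumed construction: mode -/+ k SDs, destandardized into drops.

    Returns (smaller drop, larger drop) as fractions of portfolio value.
    """
    monthly_mean = math.log(cagr) / 12.0
    monthly_sd = math.log(gsd) / math.sqrt(12.0)
    mode = standardized_mode(n_months)
    sd = standardized_sd(n_months)
    mild = math.exp((mode + k * sd) * monthly_sd + monthly_mean)
    severe = math.exp((mode - k * sd) * monthly_sd + monthly_mean)
    return 1.0 - mild, 1.0 - severe

print(f"SD(150) = {standardized_sd(150):.3f}")
for k in (1, 2):
    low, high = drop_range(150, cagr=1.42, gsd=1.35, k=k)
    print(f"{k}-sigma range of the maximum monthly drop: {low:.1%} to {high:.1%}")
```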

As useful and interesting as these numeric quantities may be, the real message of the theory of extreme values is to be found in the dependency of the variability of the observed maximum monthly drop on the number of monthly observations. The formula for SD(N) clearly shows that this variability is inversely proportional to the square root of the logarithm of N. In sharp contrast, the variability of the GSD is inversely proportional to the square root of N itself. From this we conclude that as the sample size increases, the accuracy of an observed GSD becomes far superior to the accuracy of an observed maximum monthly drop.
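To make the contrast concrete, here is a small Python sketch that tabulates both variabilities in standardized units for a few sample sizes. For the volatility estimate it uses the usual large-sample approximation 1 / sqrt(2N) for the standard error of a sample standard deviation of normal data; that approximation and the function names are assumptions of the sketch, not part of the derivation above.

```python
import math

def sd_max_drop(n_months):
    """Sampling SD of the smallest standardized log return (extreme value formula)."""
    return math.pi / math.sqrt(12.0 * math.log(n_months))

def sd_volatility(n_months):
    """Approximate standard error of a sample SD, in units of the true SD.

    Large-sample approximation for normally distributed data.
    """
    return 1.0 / math.sqrt(2.0 * n_months)

print("months  SD(max drop)  SD(volatility)  ratio")
for n in (60, 150, 600, 2400):
    ratio = sd_max_drop(n) / sd_volatility(n)
    print(f"{n:6d}  {sd_max_drop(n):12.3f}  {sd_volatility(n):14.3f}  {ratio:5.1f}")
```

The ratio in the last column keeps growing with the number of months, which is the point of the comparison: extra data sharpens the volatility estimate much faster than it sharpens the observed extreme.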

For the purposes of estimating risk it is clearly more informative to look at the *predicted* maximum monthly drop, as calculated from the GSD and N, than to look at any empirical maximum monthly drop. This is because the predicted maximum is based on the GSD, a statistic that becomes increasingly more reliable than an empirical maximum as the sample size increases. Comparing the two variabilities given above, the advantage in reliability grows in proportion to the square root of N / log( N ).

The maximum drawdown statistic is much harder to analyze, because it is not a simple maximum. It is, in fact, a maximum built upon a running maximum: the largest shortfall of the portfolio value below its own running peak. Here is a precise definition:

- Let P(t) be the value of a portfolio at time t, where t is an integer in the range between 0 and T, inclusive.
- Let D(s) be the drawdown at time s, defined as D(s) = max{ P(0), ..., P(s) } - P(s), so that D(0) = 0.
- Then the maximum drawdown at time t is given by M(t) = max{ D(0), ..., D(t) }.
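The definition translates into a single pass over the portfolio values. In the Python sketch below the running peak is taken to include the current value, so that the drawdown at time 0 is zero; that convention, the function name and the sample values are assumptions made for illustration.

```python
def max_drawdown_path(values):
    """Return the list M(0), ..., M(T) of maximum drawdowns to date.

    values holds the portfolio values P(0), ..., P(T). Drawdowns are in the
    same units as the values (absolute amounts, not percentages).
    """
    path = []
    running_peak = values[0]
    worst_so_far = 0.0
    for value in values:
        running_peak = max(running_peak, value)        # max{ P(0), ..., P(s) }
        drawdown = running_peak - value                # D(s)
        worst_so_far = max(worst_so_far, drawdown)     # M(s)
        path.append(worst_so_far)
    return path

portfolio = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(max_drawdown_path(portfolio))
```

For these sample values the peak of 120 is followed by a low of 80, so the final maximum drawdown is 40, even though the portfolio ends at a new high.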

It makes good sense to think of "maximum drawdown" as the maximum of a running maximum, and this feature also explains why the concept is so elusive. Being a maximum, however, it still obeys Extreme Value Theory. Thus the same general conclusion holds: its variability becomes increasingly worse than the variability of the GSD measure of volatility. For this reason, it is better to judge risk from observed volatility than from observed maximum drawdown.

It is certainly reasonable to wonder what happens to all this theory if the fundamental assumption quoted above, to the effect that the logarithms of monthly returns are normally distributed, is violated. One could argue, for example, that Extreme Value Theory is sensitive to the size of the tails of the log return distribution, and that therefore a fat-tailed hypothesis will throw everything into doubt. Fortunately, one of the remarkable things about Extreme Value Theory is the robustness of the Gumbel distribution. In order to invalidate the above theory, the tails would have to be so fat as to be asymptotically polynomial rather than exponential in shape. This is quite a drastic condition, which I do not believe applies. If I am wrong then a different part of Extreme Value Theory should be used, with somewhat different formulas. Even so, my ultimate conclusion remains unaltered, because the sampling variability of the maximum monthly drop still depends on the logarithm of N.

This is a corrected and revised version of an essay that first appeared as a post to the Mechanical Investing board of The Motley Fool, on 29 August 2002. The original post was also reprinted by The Motley Fool as the post of the day.

Date of this revision: 30 October 2003.