John Norstad

j-norstad@northwestern.edu

December 1, 2005

Revised: November 3, 2011


We develop a simple coin tossing game to explore the notions of reversion to mean ("RTM"), forecasting market returns, and market timing strategies. We mention recent research that casts doubts on both RTM and the ability to forecast returns.

Mean Reversion

Forecasting and Timing

Cheating

Common Sense and Where It Goes Wrong

Conclusion

Homework Problems

Imagine a game where we make a long sequence of coin tosses. Make a graph of the tosses. Each time we get heads, go up one unit on the graph. Each time we get tails, go down one unit.

If the coin is fair, so that the probability of heads or tails on any given toss is exactly 50/50 no matter what happened on previous tosses, we get a pure random walk with no reversion to mean ("RTM").

The graph will go up and down. It may or may not cross the horizontal axis one or more times. It will wander around randomly. Every once in a while we'll get a long sequence of heads by chance, and the graph will go way up (bull market in coin flips - maybe we'll even see a coin flip bubble or two!). Every once in a while we'll get a long sequence of tails by chance, and the graph will go way down (bear market or crash). In long sequences of coin flips, these kinds of runs of heads or tails are much more common than most people think.

The **expected** ending value of the graph is 0. This means that, before the game starts, if we had to make a prediction of what the ending value will be, our best guess ("most unbiased estimate") would be 0. If the coin is fair, we "expect" that about half the time it will come up heads, and about half the time it will come up tails. If we play the game many times, we expect that about half the time the ending value will be above 0, and about half the time the ending value will be below 0.

The **actual** ending value after any single play of the game will most likely be some other number, above 0 or below 0. In fact, if you run simulations on a computer, or work out the probability equations, you learn that the ending value is often way above 0, or way below 0, and the more coin flips you make, the farther away from 0 you tend to get. The expected (average) ending value stays at 0 no matter how many coin flips you make, but the expected (average) distance away from 0 increases as you make more coin flips. Some people find this surprising, but it's true.

It's important to be clear about this difference between the concept of the **expected** outcome and the **actual** outcome. The **expected** outcome is 0, but we also **expect** that the **actual** outcome will be different from the **expected** outcome, and the more flips we make, the larger we expect this difference to be.
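The distinction can be checked with a small simulation. The sketch below is only an illustration in Python, not the Excel experiment described in this article; the function names, seed, and number of games are arbitrary choices. It plays the fair-coin game many times at several game lengths and reports the average ending value and the average distance of the ending value from 0. For comparison it also prints sqrt(2n/pi), the usual large-n approximation for the average distance of a fair random walk from its starting point after n steps.

```python
import math
import random


def ending_value(num_flips, rng):
    """Play one game: +1 per head, -1 per tail. Return the final graph height."""
    # Each of the num_flips random bits is one fair toss (1 = heads).
    heads = bin(rng.getrandbits(num_flips)).count("1")
    return heads - (num_flips - heads)


def summarize(num_flips, num_games=2000, seed=0):
    """Return (average ending value, average distance from 0) over many games."""
    rng = random.Random(seed)
    endings = [ending_value(num_flips, rng) for _ in range(num_games)]
    mean_ending = sum(endings) / num_games
    mean_distance = sum(abs(e) for e in endings) / num_games
    return mean_ending, mean_distance


if __name__ == "__main__":
    for n in (100, 400, 1600):
        mean_ending, mean_distance = summarize(n)
        approx = math.sqrt(2 * n / math.pi)
        print(f"{n:5d} flips: mean ending {mean_ending:6.2f}, "
              f"mean distance {mean_distance:6.2f}, approx {approx:6.2f}")
```

The average ending value should stay near 0 at every game length, while the average distance should roughly double each time the number of flips is quadrupled.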

Here's a sample graph of one thousand random coin flips generated using Microsoft Excel:

Note that the ending value 24 is well above the expected ending value of 0. In this simulation, by chance, 512 of the flips came up heads, and 488 of them came up tails, which is 24 more heads than tails. Also note that large up and down swings are quite common in the graph. We can even see a clear "bubble" early on in the simulation. There's a major "bull" market in heads where the graph rises quickly from below 0 to about 24, followed by a "crash" where the graph value quickly declines back down to about 9. Then the market appears to move "sideways" for a long time, followed by another major "bull" market right before the end of the simulation.

If you repeat this simulation many times in Excel, you see that about half the time the graph ends up below 0, and about half the time it ends up above 0, but it almost never ends up at exactly 0. In fact, more often than not, the ending value is quite far away from 0. In most of the simulations, it is easy to see patterns that are very similar to what we call bull markets, bear markets, crashes, panics, and bubbles.
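The same repetition can be done outside Excel. The following is a minimal Python sketch under my own assumptions (2,000 games of 1,000 flips, an arbitrary seed, and hypothetical function names). It tallies how often the game ends above 0, below 0, and exactly at 0, and it also computes the exact probability of ending at exactly 0 from the binomial distribution, which for 1,000 flips is roughly 2.5%.

```python
import math
import random


def tally_endings(num_flips=1000, num_games=2000, seed=1):
    """Return the fractions of games ending above 0, below 0, and exactly at 0."""
    rng = random.Random(seed)
    above = below = at_zero = 0
    for _game in range(num_games):
        # Count heads among num_flips fair tosses, then convert to a graph height.
        heads = bin(rng.getrandbits(num_flips)).count("1")
        ending = heads - (num_flips - heads)
        if ending > 0:
            above += 1
        elif ending < 0:
            below += 1
        else:
            at_zero += 1
    return above / num_games, below / num_games, at_zero / num_games


def probability_of_exact_zero(num_flips):
    """Exact chance that a fair-coin game of num_flips tosses ends at 0."""
    if num_flips % 2:
        return 0.0  # an odd number of tosses can never balance
    return math.comb(num_flips, num_flips // 2) / 2 ** num_flips
```

Calling `tally_endings()` and `probability_of_exact_zero(1000)` side by side gives a quick check that the simulated share of exact-zero endings is in line with the exact figure.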

Our simple coin flipping game is certainly not the same as stock investing, but it is similar to investing. We have made several simplifications to make it easier to think about the issues.

The first simplification is that the coin flipping game has an expected outcome of 0. Stock investing has a positive expected outcome. For the sake of an example, let's use the common estimate of 10% for expected yearly stock returns. In the coin flipping game, at each flip we have a 50/50 chance of being above or below the expected value of 0. In our stock investing example, at the end of each year we have a 50/50 chance of getting a return above or below the expected value of 10%. With stock investing, the graphs have a distinct upward trend or "drift," and the expected value after many years of investing is well above 0. It is this positive expected return for stocks that makes them an attractive investment, as opposed to the zero expected return in the coin flipping game, which makes it an unattractive investment. The large expected return for stocks is the "premium" investors demand as compensation for the risk they undertake.

As in the coin flipping game, it is important to keep in mind the distinction between the **expected** outcome of stock investing (10% yearly in our example) and the **actual** outcome. The actual outcome is rarely equal to the expected outcome. As with coin flipping, the longer we invest the farther away we can expect to end up from the expected outcome, as measured by the difference between our portfolio's expected dollar ending value and its actual dollar ending value.
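To make the stock version of the game concrete, here is a rough Python sketch. The 10% expected return comes from the example above, but everything else is an assumption made purely for illustration: each year's return is taken to be either 10% plus 15 points or 10% minus 15 points with equal chance, and the names `SWING`, `ending_wealth`, and `dollar_gap` are hypothetical. The sketch grows $1 over several horizons and measures the average dollar gap between the actual and the expected ending value.

```python
import random

EXPECTED_RETURN = 0.10  # the example figure used in this article
SWING = 0.15            # hypothetical: each year is 10% plus or minus 15 points


def ending_wealth(num_years, rng):
    """Grow $1 for num_years; each year lands above or below 10% with equal chance."""
    wealth = 1.0
    for _year in range(num_years):
        yearly = EXPECTED_RETURN + (SWING if rng.random() < 0.5 else -SWING)
        wealth *= 1.0 + yearly
    return wealth


def dollar_gap(num_years, num_games=4000, seed=2):
    """Return (expected ending wealth, average simulated wealth, average absolute gap)."""
    rng = random.Random(seed)
    expected = (1.0 + EXPECTED_RETURN) ** num_years
    results = [ending_wealth(num_years, rng) for _game in range(num_games)]
    average = sum(results) / num_games
    gap = sum(abs(w - expected) for w in results) / num_games
    return expected, average, gap
```

Comparing `dollar_gap(5)`, `dollar_gap(20)`, and `dollar_gap(40)` should show the average gap growing with the horizon, even though the simulated average stays close to the expected value.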

Another simplification is that the coin flipping game has discrete outcomes. There are only two possibilities on each flip - heads or tails. With stock investing, there is a continuous range of many possible outcomes each year.

A final difference, and a subtle one, is that in investing returns compound over time. When we do the math, we have to worry about the difference between arithmetic means and geometric means (average and "annualized" returns), and we have to use logarithms and other kinds of algebraic manipulations which make the equations quite complicated. Fortunately, for the purposes of this article, we do not need to worry about these complexities, and I have taken the liberty of being slightly imprecise about these issues, without in any way having these liberties affect the reasoning about the issues being discussed or the conclusions I will reach.
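For readers who want to see the difference between the two kinds of mean, here is a small Python sketch. The two-year return series is hypothetical and chosen only because the arithmetic is easy to follow; the function names are mine.

```python
import math


def average_return(yearly_returns):
    """Arithmetic mean of the yearly returns."""
    return sum(yearly_returns) / len(yearly_returns)


def annualized_return(yearly_returns):
    """Geometric mean: the constant yearly return that gives the same ending value."""
    growth = math.prod(1.0 + r for r in yearly_returns)
    return growth ** (1.0 / len(yearly_returns)) - 1.0


if __name__ == "__main__":
    returns = [0.50, -0.50]  # hypothetical: up 50% one year, down 50% the next
    print(average_return(returns))     # the simple average of the two years
    print(annualized_return(returns))  # negative, because $1 ends at $0.75
```

The average return in this example is zero, yet the investor has lost a quarter of the starting money, which is why the annualized figure is the one that describes what actually happened to the portfolio.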

Have we oversimplified investing by comparing it to our coin-flipping game? Are stock returns really just a random walk? This is the question I will try to examine. I will start by trying to get a solid understanding of what the question means, and I will argue that it is possible to try to answer the question using empirical research that examines the historical evidence. I will discuss two recent academic studies which seem to indicate that stock investing is much more like our coin-flipping game than most people think.

Investopedia defines mean reversion as follows:

The mean reversion strategy is based on the mathematical premise that all prices will eventually move back towards the mean or average return. Thus, if a stock is underperforming, its price will move towards its average value when the market rebounds.

As an example of mean reversion, suppose that our coin has a memory. Whenever we've gotten more heads than tails, so the graph is above the horizontal axis, the probability of getting another head is lower - say the odds in this case are 40/60 heads/tails. Conversely, whenever we've gotten more tails than heads so far, so the graph is below the horizontal axis, the probability of getting another tail is lower - say the odds in this case are 60/40 heads/tails. This is an example of RTM. (In statistics, the "stationary processes" used to model RTM are more complicated than this. I've simplified and distorted the math to make it easier to think about.)

With RTM, the expected value (0 in this example) has a tendency to "pull" the graph back towards itself (back towards the horizontal axis in our example). We say that over time the graph values tend to "revert" to their mean value of 0.

Without RTM, there is no such "pulling" effect. The graph may at some point be 100 units above or below the horizontal axis, but the next flip still has 50/50 odds.

With RTM, the ending value of the graph after a large number of flips tends to be closer to 0 than without RTM, even though we still have the situation that the ending value is rarely exactly equal to 0. (I hope everyone sees why this is true.) Thus there is less volatility of the ending value over long runs of coin flips. If you are playing a game with someone where you get paid dollars for ending values above 0, but you have to pay dollars for ending values below 0, the game is less risky for you over the long run if there is RTM.
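A simulation makes the comparison easy to see. The Python sketch below implements the coin with a memory from the example above (40/60 odds above the axis, 60/40 below) next to a fair coin, and measures the spread of the ending values. Treating the coin as fair when the graph sits exactly on the axis is my assumption, since the example does not say; the `pull` parameter and the function names are also hypothetical.

```python
import random
import statistics


def heads_probability(height, pull):
    """Chance of heads given the current graph height; pull=0.0 is a fair coin."""
    if height > 0:
        return 0.5 - pull   # above the axis: heads less likely
    if height < 0:
        return 0.5 + pull   # below the axis: heads more likely
    return 0.5              # on the axis: assumed fair


def ending_value(num_flips, pull, rng):
    """Play one game and return the final graph height."""
    height = 0
    for _flip in range(num_flips):
        height += 1 if rng.random() < heads_probability(height, pull) else -1
    return height


def ending_spread(num_flips, pull, num_games=500, seed=3):
    """Standard deviation of the ending value across many games."""
    rng = random.Random(seed)
    endings = [ending_value(num_flips, pull, rng) for _game in range(num_games)]
    return statistics.pstdev(endings)
```

Comparing `ending_spread(1000, 0.0)` with `ending_spread(1000, 0.1)` should show a much smaller spread for the coin with a memory.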

Similarly, with investing, if stock returns exhibit RTM, then long sequences of above average returns are more likely to be followed by below average returns, and vice-versa. The average stock return (e.g., 10% yearly) has a "pulling" effect, and we say that stock returns tend to "revert" to their mean value of 10%. If this RTM effect is in fact true, then the long run risks of stock investing are lower than they would be without RTM.

Suppose we have a long historical record of tosses of some coin. How might we check the data to see whether or not the coin had RTM? What we could do is check the standard deviations of the ending values over long sequences of flips. With RTM the standard deviations should be lower than without RTM. This is in fact the standard statistical test for RTM, and we can use the same simple test to check for evidence of RTM in stock returns in the historical data.
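Here is one way such a check might look in Python. This is a simplified stand-in of my own, not the procedure used in any of the studies discussed in this article. It cuts a single long record into non-overlapping stretches of a given horizon, computes the standard deviation of the change over each stretch, and divides by what a pure random walk would predict (the single-toss standard deviation times the square root of the horizon). A ratio near 1 is consistent with a random walk; a ratio well below 1 suggests RTM. The record generator and its `pull` parameter are hypothetical helpers for producing test data.

```python
import math
import random
import statistics


def coin_record(num_flips, pull, seed):
    """One long record of tosses as +1/-1 steps; pull=0.0 is fair, pull=0.1 is the 40/60 coin."""
    rng = random.Random(seed)
    height, steps = 0, []
    for _flip in range(num_flips):
        if height > 0:
            p_heads = 0.5 - pull
        elif height < 0:
            p_heads = 0.5 + pull
        else:
            p_heads = 0.5  # on the axis: assumed fair
        step = 1 if rng.random() < p_heads else -1
        steps.append(step)
        height += step
    return steps


def horizon_ratio(steps, horizon):
    """Observed std of horizon-long changes divided by the random-walk prediction."""
    blocks = [sum(steps[i:i + horizon])
              for i in range(0, len(steps) - horizon + 1, horizon)]
    predicted = math.sqrt(horizon) * statistics.pstdev(steps)
    return statistics.pstdev(blocks) / predicted
```

For example, `horizon_ratio(coin_record(100_000, 0.0, seed=4), 100)` examines a fair coin, and the same call with `pull` set to `0.1` examines the coin with a memory.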

In a paper titled The Long-Term Risks of Global Stock Markets, Philippe Jorion has analyzed long-term global stock market return data. He found that long-horizon standard deviations were consistent with what we would expect from pure random walks without any RTM, and he concluded that there was therefore no evidence of any reversion to mean in the global data:

This research investigates the persistence of investment risk across time horizon, a crucial issue in asset allocation decisions. Previous empirical results have focused mainly on US data and suffer from limited sample size in the analysis of long-horizon returns. Investigation of a long-term sample of thirty countries provides additional empirical evidence. The results are not reassuring. There is no evidence of long-term mean reversion in the expanded data sample. Downside risk is not reduced as the horizon lengthens.

This result comes as a big surprise to many people. Many people seem to have an intuitive belief in strong mean reversion in stock returns, often expressed in terms like "the ups and downs of the market even out over time." This common opinion is often used to argue that stocks are safe investments for the long run, or at least they are safer than they are over the short run. Jorion's results contradict this popular opinion.

Jorion and others have in fact found evidence of RTM in the 20th century US stock market, although nowhere near as much as most people believe. Jorion says that this may be due to small sample size and survivorship bias issues. It is also possible that weak RTM may be a characteristic of unusually strong uninterrupted economies that experience steady growth over a long period of time. If we do not make the assumption that this kind of good economic growth will necessarily continue over our future investing horizon, relying on RTM over that horizon appears to be unwise.

Suppose we're somewhere in the middle of our coin-flipping game. Without RTM, there's no way to forecast the next coin toss - there's an even 50/50 chance of heads or tails. With our RTM example, if the graph is currently above the axis, we can forecast tails, and if the graph is currently below the axis, we can forecast heads, and in each case our forecast has a 60% chance of being right.

Now imagine that right before each coin toss you have the opportunity to invest in the outcome (be in the market), or choose not to invest (keep your money out of the market). If the next flip comes up heads, your investment goes up, otherwise it goes down. Without RTM, there's no timing strategy. With RTM, there is a strategy - keep your money in the market when the graph is below the axis, and take it out when the graph is above the axis. You will do much better with this timing strategy than would someone who kept his money in the market all the time.
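The forecasting rule and the timing strategy can both be tried out in a few lines of Python. The sketch below uses the 40/60 coin from the RTM example and bets one unit per toss. Two details are my own assumptions because the description above leaves them open: no forecast is made when the graph is exactly on the axis, and the timer stays out of the market at that point. The function name and the game sizes are arbitrary.

```python
import random


def play_with_timing(num_flips=500, num_games=200, seed=5):
    """Simulate the 40/60 coin; return (forecast accuracy, timer payoff, always-in payoff).

    Payoffs are average units won per game, betting one unit per toss
    (+1 on heads, -1 on tails).
    """
    rng = random.Random(seed)
    forecasts = correct = 0
    timer_total = hold_total = 0
    for _game in range(num_games):
        height = 0
        for _flip in range(num_flips):
            if height > 0:
                p_heads = 0.4
            elif height < 0:
                p_heads = 0.6
            else:
                p_heads = 0.5
            step = 1 if rng.random() < p_heads else -1
            if height != 0:
                # Forecast tails above the axis, heads below it.
                forecasts += 1
                forecast = -1 if height > 0 else 1
                if forecast == step:
                    correct += 1
            if height < 0:
                timer_total += step   # the timer is in the market only below the axis
            hold_total += step        # the other player is always in the market
            height += step
    return correct / forecasts, timer_total / num_games, hold_total / num_games
```

The first number returned should come out close to the 60% figure mentioned above, and the timer's average payoff should be clearly positive while the always-in payoff hovers near 0.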

In this example with RTM, we know the exact algorithm in advance, before the coin is tossed. We know that the mean value to which the coin flips will revert is exactly 0. This is what makes the forecasting and timing work.

Similarly, with investing, if returns tend to revert to mean, and if we know the mean to which they revert in advance, we can do the same kind of forecasting and timing as in the coin tossing example. If recent returns have been above average, we can forecast below average returns in the future, and lighten up on stocks in our portfolio. If recent returns have been below average, we can forecast above average future returns, and change our portfolio to allocate a higher portion to stocks. This kind of forecasting and contrarian market timing should, over time, on average, result in significantly higher total returns than we would get with a buy-hold-and-rebalance strategy.

If stock returns have RTM, but we do not know the true mean to which they revert in advance, forecasting and market timing are more problematic, and the exact strategies are less obvious, but it still seems reasonable that there are relatively simple strategies that will work and beat the market over the long run.

Consider once again the pure random walk coin tossing game without RTM. We said there was no timing strategy in this case. But now suppose we find a crystal ball before the game starts that tells us what the ending value will be when the game ends. Recall that this actual ending value is likely to be well above or below 0. Draw a straight line on the empty graph from the starting point to the known ending point. Start playing the game. Whenever the graph is above the line, forecast tails and take your money off the table. Whenever the graph is below the line, forecast heads and put your money back on the table. It should be easy to convince yourself that your forecasts will be much more accurate than 50/50, and you will win with your timing strategy ("win" in the sense that you will do much better than someone who does not forecast or time). This is even without RTM!
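The crystal ball experiment is easy to run on a computer. The Python sketch below is a minimal version under my own assumptions: games of 100 flips, no forecast when the graph sits exactly on the line, and a hypothetical function name. It generates each game with a fair coin, peeks at the ending value, and then scores the forecasts made against the straight line.

```python
import random


def crystal_ball_accuracy(num_flips=100, num_games=3000, seed=6):
    """Fraction of correct forecasts when timing a FAIR coin against the known ending value."""
    rng = random.Random(seed)
    forecasts = correct = 0
    for _game in range(num_games):
        steps = [1 if rng.random() < 0.5 else -1 for _flip in range(num_flips)]
        ending = sum(steps)  # the crystal ball: peeking at the future
        height = 0
        for t, step in enumerate(steps):
            # The line from (0, 0) to (num_flips, ending) is at ending * t / num_flips
            # before toss t. Compare using integers to avoid rounding.
            gap = height * num_flips - ending * t
            if gap != 0:
                forecasts += 1
                forecast = -1 if gap > 0 else 1  # above the line: tails; below: heads
                if forecast == step:
                    correct += 1
            height += step
    return correct / forecasts
```

Even though every toss is a fair 50/50 flip, `crystal_ball_accuracy()` should return a value above one half, because the forecasts borrow information from the future.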

Similarly, with investing, if we could somehow know what the future average return will be in advance, we could market time even without RTM.

Today, for example, we know that the average return over the last 75 years is about 10% annualized. Get into a time machine and go back to 1930. Invest for the next 75 years. Whenever the cumulative annualized returns since 1930 go above 10%, lighten up on stocks. Whenever the cumulative annualized returns since 1930 go below 10%, put more money back into stocks. By 2005, you will have beaten the market by a very nice margin.

This is called an "in-sample" test. It has an obvious flaw, because investors in 1930 did not have any idea what the average annualized return was going to be over the next 75 years. They only knew what the past average annualized returns were. If you do the test again and only permit investors to use the information available to them at the time (an "out-of-sample" test), the market timing strategy doesn't work.

This is a simple kind of "chartist" timing, based just on past returns. When past returns are high, lighten up on stocks. When past returns are low, put more money into stocks. In a pure random walk without a crystal ball, we know that this kind of timing doesn't work. The reason it doesn't work is that without the crystal ball, we are unable to define the notions of "low" and "high." "Low" means "below the future average value" and "high" means "above the future average value," but we don't know the future average value. We only know the past average value, and that information is of no use in a pure random walk without RTM.
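The honest version of the experiment can also be simulated. In the Python sketch below, the forecaster on a fair coin is only allowed to compare the current graph height with the average of the heights seen so far. That particular rule is just one arbitrary choice of mine for a past-only definition of "high" and "low"; the function name and game sizes are likewise hypothetical.

```python
import random


def past_only_accuracy(num_flips=1000, num_games=300, seed=7):
    """Accuracy of a chartist rule on a FAIR coin, using only information available at the time."""
    rng = random.Random(seed)
    forecasts = correct = 0
    for _game in range(num_games):
        height = 0
        past_sum = 0     # sum of all earlier graph heights
        past_count = 0   # how many earlier heights we have seen
        for _flip in range(num_flips):
            # "High" or "low" is judged against the average PAST height,
            # the only average a player could know at this point.
            forecast = 0
            if past_count > 0:
                gap = height * past_count - past_sum
                if gap != 0:
                    forecast = -1 if gap > 0 else 1
            step = 1 if rng.random() < 0.5 else -1
            if forecast != 0:
                forecasts += 1
                if forecast == step:
                    correct += 1
            past_sum += height
            past_count += 1
            height += step
    return correct / forecasts
```

The result of `past_only_accuracy()` should sit very close to one half. Any other rule that looks only at past tosses should fare no better, since each toss of a fair coin is independent of everything that came before it.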

Most forecasting methods and timing strategies based on the forecasts are more sophisticated. They usually use fundamental financial ratios like D/P (dividend-to-price ratio) or P/E (price-to-earnings ratio) to make the forecasts. The argument is that these ratios are sometimes high and sometimes low, but it is unreasonable to think that they can possibly grow or shrink without bounds ("wander off to infinity," as the academics often like to put it). It is much more reasonable to think that while they sometimes get very high or very low, they must eventually revert to some kind of more normal level. RTM, in other words. If these ratios have RTM, it is quite sensible to hypothesize that this RTM in the ratios induces a similar RTM effect in returns, and that the ratios can be used to forecast future returns.

Does this kind of fundamental forecasting actually work? While the general idea certainly seems more than plausible, the proof is in the pudding, and the theories need to be tested. It is possible to examine the historical record to see if the various schemes would have worked in the past. Many people have done these kinds of studies, both in the popular financial world and in the academic financial world.

The key point is that when back-testing these kinds of fundamental forecasting methods to see if they would have worked in the past, it is cheating if you use the actual means of the fundamental forecasting variables calculated over the entire period of the test, because that information was not available to investors in the past. You must back-test using only information available at the time. In other words, you must do out-of-sample tests, not in-sample tests. Most of the popular studies which reach the conclusion that returns are predictable are invalid for this reason. Surprisingly, many of the academic studies seem to suffer from the same fatal flaw.

Amit Goyal and Ivo Welch discuss and explore this insight in their paper A Comprehensive Look at The Empirical Performance of Equity Premium Prediction. When they did out-of-sample tests of all of the popular forecasting variables, including D/P and P/E, they found that none of them worked:

Our paper explores the out-of-sample performance of these variables, and finds that not a single one would have helped a real-world investor outpredicting the then-prevailing historical equity premium mean. Most would have outright hurt. Therefore, we find that, for all practical purposes, the equity premium has not been predictable.

This result also surprises a great many people. The common wisdom is that future stock market returns are highly predictable using common valuation measures like D/P and P/E. Goyal and Welch's research indicates that this belief, like so many others, may be just another example of how people are often fooled by randomness and see patterns in random data that aren't really there.

There is still controversy in the academic community about whether or not stock returns are predictable, and to what degree they might be predictable, and what the best forecasting variables might be. Goyal and Welch have cast doubt on this hypothesis, and they have performed the valuable service of demonstrating how important it is to use only out-of-sample tests, but research and debate continue. In any case, predictability, if it exists at all, is clearly much weaker and more difficult to exploit than most people think.

Here's what I think happens with many people.

We see stock market charts. Our eye draws the line from the starting point to the ending point. We notice that the chart goes up and down, but eventually it always comes back to that nice straight line in the middle of all the jagged ups and downs. Our common sense mistakenly calls this "mean reversion," and we think we are seeing something significant, when what we are really seeing is just a useless triviality (what we are seeing is an immediate consequence of the definition of "average" - if you take an average of things, some of the things are above the average, and some are below, and that information is not of much significance or use).

When our common sense gets fooled this way, we incorrectly conclude that if the market or a stock goes way down for a long time, it **must** eventually come back up again and make up for its losses, because of "mean reversion." Similarly, if it goes way up for a long time, it **must** eventually come back down again and lose a large part of its gains, because of "mean reversion." We may even extend the illusion to conclude that we can take advantage of the patterns we think we are seeing, and use them to forecast future patterns.

We see this kind of "common sense" every day, where people often say things like "mid cap value has been on a roll, I think it's going to mean revert soon," or "emerging market stocks have been falling for a long time, so now is a good time to buy," or "I think investor sentiment is swinging back from optimism to pessimism, and the market is about to come back down after its recent runup," or "as a young investor, I hope the market will suffer a major decline, because my future purchases will be made at a lower price." (The last statement is common, and it implicitly assumes a strong RTM effect, because it assumes that a major decline will necessarily result in significantly higher future returns than we would get without the decline.)

The trivial kind of "mean reversion" described above is in fact true, and our common sense rightly tells us so, but the key point is that it is only true **ex-post**. In other words, our common sense fails when it doesn't remember that all of this is only true **after the fact**. At any given point in time, investors have no way to see that nice straight line in the middle leading into the future. They can only see the one that leads into the past.

The kind of "mean reversion" used in math, statistics, and finance is a different beast entirely from this kind of "mean reversion" our common sense sees.

I strongly suspect that this is at the heart of most of the confusion surrounding this topic.

Let me restate it another way, as a more concrete example. The following statements are both true:

(1) Returns from 1930-2005 fluctuated around ("reverted to," if you insist) the mean return measured over 1930-2005.

(2) Returns from 2005-2080 will fluctuate around (will "revert to," if you insist) the mean return measured over 2005-2080.

Common sense starts with these facts and leaps to the following conclusion:

(3) Returns from 2005-2080 will fluctuate around ("revert to") the mean return measured over 1930-2005.

Despite what common sense might lead us to believe, (3) is not a necessary consequence of (1) and (2). Statements (1) and (2) are true but trivial. Statement (3) is far from trivial - it has real content and, if true, it would be a very significant statement indeed, with major implications for investors.

Note that statement (3) remains suspect even if we believe that the 1930-2005 mean return is our best (most unbiased) estimate of the 2005-2080 mean return. The reason it remains suspect is that we also know that the chance that our expected graph ending value will actually match the realized value in 2080 is vanishingly small, and in fact we expect it not to match by a very large margin! This is not a contradiction or something strange or unusual. It's the same phenomenon that we saw in the coin game, where the expected outcome is 0, but the expected distance of the outcome from 0 is large, and grows larger with time.

Is stock investing like our random walk coin-flipping game, or isn't it? Do stock returns exhibit mean reversion? Are returns predictable? Can we use predictability to profitably time the market?

The first step in thinking about these questions is to understand what they mean and get beyond the "common sense" interpretations, which are so often misleading. We first must clearly understand the difference between an **expected** outcome and the **actual** outcome. Then we must understand the difference between the concepts of **fluctuating around an ex-post mean** and **reversion to mean**. Finally, it is very important to understand the difference between **in-sample** and **out-of-sample** tests, and to understand why in-sample tests are invalid.

These questions about RTM in stock returns and the predictability of stock returns are important and controversial. There are good logical arguments on both sides of the debate. Fortunately, while it is difficult and we have to be careful to use correct statistical methods, it is possible to do empirical work using historical market return data to attempt to find the answers. I have mentioned two such careful empirical studies which cast serious doubts on both RTM and predictability and timing.

Long horizon stock investing appears to be much riskier than most people believe. Predicting future stock returns and timing the market appear to be much harder than most people think.

Despite these conclusions, stock investing remains attractive because of its high expected returns. These returns are only **expected**, however. They are not **guaranteed**, not even over long time horizons (especially not over long time horizons!). Risk is real, and attempts to dismiss long horizon risk via naive popular beliefs in strong RTM, accurate forecasting, and effective market timing are misguided, most likely the result of our common sense being "fooled by randomness."

Consider the following two statements:

A. We know that the average P/E ratio over the last 75 years is about 14. We can use historical market data to test strategies that buy stocks when the P/E ratio is below 14 and sell them when it is above 14. The strategies work very nicely and beat the market by a healthy margin in these tests. Therefore market timing works, and the same strategies will continue to work in the future.

B. Today, the P/E ratio is well above 14. Therefore, future stock market returns will be well below average.

Problem 1 (easy): What's wrong with these seductive arguments?

Problem 2 (hard - extra-credit :-): Examine the 10,000 books, articles and papers that have been published promoting a bewildering variety of market forecasting and timing systems. Count up how many of them used in-sample means or otherwise used in-sample data to calibrate their models in the back-tests of their forecasting equations and timing strategies.
