Performance Measurement in a Downside Risk Framework
This article was published in the Fall 1994 issue of The Journal of Investing, and is reproduced here with the consent of the publisher, Institutional Investor Magazine.
For subscriptions or reprints call: (212) 224-3185
FRANK A. SORTINO
is director of the Pension Research Institute in San Francisco. He holds an M.B.A. from U.C. Berkeley, and a Ph.D. in finance from the University of Oregon.
LEE N. PRICE
is a principal of RCM Capital Management in San Francisco, where he is an equity portfolio manager. He is also president of the Security Analysts of San Francisco and co-chairman of the AIMR performance presentation standards implementation committee. He holds an M.S. degree from M.I.T., and a Ph.D. in business from Stanford University.
INTRODUCTION
At a recent Institutional Investor seminar in San Francisco, David Ballon polled the audience for their views on risk measurement. Thirty percent thought downside risk was superior to traditional risk measures (standard deviation and beta). Only 10% believed traditional risk measures were superior to downside risk. The remaining 60% either didn't know what downside risk was or didn't know enough about it to have an opinion.
In an effort to contribute to institutional investors' understanding of downside risk, we examine the problem of measuring performance on a risk-adjusted basis. Before we begin laying the theoretical foundation for a new alternative in performance measurement, we offer an example using monthly data for the ten years ending December 1992 for two growth stock managers and six indexes.1
EXAMPLE
Imagine you are trying to determine how two of your managers, Twentieth Century Growth Fund (C) and RCM Growth Equity Fund (R), performed relative to some indexes. You are offered two pictures of their performance (see Exhibits 1 and 2). Exhibit 1 shows that Manager C took the most risk, and that ninety-day treasuries (T) were the least risky investment. Exhibit 2, however, indicates that Manager C took less risk than three of the indexes and that Treasuries were the most risky asset.
These are very different pictures of the risk/return trade-off. Which presents the more accurate picture of investment performance? Exhibit 1 uses standard deviation to measure risk; Exhibit 2 uses downside risk. To answer this question we must examine the underlying assumptions one makes about the nature of risk and return under conditions of uncertainty.
Exhibit 1 Exhibit 2

CONCEPTUALIZING RISK
We believe risk is not synonymous with uncertainty. Rather, risk and reward are inseparable components of an uncertain return. The range of possible returns and the probabilities associated with them describe the shape of uncertainty (technically called a probability distribution). To the extent that one can describe the uncertainty associated with an investment, one can manage the risk component with the quantitative tools at hand.
Before the fact (looking forward in Exhibit 3), uncertainty has to do with the range of returns that could occur, and risk has to do with some of those returns. Suppose we assume that uncertainty is bell-shaped (normal). Then, in roughly two out of three years the return should fall within one standard deviation of the mean (see the three arrows). Does that uncertainty connote risk?
If there is a minimum return that must be earned to accomplish some goal (the minimal acceptable return [MAR]), then any returns below the MAR will produce unfavorable outcomes and any returns greater will produce good outcomes. Risk is associated only with bad outcomes; therefore, only returns below the MAR in Exhibit 3 are associated with risk. The MAR separates the good volatility (above the MAR) from the bad volatility (below the MAR).
After the fact (looking back at the outcome), there is also some uncertainty associated with the outcome, i.e., we only know what did happen, not what could have happened. After the fact, analysis of the data might lead us to believe that the worst possible returns are not as low as we feared, before the fact. In this case, the distribution would be truncated on the downside, causing positive skewness, and we would want to alter our perception of the shape of uncertainty we faced.
Note, then, that accurate descriptions of the shape of uncertainty are just as important for performance measurement as they are for asset allocation. In fact, performance measurement is just the flip side of asset allocation.
It happens that the performance outcome in Exhibit 3 is favorable (above the MAR), but that is not the only return that could have been realized. Some risk was taken to achieve that realized return. We argue that the proper measurement of risk should deal only with the returns that could have been below the MAR. Returns above the MAR should be viewed as a reward. We can never know the true shape of uncertainty (the underlying distribution of returns); it can only be estimated.
Exhibit 3

Exhibit 4 depicts possible shapes for the returns of three managers at any particular time.2 Some of those returns incur risk, others do not.
Exhibit 4

Because standard deviation measures risk as dispersion on either side of the mean, it cannot distinguish between good volatility and bad volatility. Consequently, standard deviation considers A and B equally risky. Both practitioners and academics have recognized the need to make this distinction, resulting in a search for a better risk measure. Several measures claim the title of "downside risk,"3 but we believe that one is superior to all the others. To avoid confusion, we refer to this risk measure as downside deviation (DD). There is a considerable body of work that shows DD has a strong theoretical foundation and can easily be incorporated into the "MPT" risk/return framework.4 DD measures the deviations below the MAR.
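For reference, downside deviation is commonly written as the square root of the second moment of returns below the target. The notation is ours rather than the article's: $f(r)$ is the assumed density of returns, $r_i$ are observed returns, and $n$ is the number of observations.

```latex
DD = \sqrt{\int_{-\infty}^{MAR} (MAR - r)^2 \, f(r)\, dr}
\qquad \text{or, for a sample,} \qquad
DD \approx \sqrt{\frac{1}{n} \sum_{i=1}^{n} \bigl(\max(MAR - r_i,\ 0)\bigr)^2}
```

Returns at or above the MAR contribute nothing to either expression, which is what separates DD from standard deviation.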
To see how DD differs from standard deviation and downside probability, consider the return distributions for Managers A, B, and C in Exhibit 4. Let's assume Manager A invests in a well-diversified portfolio that generates symmetric distributions, and Manager B employs a strategy that earns the same return as Manager A but eliminates some of the lower returns. Manager C invests solely in short-term Treasuries. According to standard deviation, Manager C took the least risk.
Risk of what? Risk of not earning the average return on T-bills? That doesn't capture what's at stake. What concerns this investor is that the minimal return of 10% may not be earned. In this sense, Manager B took less risk than Manager C.
Both DD and downside probability capture this notion of risk. However, downside probability incorrectly indicates that Manager B took more risk than Manager A. This is because downside probability looks only at the chance of failure and does not adequately penalize Manager A for those returns that fall below the lowest returns for Manager B. Only DD correctly shows Manager B to have taken the least risk.
If the MAR is changed to 5%, the calculation for standard deviation is unchanged (the MAR does not enter into the calculation). The DD for Manager A becomes 7%, and for Manager B one-half of 1%. It is intuitively obvious that there is less risk of failing to achieve one's goals if one only has to earn 5% instead of 10%.
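The comparison of the three risk measures can be sketched numerically. The return series below are hypothetical, chosen only so that both managers have a 20% mean and Manager B's lowest returns are cut off; they are not the data behind Exhibit 4. The calculation uses raw observations for simplicity, which endnote 5 warns is cruder than working from a fitted continuous distribution.

```python
import math

def standard_deviation(returns):
    """Population standard deviation around the sample mean."""
    mean = sum(returns) / len(returns)
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))

def downside_probability(returns, mar):
    """Fraction of observations that fall below the MAR."""
    return sum(1 for r in returns if r < mar) / len(returns)

def downside_deviation(returns, mar):
    """Root mean squared shortfall below the MAR; returns at or above it count as zero."""
    return math.sqrt(sum(min(r - mar, 0.0) ** 2 for r in returns) / len(returns))

# Hypothetical annual returns (assumptions for illustration only).
manager_a = [-0.10, 0.05, 0.20, 0.35, 0.50]  # symmetric around 20%
manager_b = [0.08, 0.09, 0.09, 0.20, 0.54]   # same mean, lowest returns eliminated

for mar in (0.10, 0.05):
    for name, rets in (("A", manager_a), ("B", manager_b)):
        print(
            f"MAR={mar:.0%} Manager {name}: "
            f"SD={standard_deviation(rets):.3f} "
            f"P(below)={downside_probability(rets, mar):.2f} "
            f"DD={downside_deviation(rets, mar):.3f}"
        )
```

With these numbers, downside probability at a 10% MAR is higher for B than for A, while DD is lower for B, and lowering the MAR to 5% changes DD but leaves standard deviation untouched.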
DETERMINING THE REFERENCE POINT
Critics of downside risk often claim that if one measures downside risk from the mean of each asset and the distributions of returns are symmetric, downside risk is simply semivariance, and the results would be the same as using standard deviation. We argue that most distributions are not symmetric, and even if they were, investors should not measure risk relative to the mean of each asset.
Consider Managers B and C in Exhibit 4. If risk is measured as deviations below each manager's mean, Manager B would incorrectly appear more risky than Manager C, who has no chance of achieving the MAR (the distance down from the mean of B is greater than the distance from the mean of C).
The arrows show three different reference points from which to start measuring downside risk. In the distribution for fund C, risk is measured downward from the risk-free rate. For B, the reference point is the MAR (10%), and for A it is the mean (20%). The riskiness of these assets is clearly dependent on the reference point from which risk is measured.
We argue that the correct reference point should be directly related to the stated or implied objective of the investor. We assume that a goal is what an investor is trying to accomplish (e.g., retire at age sixty-five), and the objective is concerned with how one achieves that goal, stated in terms of return and risk, e.g., maximize the expected return subject to the risk of falling below the MAR.
If the MAR is 10% then the arrow for Manager B is the correct reference point. The entire distribution of C lies below the MAR of 10%, so it is clearly more risky than B.
The relative riskiness of A and B is not so clear. Even though the means and standard deviations of A and B are the same, the mode of B (the most likely return) is lower than the mode of A, and the area under the curve from 10% down to zero is greater for B than for A. Thus, the downside probability for B is greater than for A, but the downside risk of A is greater than that of B, because the lowest possible returns for A are below those of B.
When might the mean be an appropriate reference point? A rating service like Morningstar cannot use a risk measure unique to the goals or objectives of each subscriber; it must apply to all subscribers. Because most investors must earn something in excess of the T-bill rate to accomplish their financial goals, using the T-bill rate as the MAR would be misleading. It would lead investors to believe that Manager C took less risk than Manager B. But T-bills are risk-free only with respect to default risk.
If Morningstar assumes that subscribers elect to invest in an actively managed equity fund because they must earn at least the average return on a passive market index, then the mean return on the index is a more appropriate surrogate for the MAR than the T-bill rate. In which case, risk is that returns will fall short of the mean return on the appropriate index, and a manager who invests solely in T-bills would be viewed as risky.
Again, measuring risk from the mean of one asset (e.g., an index) is not the same as measuring risk from the mean of each asset. Measuring DD from the mean of C and comparing that to DD from the mean of Manager B makes Manager C look less risky than B. Measuring the risk of Manager C on an absolute basis from the mean of Manager B makes C more risky than B. In other words, DD must be calculated from the same reference point for all assets.
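A small sketch of why the reference point must be common to all assets. The two return series are hypothetical stand-ins for an equity manager (B) and a T-bill investor (C), not the article's data, and the sample-based DD is a simplification.

```python
import math

def mean(returns):
    return sum(returns) / len(returns)

def downside_deviation(returns, reference):
    """Root mean squared shortfall below a chosen reference return."""
    return math.sqrt(sum(min(r - reference, 0.0) ** 2 for r in returns) / len(returns))

# Hypothetical returns (assumptions for illustration only).
managers = {
    "B": [0.08, 0.09, 0.09, 0.20, 0.54],     # equity strategy, mean 20%
    "C": [0.04, 0.045, 0.05, 0.055, 0.06],   # T-bill-like, mean 5%
}
MAR = 0.10

# Each asset measured from its own mean: C looks almost riskless.
from_own_mean = {k: downside_deviation(v, mean(v)) for k, v in managers.items()}
# Every asset measured from the same MAR: C falls short in every period.
from_common_mar = {k: downside_deviation(v, MAR) for k, v in managers.items()}

print("from own mean:  ", from_own_mean)
print("from common MAR:", from_common_mar)
```

Measured from each manager's own mean, B appears riskier than C; measured from the shared 10% MAR, the ordering reverses.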
THE RISK/RETURN FRAMEWORK
A great contribution of modern portfolio theory is the establishment of a formal risk/return framework for investment decision-making. Performance is not just a matter of who got the highest return, or who took the least risk, but a question of who provides the best risk-adjusted return.
A common procedure for incorporating risk into performance measurement is to present a risk/return trade-off (see Exhibit 2). Manager R has a higher return and took less risk than the NYSE, so there is little doubt that R outperformed this index on a risk-adjusted basis. However, fund C has a higher return than R but took more risk. Absent a risk-adjusted return, one cannot say whether fund C outperformed R or the index.
For this reason some authors proffer a ratio of return to risk. The reward-to-variability ratio (RVAR) was proposed by William Sharpe and is commonly referred to as the Sharpe ratio. The numerator of the Sharpe ratio is the difference between the return on the portfolio and the risk-free rate. A comparable downside risk ratio that has come to be called the Sortino ratio has for the numerator the difference between the return on the portfolio and the MAR. The denominator for the Sharpe ratio is standard deviation, and for the Sortino ratio it is downside deviation.
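The two ratios as described can be written as a pair of one-line functions. The inputs below are the RCM figures from Exhibit 5, together with the 5% risk-free rate and 8% MAR the article adopts later; the function and variable names are ours.

```python
def sharpe_ratio(portfolio_return, risk_free_rate, standard_deviation):
    """Excess return over the risk-free rate per unit of standard deviation."""
    return (portfolio_return - risk_free_rate) / standard_deviation

def sortino_ratio(portfolio_return, mar, downside_deviation):
    """Excess return over the MAR per unit of downside deviation."""
    return (portfolio_return - mar) / downside_deviation

# RCM, in percent, as published in Exhibit 5.
rcm_return, rcm_std_dev, rcm_dd = 16.7, 15.6, 6.3

rcm_sharpe = sharpe_ratio(rcm_return, 5.0, rcm_std_dev)
rcm_sortino = sortino_ratio(rcm_return, 8.0, rcm_dd)
print(f"Sharpe  = {rcm_sharpe:.2f}")
print(f"Sortino = {rcm_sortino:.2f}")
```

The Sharpe figure matches the exhibit; the Sortino figure comes out about 0.01 lower than the published 1.39, presumably because the exhibit was computed from unrounded inputs.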
Another alternative, suggested by Bill Fouse of Mellon Capital Management, is to apply utility theory. Before the fact, investors do not maximize expected return; they attempt to maximize expected utility. An investor's utility for a risky outcome is the expected return minus some fraction of the risk, where the fraction of risk is an expression of the investor's degree of risk aversion.
Why shouldn't this also hold after the fact? In other words, subtract from the risky return that was realized some fraction of the risk that was taken. Fishburn [1977] has shown that downside deviation is consistent with expected utility theory and derives a utility function that is linear for all returns (r) equal to or greater than the MAR and that exhibits increasing aversion to returns below the MAR. He derives a utility function (see the appendix) for investors who measure risk in terms of DD:
$U = r - V(DD)^2$ (1)
where V = a measure of the investor's degree of risk aversion.
For this study we use V = 1. This means the investor would require more than 200 basis points as a risk premium to choose equities over the riskless asset. This fairly aggressive attitude is justified on the basis of evaluating growth managers.
The main advantage the Fouse index has over the Sharpe or Sortino ratios is that it can accommodate different degrees of risk aversion. The Sharpe ratio indicates how much excess return above the risk-free rate is received for the risk associated with achieving the mean. The Sortino ratio indicates how much excess return above the MAR is received for the risk of not achieving the MAR. The Fouse index indicates the net return earned after subtracting the required risk premium.
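A minimal sketch of the Fouse index from Equation (1), with returns and downside deviation expressed as fractions. The Twentieth Century inputs come from Exhibit 5; the function name and the loop over other values of V are ours, added to show how the index responds to risk aversion.

```python
def fouse_index(realized_return, downside_deviation, risk_aversion=1.0):
    """Realized return minus risk_aversion times squared downside deviation."""
    return realized_return - risk_aversion * downside_deviation ** 2

# Twentieth Century: return 17.6%, downside deviation 10% (Exhibit 5).
for v in (1, 2, 3):
    value = fouse_index(0.176, 0.10, risk_aversion=v)
    print(f"V = {v}: Fouse index = {value:.1%}")
```

With V = 1 the deduction is 0.10 squared, or 100 basis points, which gives the 16.6% shown in the exhibit; each additional unit of V deducts another 100 basis points for this fund.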
We now examine the data in Exhibits 1 and 2 for an investor with a MAR of 8%, a return that many pension sponsors will identify as a return that must be earned in order to fund their defined-benefit plan within their cost constraints. To calculate DD we fit a lognormal distribution to the data and then calculate the downside deviations using integral calculus.5 We assume the risk-free rate is 5% and substitute a money market fund for Treasuries. The risk/return characteristics are given in Exhibit 5, and the rankings are in Exhibit 6.
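To illustrate what calculating DD from a fitted distribution with integral calculus can look like, here is a rough sketch that integrates the squared shortfall numerically. It is not the authors' procedure: it assumes a simple two-parameter lognormal for 1 + r (endnote 5 describes a three-parameter fit), and the parameter values `mu` and `sigma` are hypothetical rather than estimated from any fund.

```python
import math

def lognormal_return_pdf(r, mu, sigma):
    """Density of the return r when ln(1 + r) is normal with mean mu and std sigma."""
    if r <= -1.0:
        return 0.0
    z = (math.log1p(r) - mu) / sigma
    return math.exp(-0.5 * z * z) / ((1.0 + r) * sigma * math.sqrt(2.0 * math.pi))

def downside_deviation_continuous(mar, mu, sigma, steps=20000):
    """Midpoint-rule estimate of sqrt(integral from -1 to MAR of (MAR - r)^2 f(r) dr)."""
    lower = -1.0  # a return cannot fall below -100%
    width = (mar - lower) / steps
    total = 0.0
    for i in range(steps):
        r = lower + (i + 0.5) * width
        total += (mar - r) ** 2 * lognormal_return_pdf(r, mu, sigma) * width
    return math.sqrt(total)

# Hypothetical parameters (assumptions, not fitted values).
mu, sigma = 0.10, 0.15
dd_at_8 = downside_deviation_continuous(0.08, mu, sigma)
dd_at_5 = downside_deviation_continuous(0.05, mu, sigma)
print(f"DD with MAR = 8%: {dd_at_8:.4f}")
print(f"DD with MAR = 5%: {dd_at_5:.4f}")
```

Because the integral runs over the whole fitted curve below the MAR, it picks up shortfalls that could have occurred even if few or none appear in the observed sample.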
Exhibit 5
| Asset | Return (%) | Downside Deviation (%) | Standard Deviation (%) | Sharpe | Sortino | Fouse (%) |
| 20th Century | 17.6 | 10 | 21.6 | .58 | .95 | 16.6 |
| FRC 2000 | 11.7 | 12.5 | 19.6 | .34 | .30 | 10.2 |
| MM | 7.1 | 1.1 | .5 | 4.12 | -.88 | 7.1 |
| Nasdaq | 11.3 | 12.3 | 19.3 | .33 | .27 | 9.8 |
| NYSE | 11.5 | 8.4 | 15.3 | .42 | .41 | 10.8 |
| RCM | 16.7 | 6.3 | 15.6 | .75 | 1.39 | 16.3 |
| Small-Cap | 11.7 | 11.7 | 18.3 | .36 | .31 | 10.3 |
| World | 15 | 7.7 | 15.4 | .65 | .91 | 14.4 |
Exhibit 6
Rankings
| Rank | Sharpe | Sortino | Fouse |
| 1 | MM | RCM | 20th Century |
| 2 | RCM | 20th Century | RCM |
| 3 | World | World | World |
| 4 | 20th Century | NYSE | NYSE |
The Sharpe ratio ranks the money market fund first and Twentieth Century fourth, while the Sortino ratio and the Fouse index place Twentieth Century second and first, respectively, and leave the money market fund out of the top four. To say a money market fund had the best risk-adjusted performance for the decade ending in 1992 would be to encourage many investors to put their money into an asset that guarantees failure to accomplish their goal.
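The rankings in Exhibit 6 follow directly from sorting the published Exhibit 5 values. The sketch below does only that; the dictionary layout and the helper name are ours.

```python
# Published values from Exhibit 5: (Sharpe, Sortino, Fouse in percent).
exhibit_5 = {
    "20th Century": (0.58, 0.95, 16.6),
    "FRC 2000": (0.34, 0.30, 10.2),
    "MM": (4.12, -0.88, 7.1),
    "Nasdaq": (0.33, 0.27, 9.8),
    "NYSE": (0.42, 0.41, 10.8),
    "RCM": (0.75, 1.39, 16.3),
    "Small-Cap": (0.36, 0.31, 10.3),
    "World": (0.65, 0.91, 14.4),
}
MEASURES = {"Sharpe": 0, "Sortino": 1, "Fouse": 2}

def ranked(measure):
    """Assets ordered from best to worst on the chosen measure."""
    column = MEASURES[measure]
    return sorted(exhibit_5, key=lambda asset: exhibit_5[asset][column], reverse=True)

for measure in MEASURES:
    order = ranked(measure)
    print(f"{measure:8s} top four: {order[:4]}  last: {order[-1]}")
```

On the Sortino and Fouse measures the money market fund drops from first place to last among the eight assets.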
The problem stems from the fact that the Sharpe ratio is blind to the MAR (it does not enter into the calculation). The Fouse index seems easier to interpret than either of the ratios, e.g., 16.6% is the risk-adjusted return for Twentieth Century after deducting the risk premium of 100 basis points (see Exhibit 5).
The meanings of the ratios of 0.58 and 0.95 are less obvious. All three ranking procedures are limited in that they are snapshots of the risk-adjusted returns for a specific period of time. It might be useful to see how performance changes over time in risk/return space.
CONCLUSIONS
We believe the usefulness of DD in risk/return analysis has been adequately demonstrated here and in other studies. Both the Sortino ratio and the Fouse index offer valuable information not available from traditional risk measures. The efforts of Morningstar and other practitioners to produce a meaningful downside risk measure are commendable, for they have focused attention on the demand for such a measure.
Yet downside deviation is the only downside risk measure with a strong theoretical foundation that permits it to be incorporated into asset allocation models or any other aspect of MPT. If we are to make any claims to represent a profession along the lines of medicine or law, we must replace ad hoc tools with a common body of knowledge based on a strong theoretical foundation that can be taught. In this way, business schools can serve the same function in society as medical schools and law schools.
APPENDIX
Fishburn [1977] derives a utility function for an investor who measures risk in terms of deviations below some value (t) but wants to maximize return (r) above t, i.e., the utility function is linear above t. In our study t = MAR.
For all returns less than the MAR, the utility for r = $r - V(MAR - r)^2$, where V = a measure of the investor's degree of risk aversion and r = return measured as a fraction.
Taking the expected value yields:
$U = r - V(DD)^2$
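The step from the piecewise utility to its expected value can be spelled out as follows. The symbols $u(r)$ for the utility of a single return, $f(r)$ for the return density, and $E[\cdot]$ for the expectation are our notation; on the left of the final line, the appendix's $r$ is read as the expected return.

```latex
u(r) =
\begin{cases}
r, & r \ge MAR \\
r - V\,(MAR - r)^2, & r < MAR
\end{cases}

E[u(r)] = E[r] - V \int_{-\infty}^{MAR} (MAR - r)^2 f(r)\, dr
        = E[r] - V\,(DD)^2
```

The penalty term is exactly $V$ times the squared downside deviation, since the integral is the definition of $(DD)^2$.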
To illustrate, assume an investor has two options: option 1 is a return of t with certainty; option 2 is an equal chance of getting a return of either t - a or t + b, where b = the return above t and a = the return below t.
The utility for option 1 = t
The utility for option 2 = $0.5[(t - a) - Va^2] + 0.5(t + b)$
Setting the two utilities equal, so that the investor is indifferent between the options, gives:
$0 = b - a - Va^2$
$V = (b - a)/a^2$
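A quick numeric check of the indifference condition, using hypothetical values for t, a, and b (they are not taken from the article). Only the below-target outcome carries the quadratic penalty.

```python
def risk_aversion_at_indifference(a, b):
    """V that makes the gamble (t - a or t + b, equal odds) as good as t for certain."""
    return (b - a) / a ** 2

def utility_of_gamble(t, a, b, v):
    """Expected utility of an equal chance of t - a (penalized) or t + b (not penalized)."""
    below_target = (t - a) - v * a ** 2
    above_target = t + b
    return 0.5 * below_target + 0.5 * above_target

# Hypothetical inputs: target 8%, downside 10 points, upside 12 points.
t, a, b = 0.08, 0.10, 0.12
v = risk_aversion_at_indifference(a, b)
print(f"V = {v:.2f}")
print(f"utility of gamble = {utility_of_gamble(t, a, b, v):.4f}, utility of t = {t:.4f}")
```

An investor with a larger V than the one computed here would prefer the certain return t; one with a smaller V would take the gamble.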
If a risky asset offers a return of 12% with a standard deviation of 20%, the investor would be indifferent between t with certainty and a risky return $R - V(DD)^2$.
The values of V for different risk premiums above the certainty equivalent are:
V Risk Premium
1 200 basis points
2 400 basis points
3 600 basis points
In other words, an aggressive investor with a V = 1 would require more than 200 basis points to choose the risky asset over the riskless asset, while a more risk-averse investor with V = 2 would require more than 400 basis points.6
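The article does not show how the table values were computed. One reading that reproduces them, offered here as an assumption, is that the distribution is symmetric and downside deviation is measured from its mean, so that the squared downside deviation is half the variance.

```python
def risk_premium_basis_points(risk_aversion, standard_deviation):
    """V times squared downside deviation, assuming (DD)^2 = variance / 2."""
    downside_variance = standard_deviation ** 2 / 2.0  # symmetric-distribution assumption
    return risk_aversion * downside_variance * 10_000

premiums = {v: round(risk_premium_basis_points(v, 0.20)) for v in (1, 2, 3)}
for v, premium in premiums.items():
    print(f"V = {v}: {premium} basis points")
```

Under that assumption a 20% standard deviation gives a squared downside deviation of 0.02, so each unit of V adds 200 basis points to the required premium.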
ENDNOTES
1. The indexes are the Frank Russell 2000, NYSE index, Nasdaq, Small-Cap index, World index, and ninety-day Treasury bills. Monthly data are annualized using a time-weighted rate of return (geometric average).
2. For example, if a 5% return occurs at times where each of the three distributions (A, B, and C) is located, the realized return could have come from any of the three distributions, and that distribution could be a valid description of uncertainty at every time.
3. Markowitz [DATE?] and others refer to semivariance, but the reader is left with the impression that the point from which to measure risk is always the mean. This is unduly restrictive and leads to erroneous conclusions as to which asset is more risky. Shortfall risk and probability of failure have also been examined and found wanting (see Harlow [1991]).
4. Fishburn [1977] offers proof that "downside deviation" is consistent with second- and third-degree stochastic dominance and with expected utility theory. Vijay Bawa produced many articles on "mean lower partial moment" rankings; the Journal of Financial Economics, June 1977, offers proof that the mean-variance rankings are a subset of the mean lower partial moment set, and shows how "downside deviation" can be incorporated into an asset allocation optimizer. Harlow and Rao [1989] show how "downside deviation" can be incorporated into the CAPM framework.
5. We prefer calculating the DD from a continuous distribution rather than a discrete distribution. Simply calculating the deviations of the actual observations below the MAR is seriously flawed. The procedure described by Marmer and Ng [1993] partially overcomes this problem. The best procedure in our opinion is to fit a three-parameter lognormal distribution to the tenth percentile, mean, and ninetieth percentile.
6. We are grateful to Khaled Salama at Brinson Associates for his helpful comments on this concept.
REFERENCES
Fishburn, Peter. "Mean-Risk Analysis with Risk Associated with Below-Target Returns." American Economic Review, March 1977.
Harlow, W.V. "Asset Allocation in a Downside Risk Framework." Financial Analysts Journal, September-October 1991.
Harlow, W.V., and Ramesh K.S. Rao. "Asset Pricing in a Generalized Mean-Lower Partial Moment Framework: Theory and Evidence." Journal of Financial and Quantitative Analysis, September 1989.
Marmer, Harry, and X. Ng. "Mean Semivariance Analysis of Option-Based Strategies: A Total Asset Mix Perspective." Financial Analysts Journal, June 1993.
Contents Copyright© 1998, Pension Research Institute