Morningstar's Performance Measures 

Star Ratings

 


Star versus Category Ratings

Most of the discussion of the attributes of Morningstar's category ratings applies as well to their three-year star ratings, which we cover here. However, there are three key differences. First, cumulative returns are adjusted for any load charges when computing the star ratings, but such charges are not taken into account in the category ratings. Second, the bases used to compute relative returns and risks are different. A common base, computed using all funds in an asset class, is used for computing star ratings. For category ratings, each group of funds uses a separate base, computed using all funds in the category. Finally, stars are assigned by ranking all funds in an asset class, while category ratings are assigned by ranking funds separately within each category.
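The third difference, the pool within which funds are ranked, can be illustrated with a minimal sketch. The fund names and ratings are made up, the function names are hypothetical, and for simplicity a single rating is used for both rankings (the load adjustment and the different bases are ignored), so only the effect of the ranking pool is shown.

```python
from collections import defaultdict

# Hypothetical (name, category, rating) records; the numbers are illustrative only.
funds = [
    ("A", "large value", 0.60),
    ("B", "large value", 0.35),
    ("C", "small growth", -0.20),
    ("D", "small growth", -0.60),
]

def rank_overall(records):
    """Rank every fund against the whole pool passed in (1 = best)."""
    ordered = sorted(records, key=lambda r: r[2], reverse=True)
    return {name: position + 1 for position, (name, _, _) in enumerate(ordered)}

def rank_within_category(records):
    """Rank each fund only against the funds in its own category (1 = best)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[1]].append(record)
    ranks = {}
    for members in groups.values():
        ranks.update(rank_overall(members))
    return ranks

overall = rank_overall(funds)          # analogous to the pool used for stars
within = rank_within_category(funds)   # analogous to the pool used for category ratings
```

In this made-up sample, fund C is the best fund in its category yet ranks third in the asset-class pool, behind both large value funds.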

The differences between the two ratings for diversified equity funds can be seen in the figure below. Each point represents one fund. The horizontal axis plots a fund's category risk-adjusted rating while the vertical axis plots its three-year star risk-adjusted rating.

 

Star versus Category Risk-adjusted Ratings

As can be seen in the figure, there appear to be separate clusters of funds lying along more or less parallel straight lines. This is not surprising. Recall that a fund's category risk-adjusted rating can be calculated by multiplying the fund's average monthly loss by a constant and then subtracting the result from its excess value relative. The fund's three-year star risk-adjusted rating is calculated by multiplying its average monthly loss by another constant, then subtracting the result from its load-adjusted excess value relative. For no-load funds, the only differences between the two measures result from the different constants by which average monthly loss is multiplied. For load funds the return measures also differ, but not by large amounts. The net result is that all funds in a given category lie on or near a fairly straight line in the diagram. Funds in categories with better historic performance during the period lie on higher lines; those in categories with poorer historic performance lie on lower lines. While the correlation between the two measures is fairly high (0.887), the differences are highly significant economically.
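The two calculations described above can be written out as a short sketch. The function and parameter names are hypothetical, and the constants and inputs are invented for illustration; they are not Morningstar's actual values.

```python
import math

def category_rating(excess_value_relative, avg_monthly_loss, k_category):
    """Excess value relative minus a category-specific constant times average monthly loss."""
    return excess_value_relative - k_category * avg_monthly_loss

def star_rating(load_adjusted_evr, avg_monthly_loss, k_star):
    """Load-adjusted excess value relative minus an asset-class constant times average monthly loss."""
    return load_adjusted_evr - k_star * avg_monthly_loss

# Illustrative inputs for a no-load fund, so both return measures coincide.
evr = 0.5
loss = 0.02
k_category, k_star = 10.0, 15.0   # assumed constants, for illustration only

cat = category_rating(evr, loss, k_category)
star = star_rating(evr, loss, k_star)

# For a no-load fund the gap between the two ratings depends only on
# the difference between the constants and the fund's average monthly loss.
gap = star - cat
```

With these inputs the gap equals (k_category - k_star) times the average monthly loss, which is the only source of difference the text identifies for no-load funds.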

A fund's star risk-adjusted rating is determined by (1) how well its category performed and (2) how well it performed within its category. Roughly, the influence of the former can be measured by the spread between the top and bottom implicit lines in the figure. A more direct measure can be obtained by computing the mean star risk-adjusted rating for all the funds in each category. The results are shown in the table below.

Category Mean Three-year Star Risk-adjusted Ratings

          Value    Blend    Growth
Large     0.497    0.430     0.106
Medium    0.158    0.100    -0.290
Small     0.115   -0.219    -0.447

The results are striking. In the 1994-1996 period the average large value fund had a three-year star risk-adjusted rating of 0.497, while the average small growth fund had a rating of -0.447. A small growth fund that did well relative to others in its category would almost certainly receive fewer stars during this period than a large value fund that did poorly relative to others in its category.
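The size of the category effect can be read directly off the table of category means. The sketch below uses the nine reported values; the helper `mean_by_category` is a hypothetical illustration of how such means could be computed from fund-level data, which is not reproduced here.

```python
from collections import defaultdict

# Category mean three-year star risk-adjusted ratings, as reported in the table.
category_means = {
    ("Large", "Value"): 0.497, ("Large", "Blend"): 0.430, ("Large", "Growth"): 0.106,
    ("Medium", "Value"): 0.158, ("Medium", "Blend"): 0.100, ("Medium", "Growth"): -0.290,
    ("Small", "Value"): 0.115, ("Small", "Blend"): -0.219, ("Small", "Growth"): -0.447,
}

best = max(category_means, key=category_means.get)
worst = min(category_means, key=category_means.get)

# Gap between the best and worst category means.
spread = round(category_means[best] - category_means[worst], 3)

def mean_by_category(records):
    """Average rating per category from (category, rating) pairs (hypothetical helper)."""
    totals = defaultdict(lambda: [0.0, 0])
    for category, rating in records:
        totals[category][0] += rating
        totals[category][1] += 1
    return {category: total / count for category, (total, count) in totals.items()}
```

The spread of 0.944 is the amount by which a small growth fund would have had to beat its own category mean just to match the average large value fund.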

The conclusion is inescapable. Star risk-adjusted ratings summarize both the performance of the domain in which a fund operates and the performance of the fund relative to others in its domain. Selecting funds with high star risk-adjusted ratings is far more likely to result in the choice of funds in categories with strong recent performance than of funds in categories with poor recent performance. In many cases this is likely to be a poor approach to fund selection. There is little evidence that investment in categories that have done well in the last three years is a superior investment strategy.

 

Star Ratings and Sharpe Ratios

We have shown that ranking funds within a category based on their category risk-adjusted ratings gives results that are similar to those obtained when ranking on the basis of excess return Sharpe ratios, at least when the average performance of the funds in the category has been good. The theoretical and empirical reasons for such results also apply with regard to star risk-adjusted ratings, but the greater diversity of risks and returns within an asset class is likely to make the correspondence somewhat weaker. Differences will also be greater due to the inclusion of load charges in one measure but not the other. Nonetheless, the two give quite similar results, as shown in the figure below.

In this case the correlation between the two measures is 0.955 -- lower than that obtained when ranking within categories, but quite high nonetheless. Star risk-adjusted ratings, like category risk-adjusted ratings, appear to be scale-independent measures in utility-based measures' clothing. As such, they share both the good and the bad attributes of the more traditional excess return Sharpe ratios.
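A comparison of this kind can be sketched as follows: compute each fund's excess return Sharpe ratio (mean excess return divided by the standard deviation of excess return), then correlate the ratios with the funds' ratings. The monthly returns and ratings below are invented for illustration and will not reproduce the correlation reported above; the function names are hypothetical.

```python
import math
import statistics

def sharpe_ratio(excess_returns):
    """Mean excess return divided by the sample standard deviation of excess return."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly excess returns and risk-adjusted ratings for three funds.
monthly_excess = {
    "A": [0.02, 0.01, 0.03, 0.02],
    "B": [0.01, -0.01, 0.02, 0.00],
    "C": [-0.01, 0.02, -0.02, 0.01],
}
ratings = {"A": 0.8, "B": 0.1, "C": -0.5}

names = sorted(monthly_excess)
sharpes = [sharpe_ratio(monthly_excess[name]) for name in names]
corr = pearson(sharpes, [ratings[name] for name in names])
```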

 

Star Ratings and Efficient Mean-variance Combinations

Thus far we have concentrated on the risk-adjusted ratings that are used to award stars to funds. We turn finally to the stars themselves. The figure below plots the annualized mean monthly excess return and annualized monthly standard deviation of each fund in our three-year sample. Funds awarded 5 stars are shown by blue circles, those awarded 4 stars by red circles, 3 stars by green plus signs, 2 stars by blue plus signs and 1 star by red plus signs. Despite the many differences in underlying procedures, the end results are remarkably similar. For each level of risk, the 5-star funds generally provided the best performance, the 4-star funds the next best, and so on down to the 1-star funds, which did worst. Traditional mean-variance analysis would thus have reached similar conclusions concerning performance.

Average Excess Return and Standard Deviation of Excess Return by Star Rating

Finally, note that the five clusters in the figure appear to fall along more or less straight lines. This is consistent with the fact that a risk-adjusted rating is equivalent to the sum of normalized measures of risk and return. Consider instead a set of rays from the origin, each representing the locus of funds with the same excess return Sharpe ratio. As can be seen visually, funds with 1 star will generally have the lowest Sharpe ratios, those with 2 stars the next lowest, and so on up to the 5-star funds, which will tend to have the highest such ratios. This shows again that within this range of outcomes, rankings by Sharpe ratios are similar to those based on Morningstar's more complex measure.
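The ray construction can be made concrete with a small sketch. It assumes the common annualization convention of multiplying the monthly mean by 12 and the monthly standard deviation by the square root of 12; the text does not state which convention was used, and the return series and function names are hypothetical.

```python
import math
import statistics

def annualized_point(monthly_excess_returns):
    """Return (annualized standard deviation, annualized mean) of monthly excess returns.

    Assumes mean * 12 and standard deviation * sqrt(12), a common convention.
    """
    mean = 12 * statistics.mean(monthly_excess_returns)
    sd = math.sqrt(12) * statistics.stdev(monthly_excess_returns)
    return sd, mean

def ray_slope(point):
    """Slope of the ray from the origin through a (risk, return) point: the Sharpe ratio."""
    sd, mean = point
    return mean / sd

# A hypothetical fund, and the same fund with every excess return doubled.
base = [0.02, -0.01, 0.03, 0.00]
scaled = [2 * r for r in base]

p_base = annualized_point(base)
p_scaled = annualized_point(scaled)
```

Doubling every excess return doubles both coordinates, so the two points lie on the same ray from the origin and share one Sharpe ratio. Funds on steeper rays have higher ratios.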