Copyright 1997 by the American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use. American Medical Association, 515 N. State St, Chicago, IL 60610. Volume 278(9), 3 September 1997, pp 729-732

The Effects of the Change in the NRMP Matching Algorithm

[Original Contribution]

Roth, Alvin E. PhD; Peranson, Elliott MASc

From the Department of Economics, University of Pittsburgh, Pittsburgh, Pa (Dr Roth), and National Matching Services Inc, Toronto, Ontario (Mr Peranson).

Outline

Abstract
BACKGROUND

Stable Matchings in Simple and Complex Markets
The Former NRMP Algorithm

DESIGN OF THE NEW APPLICANT-PROPOSING ALGORITHM
DESCRIPTIVE STATISTICS AND MATCH RESULTS
DIFFERENCES IN THE MATCHES PRODUCED BY THE 2 ALGORITHMS
DIFFERENCES IN SENSITIVITY TO PARTICIPANT BEHAVIOR
WHY THE DIFFERENCES ARE SMALL: INSIGHTS FROM SIMPLE MARKETS
CONCLUSION
REFERENCES

Graphics

Table 1
Table 2
Table 3

Abstract ^

Context: Following 2 years of heated controversy about the resident match, the National Resident Matching Program (NRMP) recently voted to replace the existing matching algorithm with a newly designed applicant-proposing algorithm.

Objective: To design an applicant-proposing algorithm for the match and compare it with the existing NRMP algorithm to determine how many applicants and residency programs could be expected to receive better or worse matches from the 2 algorithms, how the different algorithms influence the opportunity for strategic behavior, and what advice can be given to participants.

Design: Computational experiments compared the newly designed applicant-proposing algorithm with the existing NRMP algorithm on the rank order lists (ROLs) submitted by all applicants and residency programs in the 1987 and 1993 through 1996 NRMP matches.

Results: Differences in the matchings produced by the 2 algorithms are small: fewer than 1 in 1000 applicants would have received a different match. Most (but not all) of the few applicants who are matched to different positions by the 2 algorithms do better when the applicant-proposing algorithm is used; the opposite is true for programs. Opportunities for profitable strategic behavior are very rare for both applicants and programs under either algorithm. With either algorithm, both applicants and programs can be advised that trying to get a preferred match by behaving strategically is far more likely to harm than to help them.

Conclusions: The existing NRMP algorithm and the newly designed applicant-proposing algorithm perform similarly. Both algorithms make it sensible for applicants and residency programs to arrange their ROLs based solely on their preferences for possible matches. The choice of algorithms will systematically affect the matches of only a small group of applicants (<0.1%). The NRMP's recent decision to use the applicant-proposing algorithm starting in 1998 reflects a judgment about the impact of this difference on applicants and programs.

JAMA.1997;278:729-732

IN MAY OF 1997, the board of directors of the National Resident Matching Program (NRMP) voted to replace the existing matching algorithm with a newly designed applicant-proposing algorithm beginning in 1998. This resolved 2 years of heated and sometimes acrimonious controversy about the match, beginning most visibly with an exchange in Academic Medicine in 1995. [1-4] In reaction to this exchange, groups such as the American Medical Student Association, the Public Citizen's Health Research Group, and the Medical Student Section of the American Medical Association (AMA) advocated that the matching algorithm be changed or that the description of the match be changed to give applicants more accurate advice about how to participate. [5,6] At issue was how much the former (pre-1998) matching algorithm is biased toward residency programs at the expense of applicants, whether it would be feasible to replace that algorithm with one more favorable to applicants, how much this would change the outcome of the match, and what advice can be given to applicants about how to fill out their rank order lists (ROLs). The NRMP asked the authors of this article to study these questions.

This article addresses the above questions by comparing the former NRMP algorithm with the newly designed applicant-proposing algorithm capable of handling the NRMP's special features. These special features lie at the heart of the controversy, because they make the NRMP more complex than the simpler "2-sided matching markets" primarily discussed in the game theory literature, to which all parties in the controversy have referred. [7] Critics of the NRMP interpreted this literature as showing that the existing algorithm was biased and could easily be improved and that the advice given by the NRMP to applicants was flawed. The most impassioned of the criticisms went on to suggest that there was an element of perfidy in the reluctance of the NRMP to make the changes demanded and in the history of how the match had been constructed and advertised to past generations of participants. In contrast, defenders of the NRMP were inclined to view the matching algorithm as having evolved from the traditional recruitment process in which programs offer positions to applicants and as having adapted to changes in the medical market, acquiring special features as necessary.

The NRMP match is complicated by 4 special features that create linkages between applicants or programs that do not exist in simpler matching markets. These are the following: couples seeking 2 related positions; applicants who simultaneously seek a program year 2 (PGY-2) position and a complementary program year 1 (PGY-1) position from supplemental ROLs; residency programs whose positions revert to other programs if unfilled; and programs that request an even or odd number of matches. Because of these match variations, the theory for simpler markets provides only an approximate guide to the NRMP.

Computational experiments were therefore conducted on the ROLs submitted by all applicants and residency programs in the matches from 1987 and 1993 through 1996. The recent matches reflect contemporary preferences; 1987, which had the lowest rate of unmatched US seniors in the available dataset (6.0%, as opposed to the historically high rate of 7.5% for 1996), provides a comparison over a longer period.

BACKGROUND ^

Stable Matchings in Simple and Complex Markets ^

Centralized matching algorithms are mostly unsuccessful unless they produce "stable" matchings. [8-10] Briefly, a matching between residents and residency programs is stable if there are no applicant-program pairs such that the applicant prefers the program to his or her current match and the program also prefers the applicant to one of its current matches. [11] The former NRMP algorithm produces stable matches. [8-10] So, too, do the algorithms used in some regions of the National Health Service in the United Kingdom. Other regions tried algorithms that produced unstable matchings; these mostly failed and were abandoned. [9,12,13] Both the present study and the controversy that preceded it focus on algorithms that produce stable matchings.

For simple markets without the NRMP match variations, we know the following about stable matchings and algorithms [7]:

1. At least 1 stable matching can always be found, no matter what ROLs are submitted.

2. The set of stable matchings always contains a "program-optimal" stable matching, and an "applicant-optimal" stable matching.

3. The program-optimal stable matching is produced by a "program-proposing" algorithm, in which residency programs make offers to applicants, starting from the top of each program's ROL and allowing applicants to hold at any point in the algorithm the most preferred offer among those so far received. The applicant-optimal stable matching is produced by an analogous "applicant-proposing" algorithm.

4. The same applicants are unmatched and the same positions are unfilled at every stable matching.

5. When the applicant-proposing algorithm is used, no applicant can possibly improve his match by submitting a ROL different from his true preferences. No parallel assertion can be made about residency programs offering more than 1 position.

When the NRMP match variations are present, none of these 5 assertions are correct in all cases. Markets with match variations can be constructed for which no stable matchings exist [8] or no optimal stable matchings for either side exist (even when stable matchings exist [14]); therefore, no algorithm always produces an optimal stable matching for either side of the market. Furthermore, different stable matchings may have different applicants unmatched and positions unfilled, and no algorithm always guarantees applicants that stating their true preferences is an optimal strategy.

A major focus of this study was to assess the extent to which these possibilities occur in the NRMP matches. While a stable matching has been found in every NRMP match at least since the mid 1970s, it appears that no stable matching is precisely program optimal or applicant optimal in any of the years examined. However, we will see that applicant-proposing and program-proposing algorithms continue to perform approximately as they do in simple markets.

The Former NRMP Algorithm ^

The NRMP algorithm used through the 1997 match handled match variations through a 3-phase process. The first phase produces an initial match by ignoring most match variations using a program-proposing algorithm modified to let couples temporarily hold many offers. This match will generally be unstable because of the way couples are handled and because other match variations are ignored. The second phase of the algorithm identifies instabilities. The third phase fixes these instabilities to produce a stable match. The former NRMP algorithm is a hybrid: it is program proposing in its first phase (which performs the bulk of the matching) and applicant proposing in some parts of its third phase. [15]

DESIGN OF THE NEW APPLICANT-PROPOSING ALGORITHM ^

A conceptual design for the applicant-proposing algorithm, based on the algorithm outlined by Roth and Vande Vate [16] and on components of the existing algorithm, was formulated and circulated for comment. [15] To code a working algorithm, choices had to be made concerning the sequencing of operations. Sequencing decisions can be shown to have no effect on the outcome of simple matches but could effect the outcome when match variations are present (because optimal sable matches may not exist). Computational experiments to test the effects of sequencing were conducted using data from 3 NRMP matches: 1993, 1994, and 1995. The effects of sequencing on the match produced were found to be both unsystematic and tiny; the modal number of applicants affected when sequencing effects were found was 2. Based on this result, sequencing decisions were made that resulted in the most direct convergence to a stable match. A full account of the design process and the results of the sequencing experiments is available elsewhere. [17]

DESCRIPTIVE STATISTICS AND MATCH RESULTS ^

(Table 1) describes the NRMP matches in the years studied. About 4% of the more than 20 000 applicants each year participate as couples, and 8% to 12% submit supplemental ROLs. In the 1990s, about 7% of the 3000 to 4000 programs that participate have positions that revert to other programs if they remain unfilled. Match variations are clearly a substantial part of the match.

Table 1.-Descriptive Statistics and Original National Resident Matching Program Match Results*

DIFFERENCES IN THE MATCHES PRODUCED BY THE 2 ALGORITHMS ^

(Table 2) compares the matches produced by the former NRMP algorithm and the new applicant-proposing algorithm. Less than 0.1% of applicants are affected by the change in algorithms, and of these, most (but not all) prefer the applicant-proposing algorithm.

Table 2.-Comparison of Results Between Original National Resident Matching Program (NRMP) Algorithm and Applicant-Proposing Algorithm

Equally few programs are affected by the change of algorithms-these constitute about 0.5% of all programs. Most, but not all, of these programs prefer the former NRMP algorithm, but in 1994 and 1996, slightly more programs would even have preferred the applicant-proposing algorithm to the former algorithm. In most programs that receive a different match, only 1 applicant differs between the matches produced by the 2 algorithms. The majority of differences involve filling a position with a different applicant; very few positions go from filled to unfilled or vice versa.

To see how match variations cause some applicants to do worse and some programs to do better with the applicant-proposing algorithm, suppose that switching to the applicant-proposing algorithm causes applicant A to improve her match from her second choice to her first choice. It may be that her first choice now requires a supplemental PGY-1 match. If this new supplemental match displaces another applicant less preferred by the PGY-1 program, that applicant is forced to go farther down his list (ie, does worse), while that PGY-1 program does better.

DIFFERENCES IN SENSITIVITY TO PARTICIPANT BEHAVIOR ^

The above comparisons involve ROLs submitted for matches made using the former NRMP algorithm. A comparison of the 2 algorithms also requires consideration of whether participants might have reason to submit different ROLs if the new algorithm were used instead. For this purpose, we considered how many participants could have favorably influenced the match under either algorithm by submitting different ROLs.

For matching markets without match variations, there are strategic opportunities for applicants when the program-proposing algorithm is used. [7] Applicants who can do better by not submitting their true preferences as ROLs can do so by submitting a truncation of their true preferences. Furthermore, the only applicants who can do better by not submitting their true preferences are those who would have received a different match from the applicant-proposing algorithm. This identifies an upper bound on the number of applicants who could possibly profit from strategic behavior of this sort. Note that even an applicant who could possibly profit from submitting a strategically chosen ROL would not generally have the information needed either to know this or to identify the optimal ROL to submit. If an applicant submitted a truncation that was 1 program too short, then that applicant would become unmatched.

The case of residency programs that offer more than 1 position is not so simple, even in simple matches. Programs may, in theory, profit from both truncating their ROLs and reducing the number of positions they submit to the match, either by making early arrangements with some applicants or by withholding positions to be filled by unmatched applicants after the match (T. Sonmez, unpublished data, 1996). [18]

To see how well the theory for simple markets approximates the NRMP and to assess the size of the effects again requires computational experiments. In view of the very small differences observed in the outcomes of the 2 algorithms, it is reasonable to conjecture that, even in the more complex NRMP market, the opportunities for profitable strategic behavior will be very few under either algorithm. The computational experiments that permitted this conjecture to be verified are described in detail elsewhere. [17] These experiments produced upper bounds on the numbers of applicants and programs for whom strategic behavior could even be potentially profitable.

The truncation experiments with applicants' ROLs yield the upper bounds shown in (Table 3). The number of applicants who could even potentially benefit from truncating their ROLs under the former NRMP algorithm in each year is almost exactly equal to the number of applicants who received a preferred match under the applicant-proposing algorithm (Table 2).

Table 3.-Upper Limit of the Number of Applicants Who Could Benefit by Truncating Their Lists at 1 Above Their Original Match Point

Similar results are obtained by computational experiments with truncations of programs' ROLs and by reductions in the number of positions they offer through the match. [17] The number of participants who can even potentially profit from these kinds of strategic behavior is vastly smaller than those who have no such prospects. For both applicants and programs, attempts to alter match results by strategic behavior in stating ROLs or capacities is far more likely to hurt than to help.

WHY THE DIFFERENCES ARE SMALL: INSIGHTS FROM SIMPLE MARKETS ^

The small differences between the 2 algorithms suggests that, in each of the years studied, the set of stable matchings is small, as measured by the number of participants who receive different matches at different stable matchings. [7] Computational experiments on hypothetical markets without match variations can help explain why this is so. For this purpose, consider simple markets with n positions and n applicants, as n varies from the size of small specialty markets and submarkets to the size of the full NRMP match.

When preferences are highly correlated-when similar programs tend to agree which are the most desirable applicants, and applicants tend to agree which are the most desirable programs-the set of stable matchings is small. (For example, if all programs had the same ranking of applicants and all applicants had the same ranking of programs, then the only stable matching would be the one that filled the highest ranked program with the highest ranked applicants, the second program with the next highest group of applicants,) and (so on.) As correlation decreases, the set of stable matchings grows, and more participants would be matched differently by the 2 algorithms, independent of the size of the market.

But the size of the market also plays a critical role. Consider the case of uncorrelated preferences (so the set of stable matchings is large). If every applicant could somehow interview for all positions and rank order them for the match, the percentage of applicants who could get different stable matchings would grow as the number of applicants and positions grew. But in a real market, there is a limit to how many interviews an applicant can go on or a program can conduct. Taking this into account, computational experiments show that the set of stable matchings becomes very small as the market becomes large.

Specifically, let k equal the number of positions a candidate can interview for and put on a ROL, and let n equal the number of applicants and positions. Computational experiments on randomly generated simple markets show that, even when preferences are uncorrelated, if k is held fixed at realistic levels, then the set of stable matchings becomes small as n becomes large. For example, if k=15 (not an unreasonable approximation for the NRMP) and n is greater than 10 000, fewer than 0.1% of applicants would receive a different match from the applicant-proposing and program-proposing algorithms. Even with uncorrelated preferences, this simple market exhibits the same 1 in 1000 order of magnitude seen in the NRMP. For simple markets the size of small specialty matches or submarkets with n of about 100 positions, if applicants rank order no more than 10 programs, only about 2% of applicants receive different matches from the 2 algorithms. Because preferences are not uncorrelated in the medical matches, the orders of magnitude of the effects seen in the actual matches are comparable to simple matches with similar k and n.

This is important for the present study because, in this theoretical exercise, we can look at agents' true preferences whereas in the study of real matches, we use as data the submitted ROLs. A skeptical interpretation of our results could be that perhaps large opportunities for strategic manipulation exist, but these aren't detectable in the ROLs submitted to the match because participants have already behaved strategically in an optimal way. If this hypothesis were correct, we should also find large differences when we look at the size of the set of stable matchings in these hypothetical markets. Instead, we see that simple markets provide an explanation of not only the direction of the effects we have been examining, but also their small size.

CONCLUSION ^

There is now a well-tested applicant-proposing algorithm capable of handling the NRMP match variations (more than 500 matches on ROL data were conducted during this study). Under either algorithm, both applicants and programs can be confidently advised that, when deciding what ROLs to submit, they are extraordinarily unlikely to be able to influence the match in their favor by submitting a list different from their true preferences. This result is supported both by the computational studies conducted on previous matches and by the new results for simple markets.

The choice between the 2 algorithms does not involve consequential differences for the match as a whole except that a small group of applicants can be expected to do better under the applicant-proposing algorithm, and a small number of programs may do worse. The policy decision to adopt the newly designed applicant-proposing algorithm focused on this difference between the 2 algorithms and its impact on applicants and programs.

The work discussed in "Why the Differences Are Small: Insights From Simple Markets" is not part of the study commissioned by the NRMP but part of the academic investigation of theoretical issues motivated by the study. Computational support for that investigation was provided by Aljosa Feldin at the University of Pittsburgh.

Corresponding author: Alvin E. Roth, PhD, Department of Economics, University of Pittsburgh, Pittsburgh, PA 15260 (e-mail: alroth+@pitt.edu).

REFERENCES ^

1. Peranson E, Randlett RR. The NRMP matching algorithm revisited: theory versus practice. Acad Med. 1995;70:477-484. [Context Link]

2. Peranson E, Randlett RR. Comments on Williams' 'A Reexamination of the NRMP Matching Algorithm.. Acad Med 1995;70:490-494. [Context Link]

3. Williams KJ. A reexamination of the NRMP matching algorithm. Acad Med. 1995;70:470-476. [Context Link]

4. Williams, KJ. Comments on Peranson and Randlett's 'The NRMP Matching Algorithm Revisited: Theory versus Practice.. Acad Med 1995;70:485-489. [Context Link]

5. American Medical Student Association and Public Citizen Health Research Group. Report on hospital bias in the NRMP [American Medical Student Association site]. September 1995. Available at: http://pubweb.acns.nwu.edu/approximately alan/nrmp2.html. Accessed August 1, 1997. [Context Link]

6. American Medical Association-Medical Student Section. American Medical Association-Medical Student Section resolutions for the 1995 interim meeting [Texas Medical Association Web site]. Available at: http://www.bcm.tmc.edu/ama-mss/i95res.htm#11. Accessed August 1, 1997. [Context Link]

7. Roth AE, Sotomayor M. Two-Sided Matching: A Study in Game-Theoretic Modeling and Analysis. New York, NY: Cambridge University Press; 1990. [Context Link]

8. Roth AE. The evolution of the labor market for medical interns and residents: a case study in game theory. J Political Economy. 1984;92:991-1016. [Context Link]

9. Roth AE. New physicians: a natural experiment in market organization. Science. 1990;250:1524-1528. [Context Link]

10. Roth AE. The National Resident Matching Program as a labor market. JAMA. 1996;275:1054-1056. Ovid Full Text [Context Link]

11. Gale D, Shapley L. College admissions and the stability of marriage. Am Math Mon. 1962;69:9-15. [Context Link]

12. Roth A. A natural experiment in the organization of entry level labor markets: regional markets for new physicians and surgeons in the UK. Am Econ Rev. 1991;81:415-440. [Context Link]

13. Roth A, Xing X. Jumping the gun: imperfections and institutions related to the timing of market transactions. Am Econ Rev. 1994;84:992-1044. [Context Link]

14. Aldershof B, Carducci OM. Stable matchings with couples. Discrete Appl Math. 1996;68:203-207. [Context Link]

15. Roth A. Interim report 1: evaluation of the current NRMP algorithm, and preliminary design of an applicant-processing algorithm [design review of the National Resident Matching Program home page]. March 11, 1996. Available at: http://www.pitt.edu/approximately alroth/interim1.html. Accessed August 1, 1997. [Context Link]

16. Roth AE, Vande Vate JH. Random paths to stability in 2-sided matching. Econometrica. 1990;58:1475-1480. [Context Link]

17. Roth AE. Report on the design and testing of an applicant proposing matching algorithm, and comparison with the existing NRMP algorithm [design review of the National Resident Matching Program home page]. December 6, 1996. Available at: http://www.pitt.edu/approximately alroth/phase1.html. Accessed August 1, 1997. [Context Link]

18. Sonmez T. Manipulation via capacities in two-sided matching markets. J Econ Theory. In press. [Context Link]

Algorithms; Computers; Education, Medical, Graduate; Internship and Residency; National Resident Matching Program

Accession Number: 00005407-199709030-00026