Stable Matching Problem

6 minute read

The Stable Matching Problem (SMP) is a classic mathematics problem that involves combinatorial theory of ordered sets. Though SMP was initially described in the context of marriage, it has applications in other fields such as matching medical students to residency programs and college admissions. The National Resident Matching Program (NRMP) and the Canadian Resident Matching Service (CaRMS) are two examples of residency admission algorithms that aim to optimize SMP.

In the example of marriage, the problem involves two equal sized sets of women and men. Each woman and man is asked to list members of the opposite sex in order of preference. For example:

The challenge is to pair men and women in a way that takes into account their preferences. A matching is deemed “unstable” if:

  • woman A and man A are paired but woman A prefers man B and
  • man B is paired with woman B but prefers woman A over woman B

This would be considered an unstable marriage since woman A and man B simultaneously prefer each other over their respective matches. The goal is to configure optimal assignments such that all matchings are stable.

In 1962, David Gale and Lloyd Shapely developed an algorithm to solve this problem. The algorithm works under the following assumptions:

  • Women and men are binary and heterosexual
  • Women can only propose to men, not vice versa.
  • If a woman proposes to a man and that man is available, then the woman and man become engaged.
  • If a man gets multiple proposals, he rejects all but his top choice.
  • A rejected woman proposes to her next best choice regardless of whether that man is currently engaged.
  • If an engaged man receives a proposal from a woman that he prefers over his current woman, he breaks his proposal and becomes engaged to his preferred choice.
  • These steps are repeated until all proposals (“matchings”) are stable.

The code below is an implementation of the Gale-Shapely algorithm in Python. We have 5 women (Juliette, Maeve, Sadie, Libby, Bianca) and 5 men (Romeo, Baxter, Asher, Frank, Leo), and our goal is to pair them in such a way that all matchings are stable.

Here are the ordered preferences for women and men, respectively:

women = {"Juliette":["Romeo", "Baxter", "Frank", "Leo", "Asher"],
         "Maeve":["Baxter", "Leo", "Asher", "Romeo", "Frank"],
         "Sadie":["Baxter", "Asher", "Leo", "Frank", "Romeo"],
         "Libby": ["Leo", "Frank","Romeo","Asher","Baxter"],
         "Bianca":["Asher", "Baxter", "Frank", "Romeo","Leo"]}

men = {"Romeo":["Libby", "Maeve","Sadie","Bianca", "Juliette"],
       "Baxter":["Juliette", "Maeve", "Bianca", "Sadie","Libby"],
       "Asher": ["Sadie", "Libby", "Bianca", "Juliette", "Maeve"],
       "Frank": ["Juliette", "Bianca","Libby","Maeve","Bianca"],
       "Leo":["Juliette", "Libby", "Bianca", "Sadie","Maeve"]}

The function below is built around assumptions defined by Gale-Shapely’s solution to find optimal matchings between “suitors” (makes the proposal) and “reviewers” (receives the proposal).

def matchmaker(suitor_picks, reviewer_picks):
    suitors = list(suitor_picks.keys())
    matching = {s: None for s in suitors}
    
    while suitors:
        s = suitors.pop(0)
        r = suitor_picks[s][0]
        if r not in matching.values():
            matching[s] = r
        else:
            for suitor, reviewer in matching.items():
                if reviewer == r:
                    r_partner = suitor
            if reviewer_picks[r].index(s) < reviewer_picks[r].index(r_partner):
                matching[r_partner] = None
                matching[s] = r
                suitors.append(r_partner)
            else:
                suitor_picks[s].remove(r)
                suitors.append(s)
    return matching

Let’s test out the function. In the first trial, men are the “suitors” who propose while women are the “reviewers” who can either accept or reject a given suitor’s proposal.

Trial 1: men are the suitors, women are the reviewers

matchmaker(men,women)
{'Romeo': 'Maeve',
 'Baxter': 'Juliette',
 'Asher': 'Sadie',
 'Frank': 'Bianca',
 'Leo': 'Libby'}

In the second trial, the roles are flipped such that women are the ones proposing to men.

Trial 2: women are the suitors, men are the reviewers

matchmaker(women,men)
{'Juliette': 'Romeo',
 'Maeve': 'Baxter',
 'Sadie': 'Asher',
 'Libby': 'Leo',
 'Bianca': 'Frank'}

As shown above, the matchings are different based on whether the men vs. women were “suitors” vs. “reviewers”. In both situations, suitors are assigned their best possible match. A suitor’s final matching might not be their top choice, but this is because their top choice has rejected him or her along the way for a better suitor. Furthermore, reviewers (i.e., the ones receiving the proposal) are assigned their worst possible match, while still maintaining stable matchings. While it may seem like the person receiving the proposal has more choice in their ultimate partner, the Gale-Shapely algorithm demonstrates that the suitor who makes the proposals will have their optimal match.

The beauty of this algorithm is that it will always find stable matchings regardless of population size.

Extending the Gale-Shapely algorithm to medical residency

The medical residency match process is a variant of SMP in which a hospital is “polygamous” and can accept multiple residents at a time. We no longer assume that there are an equal number of residents and hospitals. It’s possible that there are more applicants than residency spots at hospitals. We also no longer assume that all residents will rank all possible hospitals in their preference list, and vice versa.

Let’s say we have 5 applicants (Chen, Feldman, Smith, Gianotti, and Markov) and 4 hospitals (Harvard, Duke, McGill, Stanford), and each hospital has 2 residency spots available.

residents = {'Chen':['Harvard', 'Stanford','Duke'],
            'Feldman':['Duke','Stanford','Harvard'],
            'Smith':['Harvard','Stanford','McGill'],
            'Gianotti':['Stanford'],
            'Markov':['Harvard','McGill','Stanford'],
            'Anderson':['McGill','Duke','Harvard']}

hospitals = {'Harvard':['Chen','Gianotti','Feldman'],
            'Duke':['Feldman','Smith','Anderson'],
            'McGill':['Gianotti','Chen','Markov'],
            'Stanford':['Smith','Anderson','Chen']}

In this scenario, residents are the “suitors” and hospitals are the “reviewers”. The algorithm works as follows:

  • If a resident is not ranked by a given hospital, remove that hospital from the preference list of the resident
  • Each resident “proposes” to their first choice hospital
  • If a hospital has multiple offers, the hospital rejects all but its first choice
  • Each rejected resident then proposes to their next top choice
  • If an occupied hospital recieves an offer from a new resident that is higher on the list than its current resident, then the hospital breaks its original offer and becomes matched to its preferred choice.

These steps are almost identical to the original Gale-Shapely algorithm:

import numpy as np

def residency_matching(resident_picks, hospital_picks, capacities):
    residents = list(resident_picks.keys())
    resident_matching = {r: None for r in resident_picks.keys()}
    hospital_matching = {h: [] for h in hospital_picks.keys()}

    while residents:
        r = residents.pop(0)
        # let r_p be the preferences of resident r
        r_p = resident_picks[r]
        while r_p and (not resident_matching[r]):
            if r not in hospital_picks[r_p[0]]:
                r_p.remove(r_p[0])
            if r_p:
                h = r_p[0]
                # let h_p be the preferences of hospital h
                h_p = hospital_picks[h] 
                # let h_matches be the matched residents for hospital h 
                h_matches = hospital_matching[h]
                if len(h_matches) < capacities[h]:
                    resident_matching[r] = h
                    hospital_matching[h] += [r]
                else:
                    # r_rank is a given resident's rank within a hospital's preference list
                    r_rank = h_p.index(r)
                    worst_rank = np.max([h_p.index(i) for i in h_p if i in h_matches])
                    worst_match = h_p[worst_rank]
                    if r_rank < worst_rank:
                        hospital_matching[worst_match] = None
                        h_matches.remove(worst_match)
                        resident_picks[worst_match].remove(r)
                        residents.append(worst_match)
                        hospital_matching[r] = h
                        h_matches += [r]
                    else:
                        h_p.remove(r)
                        r_p.remove(h)

    return resident_matching, hospital_matching

Let’s assume that each hospital has 2 residency spots.

capacities = {'Harvard' : 2, 'Duke': 2, 'McGill': 2, 'Stanford': 2}
resident_matches, hospital_matches = residency_matching(residents, hospitals, capacities)

residency_matching() returns two dictionaries: one that describes a resident’s final assignment and the other that describes a hospital’s assigned residents. Here are the results:

resident_matches
{'Chen': 'Harvard',
 'Feldman': 'Duke',
 'Smith': 'Stanford',
 'Gianotti': None,
 'Markov': 'McGill',
 'Anderson': 'Duke'}
hospital_matches
{'Harvard': ['Chen'],
 'Duke': ['Feldman', 'Anderson'],
 'McGill': ['Markov'],
 'Stanford': ['Smith']}

Note that in this algorithm, the residents are the “suitors” so ultimately, the final results will reflect their best possible match. That being said, if a resident has ranked a set of hospitals but those hospitals have not reciprocated, this means that the resident’s preference list is empty. As a result, this resident will go “unmatched”. For example, Gianotti only ranked Stanford in his preference list but Stanford did not list Gianotti. Though Harvard and McGill have ranked Gianotti and have capacity for him, Gianotti is ultimately left without an assigned hospital. In a typical admissions process, there is a second round of the algorithm in which unmatched residents have a second chance at matching with hospitals that have available residency spots. Despite this second round, the number of residents has surpassed hospital capacity in Canada, which means that some residents will always go unmatched each year (1, 2, 3).