Massey Centrality

In 1997, Kenneth Massey, then an undergraduate, created a method for ranking college football teams. We wrote about this method, which uses the mathematical theory of least squares, as his honors thesis.

Math

The main idea of Massey's method is enclosed in the following equation: $$r_i - r_j = y_k$$ where $r_i$ and $r_j$ are the ratings of teams $i$ and $j$ and $y_k$ is the absolute margin of victory for game $k$ between teams $i$ and $j$. If there are $n$ teams who played $m$ games, we have a linear system: \begin{equation} \label{Massey1} X r = y \end{equation} where $X$ is a $m \times n$ matrix such the k-th row of $X$ contains all 0s with the exception of a 1 in location $i$ and a $-1$ in location $j$, meaning that team $i$ beat team $j$ in match $k$ (if match $k$ ends with a draw, either $i$ or $j$ location can be assigned $1$, and the other $-1$). Since typically $m >> n$, this system is highly overdetermined and inconsistent.

Massey proposes to find the solution $r$ that minimizes: $$ ||Xr - y||_2 $$ where $||\cdot||_2$ is the norm 2, that is $||x||_2 = \sqrt{\sum_i x_{i}^{2}}$. Let $$M = X^T X$$ and $$p = X^T y.$$ Notice that \begin{equation*} M_{i,j} = \left\{ \begin{array}{ll} \text{the negation of the # of matches between } i \text{ and } j & \text{if } i \neq j,\\ \text{# of games played by } i & \text{if } i = j. \end{array} \right. \end{equation*} and $p_i$ is the signed sum of point spreads of every game played by $i$. The Massey's method is then defined by the following linear system: \begin{equation} \label{Massey2} M r = p \end{equation} The solution of the above system to the solution that minimizes $||X r - y||_2$.

We observe that the Massey's team ratings are in fact interdependent. Indeed, Massey's matrix $M$ can be decomposed as: $$M = D - A$$ where $D$ is a diagonal matrix with $D_{i,i}$ equal to the number of games played by team $i$, and $A$ is a matrix with $A_{i,j}$ equal to the number of matches played by team $i$ against team $j$. Hence, Massey's linear system is equivalent to: \begin{equation} \label{Massey3} Dr - Ar = p \end{equation} or, equivalently: \begin{equation} \label{Massey4} r = D^{-1} (A r + p) = D^{-1} A r + D^{-1} p \end{equation} That is, for any team $i$: $$r_i = \frac{1}{D_{i,i}} \sum_j A_{i,j} r_j + \frac{p_i}{D_{i,i}}$$ This means that the rating $r_i$ of team $i$ is the sum $r^{(1)}_{i} + r^{(2)}_{i}$ of two meaningful components:

  1. the mean current rating of teams that $i$ has matched: $$r^{(1)}_{i} = \frac{1}{D_{i,i}} \sum_j A_{i,j} r_j;$$
  2. the mean point spread of team $i$: $$r^{(2)}_{i} = \frac{p_i}{D_{i,i}}.$$
Finally, it is interesting to analyse what happens to Massey's system at the end of the season, assuming that all $n$ teams matched all other teams once. In this case, the opponents rating component $$r^{(1)}_{i} = -\frac{r_i}{n-1},$$ where we have used the fact that $\sum_i r_i = 0$, and the point spread component $$r^{(2)}_{i} = \frac{p_i}{n-1},$$ hence $$r_i = r^{(1)}_{i} + r^{(2)}_{i} = -\frac{r_i}{n-1} + \frac{p_i}{n-1},$$ and thus $$r_i = \frac{p_i}{n}.$$ The same result is obtained noticing that $$M p = (D-A) p = (n-1) p - A p = np - (p + Ap) = np$$ where we have used the fact that $p$ sums to 0. Hence $p$ is an eigenvector of $M$ with eigenvalue $n$ and hence, once again, $r = p/n$ is a solution of the Massey's system. Hence, the final rating of a team is proportional to the final point spread of the team.

We now establish a connection between Massey's method and Katz's centrality. Observe that Massey's equation $$r = D^{-1} (A r + p) $$ is close to the equation defining Katz's centrality $$r = \alpha A r + \beta$$ where $\alpha$ is a given constant and $\beta$ is a given vector. The Katz's equation can be solved by inverting matrix $I - \alpha A$ as soon as $0 < \alpha < 1/\rho(A)$, with $\rho(A)$ the spectral radius of $A$. In particular, making the reasonable assumption that all teams played the same number $k$ of matches, then Massey's equation simplifies:

\begin{equation} \label{Massey5} r = \frac{1}{k} (A r + p) \end{equation} which corresponds to Katz's centrality with parameters $\alpha = 1/k$ and $\beta = p/k$. Nevertheless, notice that $\rho(A) = k$ hence $\alpha = 1/\rho(A)$. Massey's equation is an instance of Katz's centrality, but cannot be computed by inverting matrix $I - \alpha A$.

Massey's rating has also an electrical interpretation in terms of resistor networks. If we view the symmetric matrix $A$ as the adjacency matrix of an undirected graph $G_A$, then $M$ is the Laplacian matrix of the graph $G_A$. The Massey's rating vector $r$ defined in system $Mr = (D-A)r = p$ is then equivalent to the potential vector over a resistor network defined by $A$ with supply vector $p$.

Following this metaphor, the resistor network has a resistor between nodes $i$ and $j$ as soon as teams $i$ and $j$ has matched with conductance (reciprocal of resistance) equal to the number $A_{i,j}$ of matches between $i$ and $j$. Sources, that are nodes $i$ with positive supply $p_i$ through which current enters the network, are teams with positive point spread, while targets, that are nodes $i$ with negative supply $p_i$ through which current leaves the network, are teams with negative point spread. Notice that current entering and leaving the network must be equal, indeed the point spread vector $p$ sums to 0. The potential $r_i$ for node $i$ corresponds to the rating of team $i$: teams with large potential are teams high in the ranking. Furthermore, the current flow through edge $(i,j)$ is, by Ohm's law, the quantity $A_{i,j} (r_i - r_j)$, which corresponds to the rating difference between teams $i$ and $j$ (which is also an estimate of the point spread in a match between $i$ and $j$) multiplied by the number of times they matched: current flows more intensely between teams of different strengths (as measured by Massey's method) that matched many times.

How do we solve the Massey's system $M r = p$? Unfortunately, $M = D - A$ is a singular matrix, hence it is not possible to solve this system computing the inverse of $M$. Assuming that $G_A$ is connected, a solution of the system can be obtained as follows: let $\overline{M}$ be the matrix obtained by replacing any row, say the last, of $M$ with a vector $e$ of all $1$'s, and let $\overline{p}$ be the vector obtained by replacing the last element of $p$ with a 0. Then, solve the system:

\begin{equation} \label{Massey2.1} \overline{M} r = \overline{p} \end{equation} Notice that the added constraint forced the ranking vector $r$ to sum to 0.

Offensive and defensive ratings

Massey also proposed an offensive rating ($o$) and a defensive rating ($d$) characterizing the offensive and defensive strengths of teams. Massey assumes that $r = o + d$, that is, the overall strength of a team is the sum of its offensive and defensive powers. Let's decompose the point spread vector $p = f - a$, where $f$ holds the total number of points scored by each team and $a$ holds the total number of points scored against each team. Then, the equations defining $o$ and $d$ are:

\begin{eqnarray} Do - Ad & = & f \\ Ao - Dd & = & a \end{eqnarray} That is, for any team $i$: \begin{eqnarray} o_i & = & \frac{1}{D_{i,i}} (\sum_j A_{i,j} d_j + f_i) \nonumber \\ d_i & = & \frac{1}{D_{i,i}} (\sum_j A_{i,j} o_j - a_i) \nonumber \end{eqnarray} This means that the offensive rating of team $i$ multiplied by the number of games played by $i$ is equal to the defensive ratings of opponents of $i$ plus the number of points scored by $i$. On the other hand, the defensive rating of team $i$ multiplied by the number of games played by $i$ is equal to the offensive ratings of opponents of $i$ minus the number of points scored against $i$. Since we know that $r = o + d$, we have that, knowing $r$, the defensive rating $d$ can be computed as: \begin{equation} \label{MasseyOD} (D + A) d = D r - f \end{equation} and, knowing $d$ and $r$, the offensive rating $$o = r - d.$$

Let $N = D + A$ and $q = Dr - f$. It might happen that matrix $N$ is singular, hence it cannot be inverted. Assuming that $G_A$ is connected, it holds that $N$ is singular precisely when the graph $G_A$ of $A$ is bipartite. If $G_A$ is bipartite, we can apply a perturbation trick similar to the one exploited for the Massey's matrix $M = D - A$. Let $V_1$ and $V_2$ be the two set of nodes of the bipartite graph $G_A$. Let $v$ be a vector such that $v_i = 1$ if $i \in V_1$ and $v_i = -1$ if $i \in V_2$. Let $\overline{N}$ be the matrix obtained by replacing any row, say the last, of $N$ with vector $v$ and $\overline{q}$ be the vector obtained by replacing the last element of $q$ with a 0. Then, a solution of system Massey's system for defensive ranking is obtained solving the following perturbed system:

\begin{equation} \label{MasseyOD.1} \overline{N} d = \overline{q} \end{equation}

Code

The following user-defined function massey computes the various Massey's ratings described above:

## Massey # ASSUMPTION: graph of A is connected # INPUT # matches: matches data frame # teams: team names # OUTPUT # r: Massey rating (r = r1 + r2; r = o + d) # r1: opponents rating # r2: spread rating # o: offensive rating # d: defensive rating # A: match matrix # g: match graph massey = function(matches, teams) { # number of teams n = length(teams) # number of matches m = dim(matches)[1] # match matrix X = matrix(0, nrow=m, ncol=n, byrow=TRUE) # margin of victory vector y = rep(0, m) # points for vector f = rep(0, n) # points against vector a = rep(0, n) # populate match matrix for (i in 1:m) { if (matches[i,"score1"] >= matches[i,"score2"]) { X[i, matches[i,"team1"]] = 1; X[i, matches[i,"team2"]] = -1; y[i] = matches[i,"score1"] - matches[i,"score2"] } else { X[i, matches[i,"team2"]] = 1; X[i, matches[i,"team1"]] = -1; y[i] = matches[i,"score2"] - matches[i,"score1"] } f[matches[i,"team1"]] = f[matches[i,"team1"]] + matches[i,"score1"] f[matches[i,"team2"]] = f[matches[i,"team2"]] + matches[i,"score2"] a[matches[i,"team1"]] = a[matches[i,"team1"]] + matches[i,"score2"] a[matches[i,"team2"]] = a[matches[i,"team2"]] + matches[i,"score1"] } # compute Massey matrix M = t(X) %*% X # compute team spread vector (or p = f-a) p = t(X) %*% y # check assumptions D = diag(diag(M), n, n) A = D - M g = graph_from_adjacency_matrix(A, mode="undirected", weighted=TRUE) if (!is.connected(g)) { cat("The graph is not connected"); return(NULL) } # perturbate the system M1 = M p1 = p M1[n, ] = rep(1, n) p1[n,1] = 0 # compute ranking r r = solve(M1, p1) # compute partial rankings r1 and r2 (r = r1 + r2) D1 = solve(D) r1 = D1 %*% A %*% r r2 = D1 %*% p # compute offensive and defensive rankings (r = o + d) N = D + A q = (D %*% r) - f # check if graph of A is bipartite (we know that the graph of A is connected) bipa = bipartite_mapping(graph_from_adjacency_matrix(A, mode="undirected")) if (bipa$res == TRUE) { # if bipartite, perturbate the system v = bipa$type v[v == TRUE] = 1 v[v == FALSE] = -1 N1 = N q1 = q N1[n, ] = v q1[n,1] = 0 d = solve(N1, q1) } else { d = solve(N, q) } o = r - d r = as.vector(r) r1 = as.vector(r1) r2 = as.vector(r2) o = as.vector(o) d = as.vector(d) names(r) = teams names(r1) = teams names(r2) = teams names(o) = teams names(d) = teams V(g)$teams = teams return(list(r=r, r1=r1, r2=r2, o=o, d=d, A = A, g = g)) }

Example

The table below shows the results of 4 matches (numbered 1, 2, 3, 4) involving 4 fictitious teams (labelled A, B, C, D): $$ \begin{array}{ccccc} match & team_1 & team_2 & score_1 & score_2 \\ \hline 1 & A & C & 2 & 0 \\ 2 & A & D & 3 & 0 \\ 3 & B & C & 1 & 1 \\ 4 & B & D & 2 & 1 \\ \end{array} $$ The match-team matrix $X$ is given below: $$ \begin{array}{l|rrrr} & A & B & C & D \\ \hline 1 & 1 & 0 & -1 & 0 \\ 2 & 1 & 0 & 0 & -1 \\ 3 & 0 & 1 & -1 & 0 \\ 4 & 0 & 1 & 0 & -1 \\ \end{array} $$ The match spread vector $y$ is: $$ \begin{array}{c|r} team & y \\ \hline 1 & 2 \\ 2 & 3 \\ 3 & 0 \\ 4 & 1 \\ \end{array} $$ The Massey's matrix $M = X^T X$ is: $$ \begin{array}{l|rrrr} & A & B & C & D \\ \hline A & 2 & 0 & -1 & -1 \\ B & 0 & 2 & -1 & -1 \\ C & -1 & -1 & 2 & 0 \\ D & -1 & -1 & 0 & 2 \\ \end{array} $$ and the team spread vector $p = X^T y$ is: $$ \begin{array}{c|r} team & p \\ \hline A & 5 \\ B & 1 \\ C & -2 \\ D & -4 \\ \end{array} $$ The resulting Massey's rating is the following: $$ \begin{array}{c|r} team & r \\ \hline A & 1.75 \\ B & -0.25 \\ C & -0.25 \\ D & -1.25 \\ \end{array} $$ Notice that teams B and C have the same rating (and hence ranking), despite the point spread of B (1) is higher than the point spread of C (-2). This is because B against C ended in a draw, B won the second match, but against the weakest team (D), while C lost the second match, but against the strongest team (A). Indeed, the rating $r$ can be decomposed in the partial ratings $r^{(1)}_{i}$ and $r^{(2)}_{i}$ that follows: $$ \begin{array}{c|c|c|c} team & r & r^{(1)}_{i} & r^{(2)}_{i} \\ \hline A & 1.75 & -0.75 & 2.5 \\ B & -0.25 & -0.75 & 0.5 \\ C & -0.25 & 0.75 & -1.0 \\ D & -1.25 & 0.75 & -2.0 \\ \end{array} $$ Hence, $B$ has a better point spread but an easier schedule with respect to $C$. Summing the two rating componentes we get the same rating for teams $B$ and $C$. Moreover, the rating $r$ can be decomposed in the offensive and defensive ratings that follows: $$ \begin{array}{c|c|c|c} team & r & o & d \\ \hline A & 1.75 & 1.875 & -0.125 \\ B & -0.25 & 0.875 & -1.125 \\ C & -0.25 & -0.125 & -0.125 \\ D & -1.25 & -0.125 & -1.125 \\ \end{array} $$ where the points for and against vectors are as follows: $$ \begin{array}{c|c|c|c} team & p & f & a \\ \hline A & 5 & 5 & 0 \\ B & 1 & 3 & 2 \\ C & -2 & 1 & 3 \\ D & -4 & 1 & 5 \\ \end{array} $$ Notice that team $B$ has a better offensive rating than $C$, but a worse defensive rating, and the sum of offensive and defensive powers is the same for both teams. Also, A and C have the same defensive rating, although A has 0 points against and C has 3 points again; again, this because the schedule of C is harder than the schedule of A.