The Discrete Rating System

I think it was in the late 1980s that I began to contemplate the basic idea behind this methodology. From my experience, it seemed that many systems which rate athletic teams tend to be overly influenced by large score differences. However, if a team loses a game it wasn't supposed to, the overall impression of that team's ability should only decrease slightly, unless the outcome is due to a key player being injured, or some other similar piece of information that most numerically based systems cannot take into account.

So I postulated that each team should be assigned an integer rating, and that a predicted score differential would be generated before each game from those ratings. If the actual result of the game was close to that prediction, then the ratings for those two teams would be considered reasonably accurate; if not, they would be modified by only +1 or -1, depending on whether the team did better or worse than expected. I experimented with this idea, first analyzing professional football data from previous NFL seasons. I later switched my focus to college football, using all the games between what I considered the major college programs from 1965 to 1990 to determine two things: the threshold for deciding whether the actual score was too far from the prediction, and how many points each integer increment in a team's rating should be worth.
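The update rule above can be sketched in a few lines of Python. Two caveats: the threshold value is kept secret by the author, so `THRESHOLD = 9` below is purely a placeholder, and adjusting both teams' ratings by one step is my reading of "they would be modified by only +1/-1" rather than a confirmed detail.

```python
POINTS_PER_STEP = 3   # each integer rating step is worth 3 points (see below)
HOME_ADVANTAGE = 3    # added to the home team's predicted margin
THRESHOLD = 9         # placeholder only; the real threshold is not published

def update_ratings(home_rating, away_rating, home_margin):
    """Return the (possibly adjusted) integer ratings after one game.

    home_margin is the actual home score minus the away score.
    """
    predicted = (home_rating - away_rating) * POINTS_PER_STEP + HOME_ADVANTAGE
    error = home_margin - predicted
    if abs(error) <= THRESHOLD:
        # Prediction close enough: both ratings considered accurate.
        return home_rating, away_rating
    if error > 0:
        # Home team did better than expected, visitor did worse.
        return home_rating + 1, away_rating - 1
    # Home team did worse than expected, visitor did better.
    return home_rating - 1, away_rating + 1
```

With these placeholder values, a game that lands within nine points of the prediction leaves both ratings untouched; anything further off nudges each rating one step.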

For example, take the game in which Ohio State traveled to Michigan for the last game of each team's 2003 regular season. Michigan was rated an 8, and Ohio State a 5. With each integer representing 3 points of a team's ability, plus the 3-point home-field advantage, this system would favor Michigan to win that game by 12 points; the actual score was 35-21. Given that result, it is fairly obvious that neither team's rating should be updated, as that prediction is quite accurate.
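The arithmetic for that 2003 game works out as follows (the ratings and the 3-point values are taken directly from the text above):

```python
michigan, ohio_state = 8, 5          # integer ratings entering the game
points_per_step, home_adv = 3, 3     # 3 points per rating step, home field 3

# Michigan is the home team, so its predicted margin of victory is:
predicted = (michigan - ohio_state) * points_per_step + home_adv
actual = 35 - 21                     # Michigan won 35-21

print(predicted, actual)             # 12 14
```

The prediction misses the actual margin by only 2 points, well within any sensible threshold, so neither rating changes.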

Using the same 3-point-per-integer rating value (and "secret" threshold), this system usually predicts about 70-78% of an entire season's games correctly, using a team's final ratings from one year to begin the next. (Also, the home-field advantage is actually 3.0001, so that a visiting team must have a rating at least 2 larger than the home team's to be favored to win.) Of course, it takes a few years for the ratings to migrate toward where they should be, but having started with the games in 1965, I don't think there would be much difference in the initial team ratings for 2004 if I went back and redid them using data from a few years earlier or later than 1965. (I hope to investigate this sort of "retroactive study" if and when time permits.)
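The 3.0001 figure is presumably a tie-breaker: a visitor rated exactly one step higher would produce a predicted margin of 3 - 3 = 0 under a flat 3-point home advantage, leaving no favorite, while the extra 0.0001 tips such games to the home team. A small sketch of that logic:

```python
HOME_ADVANTAGE = 3.0001   # slightly more than one rating step (3 points)
POINTS_PER_STEP = 3

def favorite(home_rating, away_rating):
    """Return 'home' or 'away' for the team favored by the ratings."""
    home_margin = (home_rating - away_rating) * POINTS_PER_STEP + HOME_ADVANTAGE
    return 'home' if home_margin > 0 else 'away'

# A visitor rated one step higher is still not favored on the road;
# only a two-step (or larger) edge overcomes the home advantage.
print(favorite(5, 6))   # home
print(favorite(5, 7))   # away
```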

Because many teams end up with the same rating, it was previously hard to see how to use these ratings to uniquely rank the teams the way the two human polls (AP and USA Today) do. However, an idea came to me on the weekend of 10/23/2004, and having tried it, I am reasonably happy that it reflects the relative strength of the teams as well as being a reasonable measure of a team's level of success that year. Basically, it takes the team's integer rating, subtracts the square root of the score difference for each loss (and half a point for each tie, when considering data before 1996, when overtime was enacted to break such ties), multiplies this quantity by 3 (the points per integer rating increment), and finally adds 100 to place most teams' ratings above zero. (If a tie occurs using this ranking strategy, there is a simple, similar methodology that uses only that year's games to determine a team's rating; that value, after being divided by 100, is added to each of the two tied teams to break the tie.) The ranking for NCAA Division 1-A college football can be found here.
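The ranking formula just described translates directly into code; the function name and argument layout here are my own, but the arithmetic follows the text:

```python
import math

def ranking_value(rating, loss_margins, ties=0):
    """Convert a team's integer rating into its unique ranking value.

    rating:       the team's integer rating
    loss_margins: score differences for each of the team's losses
    ties:         number of tied games (relevant only for pre-1996 data)
    """
    penalty = sum(math.sqrt(m) for m in loss_margins) + 0.5 * ties
    return (rating - penalty) * 3 + 100

# An undefeated team rated 8 scores (8 - 0) * 3 + 100 = 124, while a
# team rated 5 with one 9-point loss scores (5 - 3) * 3 + 100 = 106.
print(ranking_value(8, []))     # 124
print(ranking_value(5, [9]))    # 106.0
```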