Explanation of this Archive's Level of Confidence Rating (which ranges from 1-6)
The best (highest) rating is 1. It denotes a data file whose scores have been compared against someone else's data file for the same year. Kenneth Massey's page (http://www.masseyratings.com/data.php) links to pages containing the five most recent years in this archive (in a more user-friendly format), as well as the years after 2000. Below you will find a table that summarizes the differences I found between his scores and mine; as you can see, fewer than 1% of the scores I had entered were erroneous. (I would have given those five years a confidence rating of 3 before this comparison was done, so ratings of 1 and 2 are even "better".)
For the seasons from 1979-80 through 1999-2000, each year's schedule of games was entered first from one official NCAA record/guide book, and the scores were then taken from the subsequent year's book and placed into that schedule. (This implies that a game that was rescheduled may have an incorrect date listed.) For the seasons from 1949-50 through 1978-79, I used information collected from the individual institutions themselves (see #6 below) to create the initial data file, then filled in missing scores and dates, and validated many of the scores already entered, using data contained in the NCAA books that I have purchased over the years.
Of course, if you detect any errors, please let me know (firstname.lastname@example.org); I will try to determine whether these data files should be updated accordingly (as soon as it is feasible for me to do so).
Loren Maxwell has been very helpful to me on this project: he has used on-line newspaper sources to determine which score is correct when the NCAA guides and the institutions' yearbooks have occasionally disagreed. Loren also helped determine about 400 of the dates for games played during the 1949-50 season, which saved me a lot of time. This was necessary because every NCAA guide after 1949-50 includes the next year's schedule, but the guide for the 1949-50 season, and all guide books before it, provide no schedules. I therefore relied on each college's sports information packet (which they graciously mailed to me upon my request) to seed my data files with "dated scores", maintaining a separate file for games without dates, since not all packets attached dates to those athletic contests. (If I ever go back and try to complete the years from 1935-36 to 1948-49, those files will be categorized with a confidence rating of 6.) I hope to continue working from 1949-50 forward now, until all the "holes" in this archive are filled, and then perhaps I will return to the years prior to 1950.
In the early 80s and before, the NCAA books did not always list the dates of the post-season conference tournaments, so I had to guess when those games were actually played. They still appear in the correct chronological order in the data files, but the listed dates may be off by a few days from when the games were really played.
Here is what each of the ratings "means":
1 - This file has been verified to be identical to another source (currently 1995-96 to 1999-2000).
2 - This file has been carefully validated against the actual total points scored by each team for that season, as listed in the NCAA record/guide book for that year, so I would expect an error rate of less than 1% in this file.
3 - The results in this file have been compared against the Won/Loss records posted in Jeff Sagarin's end-of-season ratings that appear in USA Today. (Files rated as '1' above would have been in this category before Massey's data became known to me.) This covers 1984-85 to 1994-95 (and those rated as a 1 as well). Like #4 below, these files have roughly 98% or more of their scores entered exactly as they occurred.
4 - Years where the W/L records were carefully scrutinized against those posted in the NCAA results books, excluding games against non-Division I opponents, and where many of the scores were individually compared as well, fit this category. These files probably have more than 98% of all scores exactly correct; this level of confidence is based on prior experience checking many other first rounds of data entry against the final, validated version. Several examples are included in the table below these rating explanations.
5 - This group of data files has been validated against the NCAA results books, but dates were not available for all of the games included, so the first portion of the raw data file contains those games without a known game date.
6 - For these years, I am relying upon the scores that I received from the Sports Information Directors at the more than 300 institutions of higher learning that sent me the score portion of their men's basketball media guide, many of which included the dates of the contests. These have been entered as carefully as humanly possible, but some teams will not have all their games, as some institutions rebuffed my requests 4 or 5 times. However, since I have so many of them (over 90% in total, including all of the "majors" and almost all of the "mid-majors"), the only games missing from this type of file are those between two teams for which I have neither "media guide/archive results page". I would expect such games to total less than 5% of all games played in any of these years, and many years should be missing only 2-3%. I would also expect at least 95% of the scores to be correct.
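The checks behind ratings 2 through 4 can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the archive: the `Game` record layout, the function names, and the sample teams and scores are all hypothetical. The idea is to recompute each team's season point total and Won/Loss record from the entered scores, then list any team whose figures disagree with the published ones.

```python
from collections import defaultdict
from typing import NamedTuple


class Game(NamedTuple):
    # Hypothetical record layout; the archive's real file format may differ.
    date: str
    team_a: str
    score_a: int
    team_b: str
    score_b: int


def season_summary(games):
    """Return {team: (points_scored, wins, losses)} from entered scores."""
    totals = defaultdict(lambda: [0, 0, 0])
    for g in games:
        totals[g.team_a][0] += g.score_a
        totals[g.team_b][0] += g.score_b
        # Basketball games cannot end tied, so equal scores would already
        # be an entry error; this sketch does not treat that case specially.
        if g.score_a > g.score_b:
            winner, loser = g.team_a, g.team_b
        else:
            winner, loser = g.team_b, g.team_a
        totals[winner][1] += 1
        totals[loser][2] += 1
    return {team: tuple(vals) for team, vals in totals.items()}


def teams_to_recheck(games, published):
    """List teams whose computed figures differ from the published ones.

    `published` maps team -> (points_scored, wins, losses), e.g. as typed
    in from a guide book. Teams absent from `published` are skipped.
    """
    computed = season_summary(games)
    return sorted(
        team for team, figures in computed.items()
        if team in published and published[team] != figures
    )


# Made-up sample data, for illustration only.
games = [
    Game("1957-12-03", "Alpha", 70, "Beta", 65),
    Game("1957-12-10", "Beta", 80, "Gamma", 72),
    Game("1957-12-17", "Gamma", 60, "Alpha", 66),
]
published = {
    "Alpha": (136, 2, 0),
    "Beta": (145, 1, 1),
    "Gamma": (134, 0, 2),  # computed total is 132, so Gamma is flagged
}
print(teams_to_recheck(games, published))  # ['Gamma']
```

A mismatch in total points narrows the search to one team's schedule. A check on Won/Loss records alone cannot catch a wrong score that leaves the winner unchanged, which is consistent with the lower confidence given to files validated only that way.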
Below you will find a table quantifying how many errors were found when validating several years of inputted scores against the totals found in the NCAA guide books:
|Year||# teams||# games/scores||# scores incorrect||# games missing
For the 1957-58 season, 20 of the incorrect scores were wrong by 3 points or less, 20 more were off by 4-10 points, and the other 8 differences were 11, 12, 14 (4 times), 17 and 19 points. In the 1956-57 season, 86 of the 167 teams did not have any score modified, added or removed during the validation step. Similar results were uncovered during this validation process for the other years as well.
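The breakdown above amounts to grouping the absolute point differences into ranges. A minimal Python sketch follows; the function name and bucket labels are hypothetical, and only the eight largest differences (11, 12, 14 four times, 17 and 19) are taken from the text, since the individual values of the other 40 are not listed here.

```python
from collections import Counter


def bucket_differences(diffs):
    """Count score differences in the ranges used above: 1-3, 4-10, 11+."""
    counts = Counter()
    for d in diffs:
        d = abs(d)
        if d == 0:
            continue  # a zero difference is not an error
        if d <= 3:
            counts["1-3"] += 1
        elif d <= 10:
            counts["4-10"] += 1
        else:
            counts["11+"] += 1
    return dict(counts)


# The eight largest 1957-58 differences quoted in the text.
largest = [11, 12, 14, 14, 14, 14, 17, 19]
print(bucket_differences(largest))  # {'11+': 8}

# Totals quoted in the text: 20 + 20 + 8 incorrect scores.
print(20 + 20 + len(largest))  # 48
```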
Back to Prof. Trono's Home page. (This page last modified July 28, 2011.)