Tuesday, March 18, 2008

Using The Past To Submit A Sensical Bracket

I posted this last year; here is the same info, updated to include the 2007 tournament.

This is extremely wonky, so some of you will be interested and some will not, but what the hell. As a former hard-core college basketball junkie and wonk, I figured the best I can offer this year is to share some of this info with you, as it may help you tweak your bracket before you submit it.

Disclaimer 1: this data should never be used to fill out a bracket; the bracket should always be filled out first using whatever methodology you use. This data can be helpful in perhaps going back and tweaking a few picks here and there to ensure that what you picked falls within the normal range over time of what has actually occurred, so that you don't submit something whose odds of happening are minuscule, thus reducing your chances of taking your co-workers' money.

Disclaimer 2: As you will see when we get into the numbers below, 2007 was quite an anomaly; what in horse racing handicapping might be referred to as a "throw away" race because its result was such an outlier compared to the rest of the data. Anyhow, it can be looked at two ways: (1) that it makes all the rest of the data null and void and shows that truly anything can happen, or (2) that it was an anomaly and that over time the trends of the rest of the historical data will largely hold true.

Now then...

FIRST ROUND

First Round Upsets

Year 15v2 14v3 13v4 12v5 11v6 10v7
2007 0 0 0 0 2 0
2006 0 1 1 2 2 2
2005 0 1 1 1 1 1
2004 0 0 0 2 0 1
2003 0 0 1 1 1 2
2002 0 0 1 3 2 1
2001 1 0 2 2 2 2
2000 0 0 0 0 1 2
1999 0 1 1 2 0 4
1998 0 1 1 1 1 3
1997 1 1 0 1 0 2
1996 0 0 1 2 1 2
1995 0 2 1 1 1 1
1994 0 0 0 2 1 2
1993 1 0 1 1 1 0
1992 0 1 1 1 0 2
1991 1 1 1 1 2 2

Now look over your brackets and just compare the numbers that you have with this historical data. There is no need to adjust anything unless what you have is waaaaay out of range. I just try to make sure that I'm not way out of bounds. Last year was very unusual: it was only the 2nd time since the field expanded to 64 teams in 1985 that all of the 2, 3, 4 and 5 seeds advanced to the 2nd round. In 21 of the past 23 years at least one 2, 3, 4, or 5 seed has lost in the first round, so we should expect such upsets in any given year.

As most folks know, a 16 has never beaten a 1 seed. A 2 seed has only lost to a 15 seed 4 times in the last 17 years, and never 2 in the same year, so for example if you have three different 2 seeds losing to 15 seeds, then you might want to adjust your picks and play the percentages. The 5/12 games get a lot of publicity, but don't forget that a 13 seed beats a 4 seed more years than not, and the 11 seeds do pretty well against the 6 seeds. The 8v9 games are not listed because to me a 9 seed beating an 8 seed isn't really an upset. Again, just be sure that you are somewhere in the ballpark, or even on the edge or just a little out of bounds, but if you have all four 11 seeds beating the 6 seeds you may want to adjust. Hell it may happen this year, but if I was in an office pool I would want to maximize my percentages against the others and save the totally wacky-ass bullshit for talk at the sportsbar.

I typically compare my filled out bracket to these historical numbers to make sure that I am in/near the realm of what has happened on each matchup column.
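For anyone who would rather let a computer do the cross-check, here is a rough Python sketch of the idea. The numbers are copied straight from the first round table above; everything else {the names `historical_range` and `out_of_range`, and the sample picks in `my_bracket`} is made up for illustration and is not anybody's real bracket.

```python
# First-round upsets per year, copied from the table above.
# Column order: 15v2, 14v3, 13v4, 12v5, 11v6, 10v7
MATCHUPS = ["15v2", "14v3", "13v4", "12v5", "11v6", "10v7"]
HISTORY = {
    2007: (0, 0, 0, 0, 2, 0),
    2006: (0, 1, 1, 2, 2, 2),
    2005: (0, 1, 1, 1, 1, 1),
    2004: (0, 0, 0, 2, 0, 1),
    2003: (0, 0, 1, 1, 1, 2),
    2002: (0, 0, 1, 3, 2, 1),
    2001: (1, 0, 2, 2, 2, 2),
    2000: (0, 0, 0, 0, 1, 2),
    1999: (0, 1, 1, 2, 0, 4),
    1998: (0, 1, 1, 1, 1, 3),
    1997: (1, 1, 0, 1, 0, 2),
    1996: (0, 0, 1, 2, 1, 2),
    1995: (0, 2, 1, 1, 1, 1),
    1994: (0, 0, 0, 2, 1, 2),
    1993: (1, 0, 1, 1, 1, 0),
    1992: (0, 1, 1, 1, 0, 2),
    1991: (1, 1, 1, 1, 2, 2),
}


def historical_range():
    """Return {matchup: (fewest, most)} upsets seen in any single year."""
    columns = zip(*HISTORY.values())
    return {m: (min(col), max(col)) for m, col in zip(MATCHUPS, columns)}


def out_of_range(picks):
    """List the matchups where the picked upset count falls outside history."""
    ranges = historical_range()
    return [m for m in MATCHUPS
            if not ranges[m][0] <= picks[m] <= ranges[m][1]]


# Hypothetical bracket: number of upsets picked in each matchup column.
my_bracket = {"15v2": 0, "14v3": 1, "13v4": 1, "12v5": 2, "11v6": 4, "10v7": 1}
print(out_of_range(my_bracket))
```

With those sample picks the only column flagged is 11v6, since no year in the table has more than two 11 seeds winning. Using the min and max of each column is a deliberately loose test; it only catches picks that have never happened in these 17 years, which is the "waaaaay out of range" case.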

SECOND ROUND

#1 Seed Losing In The Second Round
Year Number
2007 0
2006 0
2005 0
2004 2
2003 0
2002 0
2001 0
2000 2
1999 0
1998 1
1997 0
1996 1
1995 0
1994 0
1993 0
1992 1
1991 0

All four 1 seeds get through to the 2nd weekend more often than not, so if you have this then there is no need to worry. But if you have one, or even have the stones to pick two #1 seeds to lose to the winner of the 8v9 game then the data says you are not crazy and to rock on with your bad self. If you have 3 or all 4 #1 seeds going out in the 2nd round, you may want to think about adjusting your picks.

#2 Seed Losing In The Second Round
Year Number
2007 1
2006 2
2005 2
2004 2
2003 2
2002 1
2001 2
2000 3
1999 3
1998 1
1997 2
1996 0
1995 0
1994 1
1993 2
1992 1
1991 1

As you can see, the #2 seeds fare much worse than the #1 seeds. I won't get into why I think this is so, but you may want to think about having a 2 seed eat it in the second round, as all four haven't made it through to the Sweet Sixteen in over a decade.

THE SWEET SIXTEEN

Seeds Below 5 In the Sweet 16
Year Number Seeds
2007 2 6, 7
2006 5 6,7,7,11,13
2005 6 6,6,6,7,10,12
2004 5 6,7,8,9,10
2003 4 6,7,10,12
2002 5 6,8,10,11,12
2001 5 6,7,10,11,12
2000 8 6,6,6,7,8,8,10,10
1999 7 6,6,10,10,10,12,13
1998 5 6,6,8,10,13
1997 6 6,6,6,10,10,14
1996 3 6,8,12
1995 3 6,6,6
1994 4 6,9,10,12
1993 5 6,6,7,7,12
1992 5 6,6,7,9,12
1991 3 10,11,12


Look at your brackets and count up the number of teams seeded below #5 {6 seeds or worse} that you have in the Sweet Sixteen. Last year was extremely chalky with top seeds advancing, and only 2 seeds below 5 made it to the 2nd weekend. In the 16 years before last year this number was always between 3 and 8, so ideally you would want to be in this range, but if you have 2 or 9 then I wouldn't worry too much. I have also listed the seeds that advanced each year alongside the number of teams that made it this far.

The main trend to notice in this set of data is how well the 6 seeds do, as well as the winner of the 7v10 game {more on that in the next section though}. Historically 6 seeds have had success in the 2nd round against the winner of the 3v14 game, which by and large boils down to the fact that 6 seeds often upset 3 seeds in the 2nd round. In fact, in every year from 1992 through 2007 at least one 6 seed has made it to the Sweet Sixteen {1991 is the lone exception in the table}, and often more than one has; the vast majority of their 2nd round wins come over 3 seeds. If I didn't have any 6 seeds in the Sweet Sixteen I would strongly consider going back and picking at least one of them to make it to the second weekend. The trick, of course, is picking the right one...

Bottom line: at this point you don't want your bracket to be too chalky {you have all the top seeds advancing}, nor do you want it to be such an upset-o-rama that it is way outside the norm.
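The Sweet 16 count described above is easy to automate. This is only a sketch: the 3-to-8 window is the range from the 1991 through 2006 rows of the table, while the function names, the labels, and the sample picks in `my_sweet_16` are all invented for the example.

```python
def seeds_below_five(sweet_16_seeds):
    """Count the Sweet 16 teams seeded 6 or worse."""
    return sum(1 for seed in sweet_16_seeds if seed > 5)


def chalk_check(sweet_16_seeds, low=3, high=8):
    """Compare the count to the 3-to-8 window seen from 1991 through 2006."""
    count = seeds_below_five(sweet_16_seeds)
    if count < low:
        return count, "too chalky"
    if count > high:
        return count, "upset-o-rama"
    return count, "in range"


# Hypothetical Sweet 16 picks, one seed number per team.
my_sweet_16 = [1, 4, 6, 2, 1, 5, 3, 10, 1, 4, 3, 2, 1, 12, 6, 7]
print(chalk_check(my_sweet_16))
```

The sample bracket has five teams seeded 6 or worse, so it lands in range. If you are comfortable sitting at 2 or 9, just call it as `chalk_check(my_sweet_16, low=2, high=9)`.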

Double Digit Seeds In The Sweet 16
Year Number Seeds
2007 0 n/a
2006 2 11, 13
2005 2 10,12
2004 1 10
2003 2 10,12
2002 3 10,11,12
2001 3 10,11,12
2000 2 10,10
1999 5 10,10,10,12,13
1998 2 10,13
1997 3 10,10,14
1996 1 12
1995 0 n/a
1994 2 10,12
1993 1 12
1992 1 12
1991 3 10,11,12

Look again at your Sweet 16: do you have any double digit seeds {10 seed or lower} in there? Again, last year was an anomaly but even including it, in 15 of the last 17 years at least 1 double digit seed has made it to the 2nd weekend.

Notice the trend here of how well the 10 seeds are represented. What does this mean? It means that if a 10 seed beats a 7 seed and matches up with a 2 seed in the 2nd round, that 2 seed should be on upset alert, because in 11 of the last 17 years at least one such 10 seed has gone through to the Sweet 16. For all the hype that the 12 seeds get in beating the 5 seed in the first round {the media always point out the "classic 12 over 5 upset"}, the 10 seeds do even better than the 12 seeds in the 2nd round. Also, if you look back up at the data set before this one, and look at all of the 7 and 10 seeds, that tells you that the winner of the 7v10 game has a great shot of getting past the 2 seed and onto the Sweet Sixteen. I always pick what I think the weakest 2 seed is to lose {or if I think a 7/10 winner is especially strong, or if a 2 seed is a team that I hate and really would love to see go out}.
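To put a number on the 10 seed versus 12 seed comparison, here is a small Python tally of the table above. The data is copied from the Double Digit Seeds table; the variable names are just placeholders chosen for this sketch.

```python
from collections import Counter

# Double-digit seeds in the Sweet 16 by year, copied from the table above.
DOUBLE_DIGIT = {
    2007: [], 2006: [11, 13], 2005: [10, 12], 2004: [10],
    2003: [10, 12], 2002: [10, 11, 12], 2001: [10, 11, 12],
    2000: [10, 10], 1999: [10, 10, 10, 12, 13], 1998: [10, 13],
    1997: [10, 10, 14], 1996: [12], 1995: [], 1994: [10, 12],
    1993: [12], 1992: [12], 1991: [10, 11, 12],
}

# Total Sweet 16 appearances per seed across all 17 years.
appearances = Counter(
    seed for seeds in DOUBLE_DIGIT.values() for seed in seeds
)

# Number of years in which at least one double-digit seed got this far.
years_with_any = sum(1 for seeds in DOUBLE_DIGIT.values() if seeds)

print(appearances[10], appearances[12], years_with_any)
```

By this count the 10 seeds have 15 Sweet 16 appearances to the 12 seeds' 10, and 15 of the 17 years had at least one double digit seed still alive.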

ELITE EIGHT

Seeds Below #3 In Elite Eight
Year Number Seeds
2007 0 n/a
2006 2 4, 11
2005 4 4, 5, 6, 7
2004 2 7, 8
2003 1 7
2002 3 5, 10, 12
2001 2 6, 11
2000 5 5, 6, 7, 8, 8
1999 3 4, 6, 10
1998 1 8
1997 2 6, 10
1996 2 4, 5
1995 2 4, 4
1994 1 9
1993 1 7
1992 3 4, 6, 6
1991 2 4, 10

Many beginners get to this point and have all #1 and #2 seeds. This is starting to sound like a broken record, but last year was an anomaly as it was the first time in 17 years that no team below a 3 seed made the Elite 8. Check your bracket for the number of teams seeded 4 or below in your final eight. You probably should have at least one and, if the data is an accurate predictor, not more than 5. Personally, I would be comfortable with anywhere from 1 to 3.

FINAL FOUR

Seeds Below #2 In Final Four
Year Number Seeds
2007 0 n/a
2006 3 3, 4, 11
2005 2 4, 5
2004 1 3
2003 2 3, 3
2002 1 5
2001 1 3
2000 3 5, 8, 8
1999 1 4
1998 2 3, 3
1997 1 4
1996 2 4, 5
1995 1 4
1994 1 3
1993 0 n/a
1992 2 4, 6
1991 1 3

Once again, last year was rare: in 15 of the last 17 years, at least one of the Final Four participants has been lower than a 2 seed. Also, in 3 of the last 5 years at least 2 of the spots have gone to 3 seeds or lower. I would just make sure that you don't have zero here {all four spots being either a 1 or a 2 seed} or four {all 4 slots being 3 seeds or worse}.

The Seeds Of The Final Four Participants
Year Seeds Total of Seed #s
2007 1, 1, 2, 2 6
2006 2, 3, 4, 11 20
2005 1, 1, 4, 5 11
2004 1, 2, 2, 3 8
2003 1, 2, 3, 3 9
2002 1, 1, 2, 5 9
2001 1, 1, 2, 3 7
2000 1, 5, 8, 8 22
1999 1, 1, 1, 4 7
1998 1, 2, 3, 3 9
1997 1, 1, 1, 4 7
1996 1, 1, 4, 5 11
1995 1, 2, 2, 4 9
1994 1, 2, 2, 3 8
1993 1, 1, 1, 2 5
1992 1, 2, 4, 6 13
1991 1, 1, 2, 3 7

If you have all four of the #1 seeds in your Final Four, then you are brave, for it has never happened. On the other hand, 2006 was the first time since the field expanded to 64 teams that none of the #1 seeds made it, so you never know.

I have seen the final four seeds wonked thusly: add up the seed #s of your final four members. In 15 of the last 17 years the total has been between 5 and 13, with 2006's wackiness producing a total of 20 and wacky year 2000 being the most extreme outlier with a total of 22.

THE CHAMPIONSHIP GAME

Seeds Of The Championship Game Participants
Year Seeds
2007 1, 1
2006 2, 3
2005 1, 1
2004 2, 3
2003 2, 3
2002 1, 5
2001 1, 2
2000 1, 5
1999 1, 1
1998 2, 3
1997 1, 4
1996 1, 4
1995 1, 2
1994 1, 2
1993 1, 1
1992 1, 6
1991 2, 3

Basically, the data says that you should have at least one #1 or #2 seed in the title game. If you have a 4 seed against a 6 seed here, then I applaud you for your moxie but the numbers say you won't win your office pool. In fact, the numbers say that a final of 3 v 3 or lower has not happened in the last 17 years, even though it sounds pretty feasible.

Seed Of The NCAA Champion
Year Seed
2007 1
2006 3
2005 1
2004 2
2003 3
2002 1
2001 1
2000 1
1999 1
1998 2
1997 4
1996 1
1995 1
1994 1
1993 1
1992 1
1991 2

The outlier here is one of my favorite teams ever, the 1997 Arizona Wildcats, who amazingly beat three different #1 seeds en route to their championship, making them the only team to ever accomplish that feat and establishing a record that can only be equaled but never surpassed. Every other year the champion has been a 1, 2, or 3 seed, with a 1 seed winning it all in 11 of the last 17 years.
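If you want the champion breakdown as a quick tally, something like the following Python sketch would do it. The seeds are copied from the table above; the names `CHAMPION_SEED` and `titles_by_seed` are only illustrative.

```python
from collections import Counter

# Seed of the NCAA champion by year, copied from the table above.
CHAMPION_SEED = {
    2007: 1, 2006: 3, 2005: 1, 2004: 2, 2003: 3, 2002: 1, 2001: 1,
    2000: 1, 1999: 1, 1998: 2, 1997: 4, 1996: 1, 1995: 1, 1994: 1,
    1993: 1, 1992: 1, 1991: 2,
}

# How many titles each seed line has won over the 17 years.
titles_by_seed = Counter(CHAMPION_SEED.values())

for seed in sorted(titles_by_seed):
    share = titles_by_seed[seed] / len(CHAMPION_SEED)
    print(f"{seed} seed: {titles_by_seed[seed]} titles ({share:.0%})")
```

That works out to 11 titles for 1 seeds, 3 for 2 seeds, 2 for 3 seeds, and the lone 4 seed title for Arizona.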

So there it is, by the numbers. Again, I am not saying to fill out your brackets to meet these requirements, but to fill out your brackets and then cross check them against this data, which should not be viewed as requirements but as a guideline or baseline or whatever you like.

I like this information because I find that it helps to know basically the number of upsets that likely will happen in each round. The trick, of course, is choosing where to pick your upsets and where to have the favorites advance. That, my friends, is what separates the office pool contributors from the office pool collectors.

Happy Brackets and best of luck. Unless you have Dook winning the national title that is - then I hope that you fail miserably.

8 comments:

Anonymous said...

Dang, man. Was just telling a coworker about this and here it is. I was all prepared to search for it. Anyway, what does the Kanu Final Four look like? I'm going up against C3, so I could use a bit of insight.

Kanu said...

Dude, I have no words of wisdom, as I have seen no less than zero games in their entirety this year. Sad how far I have fallen, really.

Anonymous said...

The fact that your boss doesn't have to find someone to cover your shift is amazing. Never had that luxury. :)

Kanu said...

OMAA-

The Kanu Final Four submitted at the last second is: Louisville, Georgetown, Texas, UCLA, with Texas beating Georgetown in the final.

After doing some serious cramming and catching up in the last week, and talking to a few trusted sources in the know, my current opinion is this: UCLA has the best backcourt in America, as well as the best Big Man in America. There is no reason for them to lose any game, and under the NBA playoff best of 7 rules I'm not sure they could be beat by anyone other than Carolina.

If Carolina beats Louisville, then I think they'll be in the final game, if not champs themselves. I know that would make you very happy.

Anonymous said...

Hey,

History made this weekend. Do you think with what happened last year and this year that the committee is just getting better at seeding the teams? Should they throw in an upset here and there to keep everything interesting? I was really surprised how easily Mem manhandled Tx.

I will probably miss the game Sat Night due to another sporting event. But the plans are to hit Disney on Spring Break starting this weekend and the last time we were there, I missed the heels winning it all!

Kanu said...

No, I don't think it has much to do with the committee. Although every year there are arguments over them screwing up some seeds, they do a good job of seeding the best teams in the nation as the 1 seeds.

I think what it means is 1) this year the top 4 teams are just on a higher level than the rest and 2) since the re-organization of the tournament into this silly ass pod system about 5 years ago, the likelihood of all four 1 seeds making it was greatly increased, as these days at least 2 and often 3 of the 1 seeds get to play in their backyard, which in my opinion has ruined the Cinderella aspect of the tournament a bit for me. Take this year: Both Carolina and UCLA, the 2 best teams in the country, not only get 1 seeds {which they earned} but get to play games in their backyard, in UNC's case ALL FOUR GAMES. This is a joke- the rules did not need to be tilted in favor of the higher seeds, but the 1 seeds benefit from the new setup more than most, and from a competitive standpoint the tournament is lessened considerably.

But, it is what it is. At the end of the day, we just had a year when all four of the 1 seeds stepped up and handled their business.

Hopefully next weekend will not be an anticlimax.

Anonymous said...

"Both Carolina and UCLA, the 2 best teams in the country, not only get 1 seeds {which they earned} but get to play games in their backyard, in UNC's case ALL FOUR GAMES. This is a joke- the rules did not need to be tilted in favor of the higher seeds, but the 1 seeds benefit from the new setup more than most, and from a competitive standpoint the tournament is lessened considerably."

I'm thinking that is all about money. If they play the East Region in Syracuse and have Carolina as the #1 seed, it would be more difficult to sell it out than it would in Raleigh. I agree that they did benefit from the home court, as did UCLA, but I don't know that it was a huge advantage. I think road trips can solidify a team and make them more of a TEAM. But I think it was Bob Knight who said it best when Pitino was crying about playing them in Charlotte. Bob says, Heels win in Charlotte, NC, Charlotte, Az, Charlotte, anywhere.

Should be some good ball this weekend.

Kanu said...

It was all about money, specifically 1) travel costs: the NCAA now keeps as many teams as close to home as possible, because the NCAA covers the teams' travel costs for the tournament, so it is less money out of their pocket this way, and 2) ticket sales: the more big teams playing closer to home, the more fans buy tickets.

You are correct, it was {and is} all about that cash for the NCAA. Screw a fair postseason competition played on truly neutral courts- it's all about that cash, homey.