Information travels so quickly and circulates so freely in the internet age that I don’t pretend the following story is reaching anyone for the first time. What was once taught in universities as an amazing example of non-obvious critical reasoning may these days be familiar to students before they reach high school. This example even has a Wikipedia page, entitled the Birthday Paradox. Still, for context, I’ll relay how I first learned of this probability question and why it applies to baseball.
At the start of my sophomore year at Virginia Tech, I walked into my Introductory Statistics course on the first day of classes to find the professor laying out 42 $1 bills across a table at the front of the class. After we’d all found seats he began, “There are forty-two of you in this class. The class list I’ve been given has nothing on it but your names and Social Security Numbers.* I will offer a bet to each one of you, which you may or may not accept. I’ll bet that two of you (and because no two of you have the same last name, I am assuming there are no twins) have the same birthday. If you want to take me up on the offer, place a dollar bill and your driver’s license next to one of the dollar bills on the table.”
(*How much have things changed since the mid-1980s? We used our Social Security Number for everything in college. The first week of college I had to call home to my parents to find out what my Social Security Number was. When people asked me what I learned in college, I’d say “Two things: No means no. And my Social Security Number.” At Tech, test scores would be announced outside a professor’s office with a printout of the class results sorted by SSN. SSNs are issued by region of the country, so if you were from the Northern Virginia/D.C. area your number was buried in the mass of 22x-xx-xxxx students. But if you were the one student in the class from Pennsylvania (19x-xx-xxxx) or California (began with a high 5 or low 6), not only did everyone know your grade, they knew your SSN!)
The murmurs began. This was stealing from a professor. Why would he give away money? He had to know our birthdays to make that sort of outlandish bet. He continued to assure us he had no idea when we were born. About a third of the class walked to the front and took him up on the bet. He moved those dollars off to the side and was about to start picking up the driver’s licenses when he said, “I’ll tell you what. I’m a risk-loving person. For the rest of you, I’ll sweeten the bet even further. I’ll bet you there are at least two pairs of people in this room who have the same birthday. If you don’t think so, same deal. Claim one of the dollar bills with a dollar bill of your own and leave your license.”
Now there was a rush to the front of the room as if he were giving away Duran Duran tickets. The handful of people who didn’t take the bet either didn’t bring any money with them or figured keeping the hideousness of their driver’s license photo under wraps outweighed an extra dollar in their pocket.
As I mentioned above, you’re all probably familiar with this example. The professor had a match on the seventh license he turned over and by the time the entire class had revealed their birthdays we had identified three pairs. As he picked up his winnings, he concluded, “I would have made the bet at three pairs and there was almost an even chance of four matches.” As we all sat amazed he showed us the math behind the problem.
(To summarize: Once you have gathered 23 people at random, the chances are better than 50% that two of them have the same birthday. If 33 people are present, odds are there are two or more matches. 40 people, three matches and so on. In a class of 42 people there is only a 1 in 12 chance that no two people share a birthday. This relationship will always hold as long as a gathering of people is random and births remain evenly distributed across the calendar. Then again, I think we know that in about seven months New Yorkers are going to have an awfully hard time distinguishing between earthquake babies and hurricane babies, so maybe birthdays won’t be so evenly distributed in New York City going forward.)
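For anyone who wants to check the professor’s arithmetic, here is a minimal Python sketch of the single-match calculation. It assumes 365 equally likely birthdays and ignores leap years, and the function names are my own, made up for illustration. It only covers the “at least one shared birthday” case, not the two-pair and three-pair bets.

```python
def prob_no_shared_birthday(n, days=365):
    """Probability that n randomly chosen people all have different birthdays.

    Assumes every day of the year is equally likely (no leap years).
    """
    p = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p *= (days - k) / days
    return p


def smallest_group_for_match(threshold=0.5, days=365):
    """Smallest group size whose chance of a shared birthday exceeds threshold."""
    n = 1
    while 1 - prob_no_shared_birthday(n, days) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    print("Group size for a better-than-even match:", smallest_group_for_match())
    no_match_42 = prob_no_shared_birthday(42)
    print(f"Class of 42 with no match: {no_match_42:.3f} (about 1 in {1 / no_match_42:.0f})")
```

Each person added to the room has to dodge every birthday already taken, which is why the no-match probability collapses so much faster than intuition suggests.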
Why did I mention this? On Friday, during Game 5 of the NLCS, Milwaukee starting pitcher Zack Greinke faced 30 St. Louis Cardinals and didn’t strike out a single one.
I thought the 2011 post-season was going to be Zack Greinke’s coming-out party to a national audience. In my preview of the NLCS I gave Milwaukee the edge and picked them to win the series narrowly in 7 games. My whole basis for the edge was the expectation of Greinke dominating at least one and maybe two games. Every bit of that analysis was wrong.
To call Greinke even mediocre overstates his effectiveness. Yet I didn’t come to my conclusion based on what Gene Hackman’s Captain Ramsey in Crimson Tide might call “personal intuition, gut feelings, hairs on the back of my neck or angels sitting on my shoulder.” I had hard data. Zack Greinke struck out 28% of the hitters he faced this year – a higher percentage than any other pitcher in baseball who faced as many batters as he did. For projection purposes the great thing about strikeout rate is that it stabilizes so quickly. About six games into a season, you have a very reliable indicator of what a pitcher’s strikeout rate will be for the year. This is, of course, not true for any hitting statistic like batting average or home runs per at-bat, as hitters have a lot more variance in their performance. The extreme variance in Nelson Cruz’s post-season performance for the Rangers – 1 for 15 with no walks and four strikeouts against Tampa followed by 8 for 22 with 6 home runs and 2 doubles against Detroit – is not unusual during any two-week stretch of a baseball season.
For example, Greinke’s monthly strikeout rates ran from 24% to 34% during the season, very little deviation from his overall rate of 28%. So, applying the logic from the birthday paradox above, how unlikely is it that a pitcher who strikes out 28% (28.1% to be precise) of the batters he faces could go 30 in a row without striking one out? If each batter is treated as an independent trial, the answer is 0.719 multiplied by itself 30 times: 0.005%, or 1 in 20,000. (That’s about the same chance as filling a room with roughly 82 people and finding that no two of them share a birthday, incidentally.) To the best of my ability I went through his logs for the year and could only find one stretch greater than 12 batters in which he didn’t strike someone out – an 18-batter stretch that extended over three games.
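The streak arithmetic is easy to reproduce. The sketch below is mine, not anything official: it assumes every plate appearance is an independent trial with the same 28.1% strikeout chance, which ignores the quality of the opposing lineup, fatigue and everything else that makes baseball baseball. The function name is hypothetical.

```python
def prob_no_strikeouts(k_rate, batters):
    """Chance of zero strikeouts across `batters` consecutive plate appearances.

    Assumes each batter is an independent trial with the same strikeout rate.
    """
    return (1 - k_rate) ** batters


if __name__ == "__main__":
    season_k_rate = 0.281  # Greinke's 2011 strikeout rate, as quoted above
    for batters in (12, 18, 30):
        p = prob_no_strikeouts(season_k_rate, batters)
        print(f"{batters} batters in a row: {p:.5%} (about 1 in {1 / p:,.0f})")
```

Running it for 12, 18 and 30 batters shows how quickly a strikeout-free stretch goes from merely unusual to nearly impossible under these assumptions.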
In no way am I implying St. Louis was lucky to win that series – they left no doubt they were the better team. In the post-season they’ve shown they can win 1-0 or 12-6, which makes them dangerous. But I think it does show that something was wrong with Zack Greinke. Watching Game 5 and to a lesser degree Game 1, the best pitcher in baseball at striking out hitters in 2011 couldn’t miss bats. To my eye, the only strikes he was getting were on foul balls. While I’d love to blame it on Milwaukee for starting him twice in a row on three days’ rest in the last three weeks, I suspect by the time Spring Training rolls around we’re going to hear about an injury. On Wednesday, I’ll preview the World Series.