On Birthdays and counting to 253
April 19, 2017
There is a well known math problem, that is supposed to shake our intuition about numbers, known as the birthday problem. The premise is how many people would there need to be in a room to have a 50% chance of at least 2 people having the same birthday. Before we move ahead a similar but different question is how many other people would have to be in a room with you to have a 50% chance someone has your birthday. Usually if you get someone to answer the former question they say hundreds which is the answer to the second question but not the first. Most will answer the wrong question because they are thinking in terms of people but not comparisons.
To answer this problem we have to think in terms of multiplying failure rates. The probability that at least someone shares a birthday is the compliment of the probability no one shares a birthday and that is something we can solve for. After you crunch the numbers it turns out with only 23 people there is a 50% chance of 2 people having the same birthday. This may seem like very few people but what does that really mean? For there to be no duplicates each new person entering the room needs to ask each person already in the room what their birthday is and look for a match. With only 23 people there are 253 such comparisons. How many other people would it take for there to be a 50% chance someone else has your birthday? Why 253 of course. The same number of comparisons because in this case each new person entering the room just has to ask you your birthday and look for a match.
So in my estimation hundreds is a reasonable initial guess as long as hundreds is low hundreds and not high. There is a limit on what is a reasonable answer known as the pigeon hole principle. Any answer above 365 is not reasonable. Even if the first 365 people all had different birthdays the 366th person would have to share a birthday with someone.
It should be noted this is all assuming a 365 day year, and that all birthday appear at the same frequency throughout the population. This later assumption I believe is conservative in a sense. Any deviation from this will mean these more common birthday will be more highly represented and lead to a match sooner. I have written a short test program in Go that implements a solution to this problem.
https://github.com/piggybankgames/BirthdayProblem