Wednesday June 21, 2006

The Statistics of Deal or No Deal

After having managed to unintentionally avoid yet another pop culture phenomenon for months, I finally saw an episode of Deal or No Deal the other night—my parents had attended a taping and so were visible in the background.  After watching the show, I did what any self-respecting computer geek would have done: I spent about 15 hours over the next few days writing code to figure out the statistics behind the game.  I've distilled what I've learned into this post.

(The following discussion assumes you're familiar with the rules of the US version of the show, which you can read about here.)

While watching the game, I became curious about possible strategies for maximizing winnings.  Much of the game is clearly just random numbers, but I suspected there might be some methods by which players could improve their odds of winning.  In particular, I wondered how likely it would be, if you went into the game with a particular target offer in mind, that you would actually receive an offer equal to or greater than that target.

I wrote a little program (in Perl) to simulate the game.  It starts with an array of 26 numbers containing the monetary values in all the briefcases.  It then randomly removes numbers from the array in the same pattern as in the game (first six, then five, etc.), calculating an offer at each point.  I assumed that the offer would be equal to the expected value of the remainder of the game, which, since all the remaining numbers are equally probable, is equal to the mean of the remaining values.  Note that this assumption is empirically false: the banker's offers on the show vary somewhat from the expected value, but mostly they're in the ballpark.  My program accepts a target value, runs 100,000 iterations of the game keeping track of the highest offer during each, then prints out the percentage of games where the target offer was met or exceeded.  Here are the results for each multiple of $1000 between zero and a million:

The horizontal axis is the target offer, while the vertical axis is the probability that you'll see an offer at least that high.  If you go in with a target of $100,000 in mind, you'll be in pretty good shape—you'll see an offer of at least $100,000 at some point in 95% of games.  The 50% point in the curve falls at $200,000, so you're as likely as not to see an offer that high during a random game.  The curve falls off rapidly after that.
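For anyone who wants to reproduce the curve, here is a rough Python sketch of the simulation described above.  It is not the original Perl program.  The list of case values is the standard US board, which this post doesn't spell out, so treat it as an assumption; the function names are made up for illustration, and the "offer" is the plain mean of the remaining cases, as in the simplification above.

```python
import random

# Case amounts for the US version of the show (assumed; not listed in the post).
CASE_VALUES = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
               1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
               200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]

# Cases opened per round: six, five, four, three, two, then one at a time
# until two cases are left (24 opened in total).
CASES_PER_ROUND = [6, 5, 4, 3, 2, 1, 1, 1, 1]


def highest_offer(rng):
    """Play one random game and return the best offer seen along the way."""
    board = CASE_VALUES[:]
    rng.shuffle(board)               # random order = random choice of cases to open
    best = 0.0
    for opened in CASES_PER_ROUND:
        del board[:opened]           # open this round's cases
        best = max(best, sum(board) / len(board))  # offer = mean of what's left
    return best


def chance_of_reaching(target, games=100_000, seed=None):
    """Fraction of simulated games whose best offer is at least `target`."""
    rng = random.Random(seed)
    hits = sum(highest_offer(rng) >= target for _ in range(games))
    return hits / games


if __name__ == "__main__":
    for target in (100_000, 200_000, 500_000):
        share = chance_of_reaching(target, games=20_000, seed=1)
        print(f"${target:,}: {share:.1%}")
```

Calling chance_of_reaching for every multiple of $1000 gives the points of the curve; with fewer games per point the curve gets noisier but keeps the same shape.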

If you look closely, you'll notice something interesting about the shape of the curve.  It's not continuous, but instead has sharp drop-offs at $150K, $200K, $250K, $300K, $375K, and $500K.  Why should this be?  It's a result of the fact that the distribution of expected values of the board is discrete rather than continuous, especially in the later stages of the game.  To see this, I wrote a program that randomly generates board states with some number of briefcases removed, then calculates the expected value of that board.  You can see the results here:

The horizontal axis is the number of briefcases that have been removed, and the vertical axis shows the generated offers.  In the game, you're only allowed to make decisions with certain numbers of briefcases removed; those data points are blue, while the others are gray.  Each column also has two red dashes showing the maximum and minimum possible offers, which my randomly-generated set of data points did not always include.  Over at the right, where the discreteness of the distribution starts to become obvious, I wanted to make sure to include all the possible offers, so to supplement the data I wrote another program that enumerates all of them.  Data points for those offers are hollow diamonds instead of solid.  Notice that I've included, at the far right, data for a board where 25 (all but one) of the briefcases have been removed.  That's actually the end state of the game, where you keep the last briefcase, so the "offers" in that column are simply the 26 possible monetary values.

In the early stages of the game, there's actually a fairly high floor on the expected value of the remainder of the game.  With six briefcases removed, for example, you should only see offers above $13,420.80—not chump change.  Over the course of the game, the maximum expected offer steadily increases.  It's easy to see why the rules of the game have players removing several briefcases per turn in the early part of the game, since it's pretty much a no-brainer to continue from 12 to 13, for example—the maximum possible offer goes up, and unless you're extraordinarily unlucky, you're not likely to lock yourself into any really low values.
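The floor and ceiling at each stage can be checked directly, without any random sampling: the lowest possible offer comes from opening all the biggest cases, and the highest from opening all the smallest.  A small Python sketch, again assuming the standard US case values (the function name is just for illustration):

```python
# Case amounts for the US version of the show (assumed; not listed in the post).
CASE_VALUES = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
               1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
               200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]

# Numbers of opened cases at which the game actually pauses for an offer.
DECISION_POINTS = (6, 11, 15, 18, 20, 21, 22, 23, 24)


def offer_bounds(removed):
    """Lowest and highest possible mean of the board after `removed` cases are opened."""
    ordered = sorted(CASE_VALUES)
    remaining = len(ordered) - removed
    lowest = sum(ordered[:remaining]) / remaining     # every big case already opened
    highest = sum(ordered[-remaining:]) / remaining   # every small case already opened
    return lowest, highest


if __name__ == "__main__":
    for removed in DECISION_POINTS:
        lowest, highest = offer_bounds(removed)
        print(f"{removed:2d} removed: ${lowest:,.2f} to ${highest:,.2f}")
```

With six cases removed, the lower bound works out to the $13,420.80 quoted above.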

It's near the end of the game that the total number of possible offers becomes clearly discrete.  As mentioned above, with 25 briefcases gone, there are only 26 possible "offers".  With 24 gone, there are still only 325 (26*25/2) possible pairs of monetary values, a few of whose expected offers overlap exactly; for example, {$200,$300} and {$100,$400} both average to $250.  Every simulated game passes through one and only one of these pairs, resulting in the distinct stair-step shape of the first graph.  Each step is associated with the average of two monetary values, which is why we see the steps at $150K (half of $300K) and $375K (half of $750K).  The steps associated with these large values actually result from several overlapping offers in which the large number is averaged with one of the very small ones, exaggerating the corresponding stair-steps.  Notice also that, at the right end of the curve in the first graph, there is a constant probability (about 0.3%) of meeting a target between $750K and $875K, then a drop to a probability of zero thereafter.  That's because there's exactly one pair producing an offer of $875K, {$750K,$1M}, and no pairs that produce an offer higher than that.
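The two-case end states are few enough to list exhaustively.  Here is a minimal Python sketch that groups all of them by their average, assuming the standard US case values (the helper name pairs_within is hypothetical):

```python
from collections import defaultdict
from itertools import combinations

# Case amounts for the US version of the show (assumed; not listed in the post).
CASE_VALUES = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
               1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
               200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]

# Group every two-case end state by its average (the "offer" with 24 cases gone).
# Floats are safe as keys here: every amount except $0.01 is a whole number.
pairs_by_average = defaultdict(list)
for low, high in combinations(CASE_VALUES, 2):
    pairs_by_average[(low + high) / 2].append((low, high))


def pairs_within(start, width=1_000):
    """All pairs whose average falls in [start, start + width)."""
    return [pair
            for average, pairs in pairs_by_average.items()
            if start <= average < start + width
            for pair in pairs]


if __name__ == "__main__":
    total = sum(len(pairs) for pairs in pairs_by_average.values())
    top = max(pairs_by_average)
    print(total, "pairs,", len(pairs_by_average), "distinct averages")
    print("pairs averaging $250:", pairs_by_average[250.0])
    print("pairs just above $150K:", len(pairs_within(150_000)))
    print("highest offer:", top, pairs_by_average[top])
```

Since every game ends on exactly one of the 325 equally likely pairs, the lone pair at $875K shows up in 1 game out of 325, which is the roughly 0.3% plateau at the right end of the first graph.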

The stairsteps still seemed pretty large to me, and I became curious why the last column that's part of the game (24 briefcases gone) was receiving so much weight.  So I wrote yet another program that runs a million random test games, keeping track of the point in the game where the highest offer occurred.  Here are the results:

The maximum offer is pretty likely to occur early on, then the probability drops off slowly over the next few turns of the game, only to sharply increase again near the end.  This helps explain why the discreteness of the distribution of expected offers on the last turn of the game (24 briefcases gone) is so visible in the first graph: the board state with 24 cases gone accounts for nearly a quarter of the maximum offers.  It also suggests why players seem to keep going just one more case at the end of the game, although I suppose it's unlikely they're running these statistics in their heads.
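The same kind of simulation can record when the best offer happens rather than how big it is.  A rough Python sketch under the same assumptions as before (standard US case values, offer equal to the mean of the remaining cases, illustrative function names); ties go to the earlier round, which is a choice made here and not something stated above:

```python
import random
from collections import Counter

# Case amounts for the US version of the show (assumed; not listed in the post).
CASE_VALUES = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
               1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
               200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]

CASES_PER_ROUND = [6, 5, 4, 3, 2, 1, 1, 1, 1]
DECISION_POINTS = (6, 11, 15, 18, 20, 21, 22, 23, 24)


def round_of_best_offer(rng):
    """Play one random game; return how many cases were gone at the best offer."""
    board = CASE_VALUES[:]
    rng.shuffle(board)
    best, best_removed, removed = -1.0, 0, 0
    for opened in CASES_PER_ROUND:
        del board[:opened]
        removed += opened
        offer = sum(board) / len(board)
        if offer > best:             # strict comparison: ties keep the earlier round
            best, best_removed = offer, removed
    return best_removed


def best_offer_distribution(games=100_000, seed=None):
    """Share of games in which the best offer came at each decision point."""
    rng = random.Random(seed)
    counts = Counter(round_of_best_offer(rng) for _ in range(games))
    return {removed: counts[removed] / games for removed in DECISION_POINTS}


if __name__ == "__main__":
    for removed, share in best_offer_distribution(games=20_000, seed=1).items():
        print(f"{removed:2d} cases gone: {share:.1%}")
```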

Whew!  That's quite enough geeking out about Deal or No Deal, I think.  As a game, I actually don't find it very interesting to watch—it's just random numbers with a bunch of bells and whistles added to make it seem like there's strategy going on.  That business where the player picks a briefcase but doesn't open it, for example, doesn't actually affect the outcome of the game at all—it might as well be sitting next to a leggy supermodel with the other cases, but it gives Howie something to point at and the player something to clutch nervously.  Having watched the game, enjoyed it on entirely the wrong level, and written some code about it, I doubt I'll ever watch it again.

Finally, I'd be remiss if I didn't steer you towards this SNL sketch, which contains a very funny take on Deal or No Deal.  It points out how odd, complex, and sometimes counterintuitive the rules of the game are (it's good when I pick a low number?) and even includes a little language-related humor:  "No, no, no, it's not a language problem.  I understand every word you're saying, it's just, uh... what is this game?"

I am The Tensor, and I approve this post.
01:59 AM in Television

Comments

You are stupendously geeky but I love you anyway.

And you forgot to mention how they totally use canned, coached clips of the audience yelling "Deal!" or "No deal!", which makes sense because the game is THE MOST BORING GAME IN THE WORLD. (Next to baseball. And golf. And maybe tennis.)

Posted by: The Wife at Jun 23, 2006 12:40:29 AM

I don't like sports, so I can say quite accurately, that 'Deal or No Deal' is FAR more boring (and annoying) than any sport. Including golf.

Posted by: Silvercat at Jun 24, 2006 11:45:56 PM

Wow! I felt geeky when I made a spreadsheet that showed the average of the values of the cases left in the game as cases were eliminated (putting an "x" by a number excluded it from the formula computing the average), but you got that beat by a MILE! Of course, I did my calculations on a spreadsheet in about 2 minutes...

Posted by: Daniel at Jul 6, 2006 8:27:44 PM

What I want to know is just how often each of the girls has any specific amount. For instance, my wife swears you should never pick Hailey because she almost always has the million dollar case. So what I want to know are the statistics of which amounts each girl has had. Any geeky way to figure this one out? Has anyone been keeping track?

Posted by: Linc at Oct 30, 2006 5:48:30 PM

Hi, I've been trying to figure this out and am really impressed at your logic.
Is there any way you can send me the program?
By looking at your program, it'll help me with my statistics knowledge.
I really appreciate it.
Oh, and by the way, I'm interested in knowing if your parents told you if the show was totally fake about the way it was presented to home viewers compared to the way it was shot in the studio.

Thanks again. ed

Posted by: Eddie Pitcock at Nov 2, 2006 9:27:30 PM

Wow. I knew somebody out there had figured it out. Now the question is: which case has had the million the most number of times?

Posted by: at Nov 27, 2006 7:14:45 PM

I am going to be a contestant in the Quebec version of Deal or No Deal, so I am looking for all the data I can get.

Can you please send me the program too?
You only forgot one thing: the banker is not giving 100% of the value left. Here is the % for each round:
1 --- 6%
2 --- 14%
3 --- 36%
4 --- 57%
5 --- 69%
6 --- 83%
7 --- 95%
8 --- 95%
9 --- 98%

for the stats on all the girls and the number go see:
http://www.classicgameshows.com/dealornodealstats.html

Posted by: superfred at Dec 20, 2006 6:37:19 AM

Banker's offer: $100,000
$1
$10
$50
$100
$200
$1,000,000

Deal or no deal?

Posted by: Joe at May 14, 2007 8:59:34 AM

The rules seem slightly different from the UK version, where a contestant is always given the chance to switch with the last remaining box. Which, naturally, makes us wonder whether you're better to stick or twist?

Posted by: The Cosh at May 14, 2007 3:16:02 PM

I'm not sure this is the best way of doing the math, as far as the offer goes. The median value of the remaining cases is also a valid choice of offer, since you're trying to beat the offer, not just maximize your expected value; with a median, you have a 50% over/under. My gut would say that if I were the banker, I'd offer something between the median (generally lower, better for risk-averse contestants) and the mean (higher, but more likely to convince risk-taking contestants to accept) depending on the personality of the contestant. Of course, to make it interesting, I think they use something between the 60th percentile and the mean.

Posted by: CW at May 27, 2007 10:15:13 AM

I'd rather watch the Leather Briefcases than the whole show. But I don't know. I fantasize of joining it once in a while.

Posted by: Shawn at Jun 10, 2007 2:23:28 AM

Wow, you've nerded out quite nicely here. Honestly, all these stats provide a very minimal edge considering the fact that this is a one-time shot. It's not a repetitive gambling game where you can continuously claw at a small edge over and over again. You are on the show once and only once and you need to make the most of it. I think instead of using pure statistics there is something that is much more valuable to use: economic theory. See what I mean here: http://livinginvol.com/?p=105

Posted by: SS at Mar 17, 2010 7:26:45 PM

interesting. you geeked around random draws, that's it. of course the game has nothing to do with random distributions and the offers are never equal to the expected payoffs, that would simply make no sense. Hence, this post has nothing to do with deal or no deal. but it is ok, after all you don't watch the show.
Still, your post is interesting in that it shows that statistics and probability are not so intuitive after all and we could be pushed to expect certain things to happen or not .. and be surprised.

Posted by: es at Apr 5, 2012 2:10:40 PM