War. Huh. Good God. What is it Good For?


About 250 turns, apparently.

Motivation

My friend Scott and I sat down at our weekly Trivia venue about 20 minutes before showtime. "Hershel's said he's gonna be a bit late, he got held up with something." No bother. Busy job, these things happen.

We hurriedly ordered a couple beers apiece while they were still on Happy Hour and spent the next while eyeing the front window, waiting for our third to finally wander in.

For my money, the hardest question of the night is always "What the hell is our team name gonna be??" We waffle for almost literally the next 20 minutes and finally settle on Snape Kills Dumbledore on Page 596 (one of our better ones, tbh). And like clockwork, in walks Hershel right as we're turning in our registration. "Hey, sorry. Got caught up in a really exciting game of War."

And I about short-circuit at the unintentional oxymoron.

The "Game"

For those of you who've never been a child with a deck of cards before, War is a pastime that basically looks like the following:

  • Split a deck across two players
  • Players blindly play the top card of their decks

    • Higher card takes both cards
  • If the cards match, then each player discards two cards, then plays a third. This is called a War.

    • This goes on until one card is higher than the other; if they match again, repeat the War step
  • When a player runs out of cards to draw from, they shuffle their discards to create a new draw pile
  • Repeat, ad nauseam, until the game is over

And that's it. There's no strategy. No choice. You just go back and forth and back and forth until the game just sort of... ends.
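If those bullet points read like pseudocode, that's because they basically are. Here's a minimal, hypothetical sketch of a single turn under those rules. To be clear, this is not the project's actual implementation, and it skips the end-of-game bookkeeping entirely.

import random
from collections import deque

def draw(pile, discard):
    # When the draw pile empties, shuffle the discards back in as a new pile
    if not pile:
        random.shuffle(discard)
        pile.extend(discard)
        discard.clear()
    return pile.popleft()

def run_turn(pile_a, disc_a, pile_b, disc_b):
    # Flip, compare, and resolve any Wars; winner's discard pile sweeps the pot
    pot = []
    while True:
        card_a, card_b = draw(pile_a, disc_a), draw(pile_b, disc_b)
        pot += [card_a, card_b]
        if card_a != card_b:
            break
        # War: each player burns two cards, then the next loop pass flips their third
        for _ in range(2):
            pot += [draw(pile_a, disc_a), draw(pile_b, disc_b)]
    (disc_a if card_a > card_b else disc_b).extend(pot)

# Suits never matter in War, so a deck is just 13 ranks, four times over
deck = list(range(2, 15)) * 4
random.shuffle(deck)
pile_a, pile_b = deque(deck[:26]), deque(deck[26:])
disc_a, disc_b = [], []
run_turn(pile_a, disc_a, pile_b, disc_b)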

The outcome of the game is decided as soon as you shuffle both decks and set them in front of the players. You could literally determine who wins from the outset, obviating the need to even go through the motions. It's a complete and total waste of time, especially when there's trivia to be played. Honestly, it wouldn't be that hard to build a simulator to do th-- Wait a minute.

And so I spent a good chunk of my free time over the next couple weeks pettily doing just that.

The Data

I've mostly been doing ETLs and model development in PySpark the past few months, so this felt like as good an excuse as any to practice some Object-Oriented Design in pure Python.

And that went well enough, until I started hitting bugs and edge cases I hadn't considered. So if you're checking out the codebase, dive into war.Game.run_turn() at your own peril. Turns out neatly abstracting state and interdependencies gets tricky fast, haha.

Ultimately, the workflow I built for this project meant firing off main.py in the project linked above. In this file, I specified how many games I wanted to simulate, and it would go through, run them, and save the game state for every turn, until I had a whole mess of files called data0.txt, data1.txt, ...
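In case it helps to picture it, here's a rough, hypothetical sketch of that driver loop. The real main.py differs; Game and its run() method here are stand-ins for the project's actual classes.

NUM_GAMES = 100000

for i in range(NUM_GAMES):
    game = Game()             # stand-in for the project's war.Game
    history = game.run()      # assumed: one state record per turn
    with open(f'data{i}.txt', 'w') as f:
        for turn in history:
            f.write(','.join(map(str, turn)) + '\n')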

Per Turn

Later, I load those text files into neat tables that look like the following (abbreviated here to the first 10 turns of the first game).

In [2]:
whole_games = load_whole_games(1)

whole_games.head(10)
Out[2]:
   num_a  num_b  num_aces_a  num_aces_b  num_kings_a  num_kings_b  wars  game
0     26     26           2           2            1            3     0     0
1     27     25           2           2            1            3     0     0
2     28     24           2           2            1            3     1     0
3     32     20           2           2            2            2     0     0
4     31     21           2           2            2            2     0     0
5     32     20           2           2            2            2     0     0
6     31     21           2           2            2            2     0     0
7     30     22           2           2            2            2     0     0
8     29     23           2           2            2            2     0     0
9     30     22           2           2            2            2     0     0

Looking across the top, you'll see that the attributes I captured per turn were:

  • The number of cards that Player A and Player B have (deck and discard combined)
  • The number of aces and kings each player has (more on this later)
  • How many times the players went to War that turn
  • An index of which game I'm looking at
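The loader itself isn't shown in the post, but a hypothetical sketch of load_whole_games() could be as simple as stacking those files into one DataFrame (this assumes the comma-separated format from the driver sketch above):

import pandas as pd

COLUMNS = ['num_a', 'num_b', 'num_aces_a', 'num_aces_b',
           'num_kings_a', 'num_kings_b', 'wars']

def load_whole_games(n):
    # Stack the first n per-game files, tagging each row with its game index
    frames = []
    for i in range(n):
        df = pd.read_csv(f'data{i}.txt', names=COLUMNS)
        df['game'] = i
        frames.append(df)
    return pd.concat(frames, ignore_index=True)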

And if I load ten thousand of these files, it's a pretty big table.

In [3]:
whole_games = load_whole_games(10000)

whole_games.shape
Out[3]:
(3183346, 8)

Of course, I ran ten times that amount for this post. Which just means that I've got almost a gigabyte of text files just taking up space on my computer.

Per Game

Additionally, I built a parser that goes through and grabs the first and last rows of each game file, using those starting and ending conditions to summarize each game.
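A hypothetical sketch of that summarizer, with the column positions assumed from the per-turn table above:

def summarize_game(path, game_id):
    # A game's summary only needs its first and last turns
    with open(path) as f:
        lines = f.read().splitlines()
    first, last = lines[0].split(','), lines[-1].split(',')
    return {
        'game': game_id,
        'a_starting_aces': int(first[2]),      # num_aces_a on turn 0
        'a_starting_kings': int(first[4]),     # num_kings_a on turn 0
        'a_won': int(last[0]) > int(last[1]),  # whoever holds more cards at the end
        # a_won_first_round needs mid-game state, so it can't come from
        # these two rows alone (more on that later)
    }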

In [4]:
results = get_game_summaries()

len(results)
Out[4]:
100000

Data fields include:

  • Which game I'm looking at
  • How many aces and kings Player A started with (more on this later)
  • If Player A won the game
  • If Player A won the first round (both players exhausting their first 26-card stacks)
In [5]:
results.head(25)
Out[5]:
     game  a_starting_aces  a_starting_kings  a_won  a_won_first_round
0       0                2                 1   True              False
1       1                0                 2  False              False
2      10                2                 3   True               True
3     100                2                 1   True               True
4    1000                0                 1  False              False
5   10000                1                 3  False              False
6   10001                1                 3   True              False
7   10002                4                 2   True               True
8   10003                3                 2   True              False
9   10004                1                 1  False              False
10  10005                1                 2  False              False
11  10006                2                 3  False               True
12  10007                3                 2   True               True
13  10008                3                 2   True               True
14  10009                3                 3   True               True
15   1001                2                 1  False               True
16  10010                4                 3   True               True
17  10011                3                 3  False               True
18  10012                2                 4  False               True
19  10013                2                 2  False               True
20  10014                2                 3   True               True
21  10015                1                 2  False              False
22  10016                4                 0   True               True
23  10017                3                 2   True               True
24  10018                3                 4   True              False

The Art of War

As soon as I had a simulator cooked up that would correctly run and resolve games, I started sketching out visualizations that I'd want to make. From there, I had a good idea of what data I would want to capture during the simulations and doubled back into my code and wrote a bunch of logging methods.

Wins and Losses

If we marry the every-turn dataset to the every-game dataset, we can make neat plots that layer multiple games on top of one another.

In [6]:
games_and_results = whole_games.merge(results, on='game')

Here, I plot the first 100 games that I simulated. As you can see, there's a pretty even distribution between wins and losses, and most of the games resolve within the first few hundred turns.

In [7]:
plot_wins_vs_losses(games_and_results, num_games=100, linealpha=.2);
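If you're curious what plot_wins_vs_losses() is doing, here's a minimal sketch of the idea: one semi-transparent line per game tracing Player A's card count, green for a win, red for a loss. This is a hypothetical reconstruction, not the real helper, which also takes arguments like xlim and additional_title that show up later in the post.

import matplotlib.pyplot as plt

def plot_wins_vs_losses(df, num_games, linealpha=.1, markeralpha=.1):
    fig, ax = plt.subplots(figsize=(18, 10))
    for game_id in df['game'].unique()[:num_games]:
        game = df[df['game'] == game_id]
        color = 'green' if game['a_won'].iloc[0] else 'red'
        ax.plot(range(len(game)), game['num_a'], color=color, alpha=linealpha)
        # Mark where the game ended
        ax.plot(len(game) - 1, game['num_a'].iloc[-1], 'o',
                color=color, alpha=markeralpha)
    ax.set_xlabel('Turn')
    ax.set_ylabel("Player A's card count")
    return ax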

Now, let's look at the first thousand games.

One thing I want to point out is the last argument in my plotting call, linealpha=.1. This essentially means that every line that gets plotted on the figure is about 90% see-through.

So when the first ~500 turns are basically a mess of solidly-colored red and green, you're seeing the result of many, many overlapping games and outcomes.

Furthermore, if you notice the x-axis difference between this and the last plot, we've stumbled across games that go on for 2,000+ turns, which is just bananas.

In [8]:
plot_wins_vs_losses(games_and_results, num_games=1000, linealpha=.1);

And then ten thousand games, because why not?

In [9]:
plot_wins_vs_losses(games_and_results, 10000, .05, .1);

All told, the win rates for Players A and B were about even, as one might expect.

In [10]:
results['a_won'].value_counts()
Out[10]:
False    50168
True     49832
Name: a_won, dtype: int64

(Editor's note: I was excited to be done building the simulator and sat down to write this post over a week ago, only to find that I had closer to a 70/30 win/loss ratio. The finish line is never an ideal time to learn that your code was wrong >_>)

Estimated Playtime

You may have been shocked to see how long some of the games from the last section dragged on. I sure was.

Since each game is represented by its own text file, and each turn is a line in said file, finding the duration of any game is just a matter of doing a line count. This function does just that.
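A hypothetical sketch of get_game_lengths(), assuming the dataN.txt naming from earlier:

import glob
import pandas as pd

def get_game_lengths():
    # One file per game, one line per turn, so turns == line count
    rows = []
    for path in glob.glob('data*.txt'):
        game_id = int(path[4:-4])  # strip 'data' and '.txt'
        with open(path) as f:
            rows.append({'game': game_id, 'turns': sum(1 for _ in f)})
    return pd.DataFrame(rows)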

In [11]:
lengths = get_game_lengths()

The distribution is pretty skewed by outliers, but it looks like you can bank on most of your games being less than 500 turns.

In [12]:
lengths['turns'].hist(bins=100, figsize=(18, 10));

Upon closer inspection, half of your games will come in at 241 turns or fewer.

In [13]:
lengths['turns'].describe()
Out[13]:
count    100000.000000
mean        320.643210
std         264.450079
min          13.000000
25%         136.000000
50%         241.000000
75%         423.000000
max        3614.000000
Name: turns, dtype: float64

But what does that mean?

Let's assume that, ignoring the additional time it takes to resolve a war, it takes 2 seconds to run a turn (I think it'd probably take longer, but hey, round numbers...)

That means that you're spending, in the median game, about 8 minutes just going through the motions.

In [14]:
(241  # median game length
 * 2  # seconds per turn
 / 60 # seconds a minute
)
Out[14]:
8.033333333333333

Extrapolating, say you only had 15 minutes and were stuck deciding between playing a game of War or doing literally anything else. You'd need the game to wrap up in 450 turns or fewer.

In [15]:
(15    # desired minutes
 * 60  # seconds a minute
 / 2   # seconds a turn
)
Out[15]:
450.0

Or everything to the left of the red line.

In [16]:
lengths['turns'].hist(bins=100, figsize=(18, 10)).axvline(450, color='r');

A little shy of 80% of the games that you play.

In [17]:
len(lengths[lengths['turns'] <= 450]) / len(lengths)
Out[17]:
0.77498

War Begets More War

One of the more interesting interactions I found in the data was the relationship between the number of times players went to war and the number of turns the game stretched out.

No headscratching whatsoever that there'd be a positive relationship between the two, but I didn't expect it to be so steady and linear.

In [18]:
temp = (whole_games.groupby('game')['wars'].sum()
        .to_frame().merge(lengths, left_index=True, right_on='game'))
 
fig, ax = plt.subplots(figsize=(18, 10))
ax.scatter(temp['turns'], temp['wars'], alpha=.5)
ax.set_xlabel('Number of Turns', fontsize=16)
ax.set_ylabel('Number of Wars', fontsize=16);

I was pretty blown away by the correlation coefficient, if anyone cares to see it.

In [19]:
temp.corr()['turns']['wars']
Out[19]:
0.9613902400584097

As I mentioned above, each number in the wars column represents the number of times the players went to war on that turn. And so filtering out all of the normal, non-war turns, we can see the distribution of back-to-back wars on turns that had any at all.

In [20]:
a = whole_games[whole_games['wars'] != 0]['wars']
a.value_counts(normalize=True)
Out[20]:
1    0.938778
2    0.057549
3    0.003455
4    0.000197
5    0.000016
6    0.000005
Name: wars, dtype: float64

Mostly 1's, a couple 2's. Nothing too surprising.

Similarly, we can see the highest number of consecutive wars in each game.

In [21]:
whole_games.groupby('game')['wars'].max().value_counts().sort_index()
Out[21]:
0       1
1    3860
2    5476
3     622
4      37
5       3
6       1
Name: wars, dtype: int64

Wait, there was a turn with six consecutive wars??

In [22]:
whole_games[whole_games['wars'] == 6]
Out[22]:
         num_a  num_b  num_aces_a  num_aces_b  num_kings_a  num_kings_b  wars   game
1960806     33     19           4           0            3            1     6  15505

Looks like it occurs at index 1960806 in our big table. Let's look a few turns before and after that.

In [23]:
whole_games.loc[1960804:1960810]
Out[23]:
         num_a  num_b  num_aces_a  num_aces_b  num_kings_a  num_kings_b  wars   game
1960804     30     22           4           0            3            1     1  15505
1960805     34     18           4           0            3            1     0  15505
1960806     33     19           4           0            3            1     6  15505
1960807     26     26           4           0            2            2     0  15506
1960808     27     25           4           0            2            2     0  15506
1960809     26     26           4           0            2            2     0  15506
1960810     27     25           4           0            2            2     0  15506

Oh, man. The game column increments in the very next row. These 6 wars were what closed out the game.

In [24]:
ax = plot_game_history(15505)

Lol, get rekt

When to Pack it Up?

To the extent that the answer isn't "right away, always," my main goal in writing this post is to develop some intuition for when you'd be better off, with some degree of certainty, putting the cards away and doing something else.

As we saw above, I observed about a 50/50 win rate per player across the hundred-thousand games I simulated. But suppose you were paying attention to the cards that you saw: what inferences could you make about how the game would play out?

Win the Battle, Win the War?

One of the first things that was brought to my attention (and the trickiest to code...) was logging whether or not Player A had more cards than Player B by the time both players reached the bottom of their initial 26-card stacks and shuffled for the first time.
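The gist of that bookkeeping, as a hypothetical sketch (every attribute name below is invented for illustration, none are pulled from the real codebase): after each turn, once both players have reshuffled at least once, record who was ahead, exactly once.

def log_first_round(game):
    if (game.a_won_first_round is None
            and game.player_a.times_reshuffled >= 1
            and game.player_b.times_reshuffled >= 1):
        game.a_won_first_round = game.player_a.card_count > game.player_b.card_count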

Like the "A won the whole game" ratio, there was a pretty even distribution of values here as well.

In [25]:
results.groupby(['a_won_first_round']).size()
Out[25]:
a_won_first_round
False    50549
True     49451
dtype: int64

Makes enough sense that that'd be even. Might also be easy to accept that winning the first battles means you're more likely to win the war.

In [26]:
results.groupby(['a_won', 'a_won_first_round']).size().unstack()
Out[26]:
a_won_first_round  False   True
a_won
False              30898  19270
True               19651  30181

In our hundred thousand games, we saw A's chance of winning increase from a coin flip to about 61% when they won the first round.

In [27]:
results.groupby('a_won_first_round')['a_won'].mean()
Out[27]:
a_won_first_round
False    0.388752
True     0.610321
Name: a_won, dtype: float64

Stack the Deck

More compelling yet was looking at the games through the lens of "How many Aces did each player start with?"

I loaded up ALL of the games I'd simulated, then trimmed that dataset down to just games where Player A started a game with 4 aces in hand.

In [28]:
whole_games = load_whole_games()
temp = whole_games.merge(results, on='game')
four_ace_game = temp[(temp['a_starting_aces'] == 4)]

This wound up being about five and a half thousand games.

In [29]:
four_ace_game['game'].nunique()
Out[29]:
5570

Using the same plots as above, there's a clear difference in these games: not only is there a stark imbalance between Wins and Losses, the games are also much shorter.

In [30]:
plot_wins_vs_losses(four_ace_game, 1000, linealpha=.1,
                    markeralpha=.1, xlim=[0, 1000],
                    additional_title='when Player A Starts with 4 Aces');

But this doesn't quite tell the whole story.

Instead, we'll plot every 4-Aces-for-Player-A game and change the line transparency from 90% to 99%.

In [31]:
plot_wins_vs_losses(four_ace_game, 5570, linealpha=.01,
                    markeralpha=.01, xlim=[0, 1000],
                    additional_title='when Player A Starts with 4 Aces');

From the color density alone, it should be immediately obvious just how much shorter these games tend to be.

In fact, the length of the median game is almost 100 turns shorter than your run-of-the-mill starting condition.

In [57]:
game_ids = four_ace_game['game'].unique()
lengths[lengths['game'].isin(game_ids)]['turns'].describe()
Out[57]:
count    5570.000000
mean      241.390844
std       235.336101
min        16.000000
25%        91.250000
50%       155.000000
75%       306.000000
max      2716.000000
Name: turns, dtype: float64

Count Your Cards

So we've seen that the number of Aces in your deck is strongly predictive of how your game's about to play out. But just how much?

Revisiting our entire dataset of 100k games, we can see that starting with 4 aces confers an 83% win-rate.

In [33]:
results.groupby('a_starting_aces')['a_won'].mean()
Out[33]:
a_starting_aces
0    0.174014
1    0.333705
2    0.497098
3    0.665402
4    0.831059
Name: a_won, dtype: float64

a statement we can make with a healthy number of observations to back it up.

In [34]:
results.groupby('a_starting_aces').size()
Out[34]:
a_starting_aces
0     5580
1    25127
2    38944
3    24779
4     5570
dtype: int64

What's more, if you also track the number of Kings that Player A starts with, your certainty only increases.

In [35]:
gb = results.groupby(['a_starting_aces', 'a_starting_kings'])['a_won']
In [36]:
wins_by_starts = gb.mean().unstack()
wins_by_starts
Out[36]:
a_starting_kings         0         1         2         3         4
a_starting_aces
0                 0.054795  0.116860  0.169111  0.214628  0.258974
1                 0.232432  0.276803  0.334432  0.377011  0.414605
2                 0.382646  0.453333  0.498350  0.544873  0.589559
3                 0.576875  0.626438  0.670185  0.712605  0.745599
4                 0.782407  0.793765  0.843165  0.866372  0.916667

Again, there's a reasonable amount of data behind each king/ace count pair.

In [37]:
counts_by_starts = gb.count().unstack()
counts_by_starts
Out[37]:
a_starting_kings     0     1      2     3     4
a_starting_aces
0                  219  1121   2182  1668   390
1                 1110  5755   9706  6899  1657
2                 2109  9750  15459  9538  2088
3                 1600  6778   9569  5696  1136
4                  432  1668   2136  1130   204

Or to distill this entire section into one simple heatmap.

In [38]:
starting_cards_heatmap(wins_by_starts);
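(If you want to roll your own version of that helper, a minimal sketch with matplotlib might look like the following; the real starting_cards_heatmap() may do things differently.)

import matplotlib.pyplot as plt

def starting_cards_heatmap(wins_by_starts):
    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(wins_by_starts.values, cmap='RdYlGn', vmin=0, vmax=1)
    ax.set_xticks(range(len(wins_by_starts.columns)))
    ax.set_xticklabels(wins_by_starts.columns)
    ax.set_yticks(range(len(wins_by_starts.index)))
    ax.set_yticklabels(wins_by_starts.index)
    ax.set_xlabel('Starting Kings')
    ax.set_ylabel('Starting Aces')
    fig.colorbar(im, label="Player A's win rate")
    return ax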

And this is where I had originally intended to end the post.

But then I got to thinking, "A 91% win rate with 4 Aces and 4 Kings has gotta mean a 9% loss rate, right?" and I went a'hunting.

High/Low-Lights

In one last pass through our aggregated data, I want to look for games at either extreme of the heatmap above.

In [39]:
results = results.merge(lengths)

The Thrill of Victory

First, I looked up games where Player A was a clear underdog, but came back and won anyway.

In [40]:
interesting_wins = results[(results['a_starting_aces'] == 0)
                           & (results['a_starting_kings'] == 0)
                           & (results['a_won'] == True)]
interesting_wins
Out[40]:
        game  a_starting_aces  a_starting_kings  a_won  a_won_first_round  turns
7289   16558                0                 0   True              False    324
21014  28910                0                 0   True              False    384
22282  30050                0                 0   True              False    210
32896  39604                0                 0   True              False    975
46834  52148                0                 0   True              False    269
48700  53828                0                 0   True              False    721
53270  57941                0                 0   True              False    399
59993  63992                0                 0   True              False    765
67711  70938                0                 0   True              False    240
69862  72874                0                 0   True              False    339
72742  75466                0                 0   True              False    338
98334    985                0                 0   True              False    544

52148

Picking a relatively quick game from the pile, we can see a bit of back and forth as Player A claws their way to victory. At first blush, it might seem that they were at risk of losing after the War around turn 145.

In [41]:
ax = plot_game_history(52148)

But when you layer in how many Aces and Kings Player A holds per turn, you can see that despite losing some cards in that big War drop, a steady hold on 3 Aces and 3 Kings lets them gradually grind down Player B's stack.

In [42]:
ax = plot_game_history(52148, aces_and_kings=True)

39604

Now fishing for the longest of the bunch, we've got a game that legitimately has some plot twists.

In [43]:
plot_game_history(39604);

Overlaying Aces and Kings once more, I'm inclined to think that a low Ace-count variance does a plenty-good job of insuring against the back and forth of throwaway cards, especially after Player A nabs all of the Aces and Kings in those late-game Wars.

In [44]:
plot_game_history(39604, aces_and_kings=True);

The Agony of Defeat

On the other hand, we can look for the exact opposite: games where Player A was set up to win and then blew it.

In [45]:
interesting_losses = results[(results['a_starting_aces'] == 4)
                             & (results['a_starting_kings'] == 4)
                             & (results['a_won'] == False)]
interesting_losses.merge(lengths)
Out[45]:
     game  a_starting_aces  a_starting_kings  a_won  a_won_first_round  turns
0   15207                4                 4  False               True    689
1   26007                4                 4  False              False    243
2   28449                4                 4  False               True    732
3   31479                4                 4  False               True    949
4   33303                4                 4  False               True    478
5   33521                4                 4  False               True    394
6   34701                4                 4  False               True    310
7   40859                4                 4  False               True    403
8   41398                4                 4  False               True    733
9   44944                4                 4  False               True    487
10  49421                4                 4  False               True    149
11  50773                4                 4  False               True    184
12  60654                4                 4  False               True    474
13    625                4                 4  False               True    479
14  78832                4                 4  False              False    235
15  79214                4                 4  False              False    456
16    980                4                 4  False               True    292

980

This one was a particularly messy blunder. Look at that card loss at turn 40.

In [46]:
plot_game_history(980);

More importantly, a 3-Ace swing from a single War that they never recovered from. Sucks to suck, lol.

In [47]:
plot_game_history(980, aces_and_kings=True);

31479

Finally, if the game as a whole is called War, then this particular match was a straight-up siege.

In [48]:
plot_game_history(31479);

Player B just waited them out, pilfering high-card after high-card until the game ended on one last War.

In [58]:
plot_game_history(31479, aces_and_kings=True);

Conclusion

Okay, so maybe I came down too hard on War.

As much as I enjoyed all of the work that went into this post, the funnest part for me was putting together that last section. Finding a way to cleanly visualize the development of a game while keeping track of the most important elements involved a good amount of matplotlib finagling, and it was kind of a blast to play around with whenever I'd find screwy results in my summary tables.

However, counterpoint-- this kind of thing happens less than 0.03% of the time.

In [49]:
(len(interesting_losses) + len(interesting_wins)) / 100000
Out[49]:
0.00029

So in summary: Damn it, Hershel.