All the Riddler Source Code

PostAllSource2

So I’ve gotten sick of fighting with WordPress in each post trying to nicely format my C++ code when I post, so I decided to start posting it to textuploader.com and providing links to the formatted and raw files. This will allow you to easily follow the link and copy-pasta the code if you want.

Here are links to previous Riddler posts and source code:

Secret Election: Post. Source.

River Crossing: Post. Source.

Neurotic Shooter: Post. Source.

Sadistic Salesman: Post. Source.

From now on, I’ll just take a picture of the relevant code for a post and post that, with a link to the raw code. Or something. It’s a work in progress.

Tweaking the Secret Election code

While waiting to see the results of this week’s Riddler, I started thinking about my solution to the problem. In my last post, I outlined two ways of solving the problem: one by hand on paper, and one using C++ and recursion with memoization.

If you look at the way I solved the puzzle by hand, you might recognize it as a form of dynamic programming (my coworker thinks I’m obsessed with finding ways to use dynamic programming; mea culpa). So I decided to write a version of the SimulateElection function using it instead of recursion. Here it is:

// initialize winner array with candidates
char winner[CANDIDATE_COUNT+1] = "";
for(int i = 0; i < CANDIDATE_COUNT; i++)
winner[i] = 'A' + i;

for(int cand = CANDIDATE_COUNT - 1; cand > 0; cand--)
{
// now simulate election for cand vs. winners in each 'row'
for(int i = 0; i < cand; i++)
winner[i] = RawVote(winner[i], winner[cand]);
}

Pretty short! The key here is to start our array of winners with the candidates in the order in which their secret elections happen, so “ABCDE”. Then in each round, starting from the last election, we hold a raw election between the current value we are evaluating, and the values in each of the preceding indices. So:

“ABCDE” – Starting values. Now we pit E vs. first four values, A, B, C, and D.

“ABEDE” – Values after round 1, because E defeated C, but lost to A, B, and D. Now pit D vs. values in first 3 spots, A, B, and E.

“DDDDE” – Values after round 2, because D defeated A, B, and E. Now pit 3rd value, D, vs. first two spots, D and D.

“DDDDE” – Values after round 3, Now pit 2nd value, D, vs. 1st value A.

“DDDDE” – We are done. Our first value, D, is the winner.

And this is how it goes when we allow candidate A to select candidate E as a proxy:

“ABCDE” – Starting values.

“AEEEE” – After round 1. Candidate E defeated everyone but A. Rounds 2 and 3 will pit E vs. E. The final round will pit A vs. E again, and A will win.

I think it’s easier to look at the recursive code and understand what is going on conceptually. Sometimes it’s hard to see how dynamic programming is “working” when you examine the code, but that might be my own bias.

I did some performance testing on the dynamic programming vs. the recursion. I also made a version of SimulateElection that used recursion but NOT memoization, meaning that all recursive branches were visited. The complexity for the recursive algorithm is n-factorial with n storage, but using memoization reduces it to n-squared with n-squared storage. Using dynamic programming is still n-squared time, but only n storage. Although dynamic programming and memoization are both n-squared time, there’s a little less overhead in the dynamic programming. Here are some time trial results for all three methods:

Elapsed ticks 100000 trials using recursion and memoization = 355 ms.
Elapsed ticks 100000 trials using recursion = 476 ms.
Elapsed ticks 100000 trials using dynamic programming = 283 ms.

All of them are pretty fast because the data is so small. I had to run 100,000 trials just to get decent numbers to compare.

Full source code here.

The 538 Riddler: Secret Election

Well, I’m 3-for-3 so far in finding answers to the 538 Riddler and for last week’s answer I got another shoutout from Ollie. This week’s Riddler concerns a secret cabal of politicians who decide to bypass that pesky democratic process and decide among themselves who will become president.

Suppose that five politicians, disgusted with the current two-party system, come together to choose a third-party candidate to run in the 2016 presidential election. The politicians’ names are Anders (A), Blinton (B), Cubio (C), Drump (D) and Eruz (E). Not wanting to spend all their time campaigning in Iowa and New Hampshire in winter, they decide instead to pick which of them will be the candidate at a secret meeting with just the five of them. The voting procedure is as follows: They will first hold a vote of A versus B. (The five politicians are the only voters.) The winner of that vote will then be paired against C. That winner will be paired against D, and finally that winner will be paired against E. They will declare the winner of that last matchup to be their candidate.

Each of A, B, C, D and E wants to be the presidential candidate themselves, but also has clear preferences over the others. Furthermore, the politicians’ preferences are common knowledge. Their preferences are as follows (“X > Y” means Candidate X is preferred to Candidate Y):

Candidate A: A > B > C > D > E

Candidate B: B > A > E > D > C

Candidate C: C > D > A > E > B

Candidate D: D > B > A > E > C

Candidate E: E > D > B > C > A

All of the politicians are forward-looking and vote strategically.

Question 1: Who will be chosen as the presidential candidate?

So I was able to work this one out on paper, but I decided to write a program to confirm my answer, which I will show below.

 

So, first let’s look at how I figured out “by hand” the winner of the election. The key here is to look at the last election first, when candidate Eruz will face off against one of the four winners of the previous elections. Using the candidates stated preferences we find:

Aanders vs. Eruz – Aanders would win 4-1.

Blinton vs. Eruz – Blinton would win 3-2.

Cubio vs. Eruz – Eruz would win 3-2.

Drump vs. Eruz – Drump would win 3-2.

Poor candidate Cubio! He literally cannot win no matter what happens with the current voting schedule. What we also see is that any other candidate who survives to the last election vs. Eruz will win. Since all the preferences are known, each of the candidates knows this. Now let’s take a step back and look at the previous election. One of the first three candidates (Aanders, Blinton, or Cubio) will be facing off against candidate Drump.

If these were the last elections, we would see that Drump would beat all three of the preceding candidates. However since we know that Eruz waits to take on the winner, each of these elections becomes a decision between the winner and Eruz:

Aanders vs. Drump = (Aanders vs. Eruz) vs. (Drump vs. Eruz) = (Aanders vs. Drump)

Blinton vs. Drump = (Blinton vs. Eruz) vs. (Drump vs. Eruz) = (Blinton vs. Drump)

Cubio vs. Drump = (Cubio vs. Eruz) vs. (Drump vs. Eruz) = (Eruz vs. Drump)

In all three of those matchups, Drump wins, and so according to the rules setup by the cabal, candidate Drump will win the secret election. It’s interesting to note that even if Cubio were favored in a head-to-head matchup with Drump in the 3rd round, game theory would dictate that the other candidates would still vote for Drump since they know Cubio would be destined to lose to Eruz in the final round, and they prefer Drump over Eruz.

Here’s my “scratch pad” answer to the first question:

Normal election:
AvE = A
BvE = B
CvE = E
DvE = D
so...
AvD = (AvE)v(DvE) = AvD = D
BvD = (BvE)v(DvE) = BvD = D
CvD = (CvE)v(DvE) = EvD = D
D wins the election.

Let’s look at the remaining questions from 538:

Now assume that A has the flu and is forced to miss the voting meeting. He is allowed to transfer his vote to someone else, but he can’t make that other person commit to vote against her own self-interest.

Question 2: To whom should he transfer his vote, given his candidate preference outlined above (A > B > C > D > E)?

Question 3: Who will win the candidacy now?

Question 4: A month before the meeting, Candidate A must decide whether or not to get the flu vaccine. Should he get it?

Okay, so now we have a wrinkle where candidate Aanders has the opportunity to select another candidate as a proxy. I should say that my answer to these questions about who Aanders should choose as a proxy, and whether Aanders should purposely miss the vote, depend on the idea that all candidates will know who Aanders has chosen as a proxy, and who that person will vote for. If that’s not the case, then I think it could get pretty tricky.

Here are the four different scenarios for elections where Aanders misses the vote, and chooses Blinton, Cubio, Drump, and Eruz as proxies:

A selects B as a proxy:
AvE = A
BvE = B
CvE = E
DvE = E
so...
AvD = (AvE)v(DvE) = AvE = A
BvD = (BvE)v(DvE) = BvE = B
CvD = (CvE)v(DvE) = EvE = E
so...
AvC = (AvD)v(CvD) = AvE = A
BvC = (BvD)v(CvD) = BvE = B
so...
AvB = (AvC)v(BvC) = AvB = B
B wins the election.

_______________________________________
A selects C as a proxy:
AvE = A
BvE = E
CvE = E
DvE = D
so...
AvD = (AvE)v(DvE) = AvD = D
BvD = (BvE)v(DvE) = EvD = D
CvD = (CvE)v(DvE) = EvD = D
D wins the election.

_______________________________________
A selects D as a proxy:
AvE = A
BvE = B
CvE = E
DvE = D
so...
AvD = (AvE)v(DvE) = AvD = D
BvD = (BvE)v(DvE) = BvD = D
CvD = (CvE)v(DvE) = EvD = D
D wins the election.

_______________________________________
A selects E as a proxy:
AvE = A
BvE = E
CvE = E
DvE = E
so...
AvD = (AvE)v(DvE) = AvE = A
BvD = (BvE)v(DvE) = EvE = E
CvD = (CvE)v(DvE) = EvE = E
so...
AvC = (AvD)v(CvD) = AvE = A
BvC = (BvD)v(CvD) = EvE = E
so...
AvB = (AvC)v(BvC) = AvE = A
A wins the election!

So if Aanders chooses Blinton, Blinton wins. If Aanders chooses Cubio or Drump, then Drump wins. (Poor Cubio can’t even win when controlling two votes!) But Aanders can win the election by choosing his most ideologically opposed candidate as a proxy. Weird! So the answer to question 4 is, yes of course get the flu vaccine (vaccines are good), but pretend to be sick anyway and name candidate Eruz as your proxy, Mr. Aanders.

As I said, I was able to do this on paper by hand, but I decided to write a program to do it anyway. I set up a C++ class called Candidate which is initialized with a list of other candidate preferences in the form of a 5-character string like “ABCDE” for Aanders. Those preferences are used to give a score to each candidate and then the Candidate class can be queried for its preference between two candidates.

It’s possible to designate a proxy candidate which will be queried instead by calling SetProxy. Here’s Candidate:

#define CANDIDATE_COUNT 5

class Candidate
{
char m_id;
int m_scores[CANDIDATE_COUNT];
const Candidate *m_proxy;

public:
Candidate(const char *rankings) : m_proxy(NULL)
{
m_id = rankings[0];
for(int cand = 0; cand < CANDIDATE_COUNT; cand++)
{
m_scores[rankings[cand] - 'A'] = CANDIDATE_COUNT - cand;
}
}
// allow another candidate to vote for me
void SetProxy(const Candidate *proxy) {m_proxy = proxy;}

bool RawPreference(char a, char b) const
{
// if a proxy has been designated, ask it
if(m_proxy != NULL && m_proxy != this)
return m_proxy->RawPreference(a, b);

// otherwise return true if our score for the first candidate is
// greater than the second's score
if(m_scores[a - 'A'] > m_scores[b - 'A'])
return true;
return false;
}
};

Now we need a class to represent the election called, well, Election. Election creates 5 candidates with their respective preferences. A function called SimulateElectionWithAbsenteeAndProxy allows the election to be run, and if values are set for the absentee and proxy parameters, any candidate can be set as absent with any other candidate set as its proxy. (A function called SimulateElection calls it with absentee and proxy set to 0 to signify no absentees.) Here’s Election:

class Election
{
public:
Election()
{
const char *c_rankings[] = {
"ABCDE", "BAEDC", "CDAEB", "DBAEC", "EDBCA"
};

for(int i = 0; i < CANDIDATE_COUNT; i++)
{
m_candidates[i] = new Candidate(c_rankings[i]);
}
}
virtual ~Election()
{
for(int i = 0; i < CANDIDATE_COUNT; i++)
delete m_candidates[i];
}

void SimulateElection()
{
SimulateElectionWithAbsenteeAndProxy(0, 0);
}

void SimulateElectionWithAbsenteeAndProxy(char absentee, char proxy)
{
m_matchups.clear();
if(absentee != proxy && absentee >= 'A' && absentee <= 'E')
{
GetCandidate(absentee)->SetProxy(GetCandidate(proxy));
printf("Simulating election with %c selecting %c as a proxy:\n", absentee, proxy);
}
else
{
printf("Simulating normal election:\n");
}
char winner = Vote('A', 'B');

if(absentee != proxy)
{
printf("When %c was the proxy for %c, %c won.\n\n", proxy, absentee, winner);
GetCandidate(absentee)->SetProxy(NULL);
}
else
{
printf("In a normal election, %c wins.\n\n", winner);
}
}

private:
Candidate *m_candidates[CANDIDATE_COUNT];

std::hash_map<std::string, char> m_matchups;

Candidate *GetCandidate(char c)
{
return m_candidates[c - 'A'];
}

char Vote(char c1, char c2)
{
char resultKey[4] = "XvX";
resultKey[0] = min(c1, c2);
resultKey[2] = max(c1, c2);

auto itr = m_matchups.find(resultKey);
if(itr != m_matchups.end())
{
return itr->second;
}

if(c2 < 'E')
{
// the vote boils down to a choice between the winner of
// these two candidates vs the next candidate
c1 = Vote(c1, c2 + 1);
c2 = Vote(c2, c2 + 1);
}

int c1Score = 0;
for(int i = 0; i < CANDIDATE_COUNT; i++)
{
if(GetCandidate('A' + i)->RawPreference(c1, c2))
c1Score++;
}

char winner = (c1Score >= 3) ? c1 : c2;
printf(" Evaluating %s is same as (%cv%c) = %c\n", resultKey, c1,c2, winner);
m_matchups[resultKey] = winner;

return winner;
}
};

The key to the SimulateElection function is the Vote function. Vote is recursive and hinges on these lines:

if(c2 < 'E')
{
// the vote boils down to a choice between the winner of
// these two candidates vs the next candidate
c1 = Vote(c1, c2 + 1);
c2 = Vote(c2, c2 + 1);
}

If the second candidate passed into Vote is not Eruz, then we know that we need to look ahead to find out what the results of picking each candidate would be in future rounds of voting. This is done by setting the c1 and c2 variables to be the result of recursively calling Vote with (c1, c2+1) and (c2, c2 + 1).

Once I get the results of all future round combinations, I can compare the returned c1 and c2 candidates to see which each of the candidates prefer. That candidate is then considered the winner of that matchup between the two original candidates. (This means that the stored winner of the matchup “BvC” might be “D”.)

With the number of candidates at only 5 we could safely just use the recursive method here and not worry about it, but I use memoization to keep runtime complexity reasonable; I cache previous results using a hash_map so I can quickly look them up.

Finally, once I was able to specify any candidate as an absentee with any other candidate as a proxy, I decided to check all possible absentee/proxy combinations. It should come as no surprise that candidate Eruz can’t win the election no matter what happens; however, candidate Cubio CAN win the election for several of these combinations. Here’s the output of all the different combinations:

Simulating normal election:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
In a normal election, D wins.

Simulating election with A selecting B as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvE) = B
 Evaluating BvC is same as (BvE) = B
 Evaluating AvB is same as (AvB) = B
When B was the proxy for A, B won.

Simulating election with A selecting C as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When C was the proxy for A, D won.

Simulating election with A selecting D as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When D was the proxy for A, D won.

Simulating election with A selecting E as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvE) = E
 Evaluating BvC is same as (EvE) = E
 Evaluating AvB is same as (AvE) = A
When E was the proxy for A, A won.

Simulating election with B selecting A as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (DvC) = C
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvC) = C
 Evaluating AvB is same as (CvC) = C
When A was the proxy for B, C won.

Simulating election with B selecting C as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (DvC) = C
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvD) = D
 Evaluating BvC is same as (DvC) = C
 Evaluating AvB is same as (CvC) = C
When C was the proxy for B, C won.

Simulating election with B selecting D as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When D was the proxy for B, D won.

Simulating election with B selecting E as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When E was the proxy for B, D won.

Simulating election with C selecting A as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (AvD) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = B
 Evaluating BvC is same as (BvD) = B
 Evaluating AvB is same as (AvB) = B
When A was the proxy for C, B won.

Simulating election with C selecting B as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvE) = B
 Evaluating BvC is same as (BvE) = B
 Evaluating AvB is same as (AvB) = B
When B was the proxy for C, B won.

Simulating election with C selecting D as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When D was the proxy for C, D won.

Simulating election with C selecting E as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvE) = B
 Evaluating BvC is same as (BvE) = B
 Evaluating AvB is same as (AvB) = B
When E was the proxy for C, B won.

Simulating election with D selecting A as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = A
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (AvC) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = B
 Evaluating BvC is same as (BvC) = B
 Evaluating AvB is same as (AvB) = A
When A was the proxy for D, A won.

Simulating election with D selecting B as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvE) = B
 Evaluating BvC is same as (BvE) = B
 Evaluating AvB is same as (AvB) = B
When B was the proxy for D, B won.

Simulating election with D selecting C as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (DvC) = C
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvD) = D
 Evaluating BvC is same as (DvC) = C
 Evaluating AvB is same as (CvC) = C
When C was the proxy for D, C won.

Simulating election with D selecting E as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = E
 Evaluating AvD is same as (AvE) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvE) = E
 Evaluating AvC is same as (AvE) = A
 Evaluating BvE is same as (BvE) = E
 Evaluating BvD is same as (EvE) = E
 Evaluating BvC is same as (EvE) = E
 Evaluating AvB is same as (AvE) = A
When E was the proxy for D, A won.

Simulating election with E selecting A as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = A
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (AvC) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = B
 Evaluating BvC is same as (BvC) = B
 Evaluating AvB is same as (AvB) = A
When A was the proxy for E, A won.

Simulating election with E selecting B as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = A
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (AvD) = A
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = B
 Evaluating BvC is same as (BvD) = B
 Evaluating AvB is same as (AvB) = B
When B was the proxy for E, B won.

Simulating election with E selecting C as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = C
 Evaluating CvD is same as (CvD) = C
 Evaluating AvC is same as (DvC) = C
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvC) = C
 Evaluating AvB is same as (CvC) = C
When C was the proxy for E, C won.

Simulating election with E selecting D as a proxy:
 Evaluating AvE is same as (AvE) = A
 Evaluating DvE is same as (DvE) = D
 Evaluating AvD is same as (AvD) = D
 Evaluating CvE is same as (CvE) = E
 Evaluating CvD is same as (EvD) = D
 Evaluating AvC is same as (DvD) = D
 Evaluating BvE is same as (BvE) = B
 Evaluating BvD is same as (BvD) = D
 Evaluating BvC is same as (DvD) = D
 Evaluating AvB is same as (DvD) = D
When D was the proxy for E, D won.

And here’s the entire SecretElection.cpp file:

// SecretElection.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <string>

#define min(a,b) ((a) < (b)) ? (a) : (b)
#define max(a,b) ((a) > (b)) ? (a) : (b)

#define CANDIDATE_COUNT 5

class Candidate
{
char m_id;
int m_scores[CANDIDATE_COUNT];
const Candidate *m_proxy;

public:
Candidate(const char *rankings) : m_proxy(NULL)
{
m_id = rankings[0];
for(int cand = 0; cand < CANDIDATE_COUNT; cand++)
{
m_scores[rankings[cand] - 'A'] = CANDIDATE_COUNT - cand;
}
}
// allow another candidate to vote for me
void SetProxy(const Candidate *proxy) {m_proxy = proxy;}

bool RawPreference(char a, char b) const
{
// if a proxy has been designated, ask it
if(m_proxy != NULL && m_proxy != this)
return m_proxy->RawPreference(a, b);

// otherwise return true if our score for the first candidate is
// greater than the second's score
if(m_scores[a - 'A'] > m_scores[b - 'A'])
return true;
return false;
}
};

class Election
{
public:
Election()
{
const char *c_rankings[] = {
"ABCDE", "BAEDC", "CDAEB", "DBAEC", "EDBCA"
};

for(int i = 0; i < CANDIDATE_COUNT; i++)
{
m_candidates[i] = new Candidate(c_rankings[i]);
}
}
virtual ~Election()
{
for(int i = 0; i < CANDIDATE_COUNT; i++)
delete m_candidates[i];
}

void SimulateElection()
{
SimulateElectionWithAbsenteeAndProxy(0, 0);
}

void SimulateElectionWithAbsenteeAndProxy(char absentee, char proxy)
{
m_matchups.clear();
if(absentee != proxy && absentee >= 'A' && absentee <= 'E')
{
GetCandidate(absentee)->SetProxy(GetCandidate(proxy));
printf("Simulating election with %c selecting %c as a proxy:\n", absentee, proxy);
}
else
{
printf("Simulating normal election:\n");
}
char winner = Vote('A', 'B');

if(absentee != proxy)
{
printf("When %c was the proxy for %c, %c won.\n\n", proxy, absentee, winner);
GetCandidate(absentee)->SetProxy(NULL);
}
else
{
printf("In a normal election, %c wins.\n\n", winner);
}
}

private:
Candidate *m_candidates[CANDIDATE_COUNT];

std::hash_map<std::string, char> m_matchups;

Candidate *GetCandidate(char c)
{
return m_candidates[c - 'A'];
}

char Vote(char c1, char c2)
{
char resultKey[4] = "XvX";
resultKey[0] = min(c1, c2);
resultKey[2] = max(c1, c2);

auto itr = m_matchups.find(resultKey);
if(itr != m_matchups.end())
{
return itr->second;
}

if(c2 < 'E')
{
// the vote boils down to a choice between the winner of
// these two candidates vs the next candidate
c1 = Vote(c1, c2 + 1);
c2 = Vote(c2, c2 + 1);
}

int c1Score = 0;
for(int i = 0; i < CANDIDATE_COUNT; i++)
{
if(GetCandidate('A' + i)->RawPreference(c1, c2))
c1Score++;
}

char winner = (c1Score >= 3) ? c1 : c2;
printf(" Evaluating %s is same as (%cv%c) = %c\n", resultKey, c1,c2, winner);
m_matchups[resultKey] = winner;

return winner;
}
};

int _tmain(int argc, _TCHAR* argv[])
{
Election e;

e.SimulateElection();
for(char absentee = 'A'; absentee <= 'E'; absentee++)
{
for(char proxy = 'A'; proxy <= 'E'; proxy++)
{
if(absentee != proxy)
e.SimulateElectionWithAbsenteeAndProxy(absentee, proxy);
}
}

return 0;
}

Thanks for reading!

UPDATE: For a dynamic programming example of how to solve this, see my next post. And since WordPress has once again destroyed my code formatting, here’s a post with all the latest links to the code.

Quick Follow-up to River Crossing

Thanks to commenter Alex Mahdavi who wrote:

Building on your Bridg-it game, I think of this problem in terms of routes. Either there will be a successful route across the river, or there won’t. Now imagine a tall ship is coming down the river, whose mast is too tall for the ship to pass under any bridges. From the ship’s perspective, it is also looking for a successful route- a navigable route downstream past the islands. Similar to the two sides in Bridg-it, there can either be a successful land route, or a successful water route, but there cannot be both (any land route would block any water route, and vice versa).

Yes, thank you! This is exactly what I was trying to say (but didn’t really spell it out at all) when I brought up Bridg-It and why I suspected that the answer would be 50%. Bridg-It says right on its box “Never a tie – always a winner” which means once all the bridge pieces are played, there will either be a North-South route or an East-West route, but not both. Because of this I suspected immediately that the answer must be 50%.

Here’s a better picture of Bridg-It:

BridgitBoard

Once a piece is played, it is basically saying that either a bridge has survived the storm, or it hasn’t. Either Alex’s tall-masted ship can pass through that space, or a person can walk across the bridge, but not both. From both the North-South and East-West perspectives, the game is played on a grid of NxN+1 “islands.”

This game had been donated to my elementary school along with a bunch of other games, some with pieces missing, that lived in a box in the back of the classroom. It was already old even back then. We played it for a while before we all kind of figured out that whoever goes first wins, so it was neither “new” nor “unpredictable.” Yes, it was flawed from a turn-based-game-play point of view, but as a mathematical model, it was balanced, so 50% seemed likely. Despite its flaws, it did (inadvertently?) introduce us to some logic and geometric concepts. If nothing else, it gave me a clue about a logic puzzle many years later. Board games! They’re important!

UPDATE TO THE UPDATE: I think I’ve got the C++ formatting working properly on the River post after many attempts. WordPress’s editor will sometimes randomly replace all special characters like ” and < and > with their HTML-escape code sequences like &quot; and &gt;. It’s awesome! Every time you edit a post, you have the seemingly random chance that all your code will be turned to gibberish. Anyway, I think it’s okay now, so I won’t be touching that post ever again.

Also, for some reason I thought the title Bridg-It had an exclamation point, but apparently that was my own enthusiasm for the game clouding my perceptions. I fixed it in this post, but I’m not editing that last one.

The 538 Riddler: River Crossing

Well, the answer was published to last week’s Riddler, and I was right. I’m two-for-two so far, so I might as well keep going. This week’s Riddler is a real fun one because I got to do some basic pathfinding. Here’s the problem from Ollie:

You’re on the north shore of a river, and want to cross to the south, via a series of 13 bridges and six islands, which you can see in the diagram below. But, as you approach the water, night falls, a bad storm rolls in, and you’re forced to wait until morning to try to cross. Overnight, the storm could render some of the bridges unusable — it has a 50 percent chance of knocking out each of the bridges. (The chance is independent for each bridge.)

roeder-riddler-diagram-11

Question 1: What’s the probability you will be able to cross the river in the morning? (You have no boat, can’t swim, can’t fix the bridges, etc. No tricks.)

Now imagine a different, wider river, with many more islands — in fact, arbitrarily many. Specifically, imagine that the islands are arrayed in an N-rows-by-N+1-columns grid — similar to before, where N happened to equal two — and connected by bridges to each adjacent island in the same way. Each island adjacent to the shore is also connected by a bridge to the shore. It would look something like this:

roeder-riddler-diagram-21

Question 2: What’s the probability you’ll be able to cross this river in the morning, after the same storm — with the same independent 50 percent chance of knocking out each bridge — rolls through?

Cool!

I read this Friday morning and got to think about it all day at work before I could actually do any work on it. Even before I did any programming I was already pretty sure what the answers to questions one and two were: 50%. This is not because of any great critical thinking on my part, but because of a board game I used to play way back in the day called Bridg-It!

Bridgit

Bridg-It! was one of those games that seem really fun until you discover that whoever goes first always wins, unless a mistake is made. It consists of two sets of interleaved NxN+1 “islands” that belong to each player. Placing a bridge between two of your islands blocks your opponent from placing a bridge between two of theirs. Whoever completes a bridge from their start to destination first is the winner. One player is moving “North-to-South” while the other move “East-to-West.” Each player is playing on an NxN+1 board while defending against an N+1xN board. It’s neat.

Anyway, I suspected that the answer was going to be 50% for any arbitrary NxN+1 group of islands so I set out to test it. One way to handle these “N+1” types of problems is to reduce N to its simplest form first, then work from there. The simplest form for N would be N=0, giving a river with 0 rows of islands and 1 column of bridges, or a single bridge across a river. Obviously if a storm has a 50% chance of knocking out the bridge, you’ll have a 50% chance of crossing in the morning.

So the next step is a river with 1 row of two islands, as below:

RiverCrossing0

Now I decided to modify the story a bit to make it a little more realistic.

A certain governor has decided to block access to bridges in towns in which their mayors refused to endorse him in his reelection campaign. If, on average, 50% of the mayors chose to endorse the governor, what are the chances that commuters will be able to travel from the South bank to the North bank? (Way more realistic than storms knocking out bridges. C’mon!)

Since there are 5 bridges in our map, and each bridge may be blocked (with a traffic cone) or clear, there are 32 (2^5) possible combinations to consider, each equally likely. Here they are:

RiverCrossing

We can quickly count that there are 16 happy commuter faces and 16 angry commuter faces. Bad governor!

Once you get past a 1×2 problem, the number of bridges grows pretty quickly. For 2×3 it grows to 13 bridges and for 3×4 it’s 25 bridges. In fact, the number of bridges is defined as (rows + 1) * columns + (columns – 1) * rows. If you stick to NxN+1 arrangements this becomes N-squared + (N+1)-squared, but there’s no reason to do that, except that it has the interesting (proposed) property of always being 50%.

I wrote a program (of course) to calculate the odds of being able to cross the river given arbitrary island arrangements, even allowing you to specify the failure rate of bridges (or endorsement rate of mayors.)

Calculating whether a path to the goal is present is straightforward for a 1×2 or even a 2×3 arrangement, but gets a little more complicated as the number of bridges increases. I wrote a simple breadth-first search (no need for A* here) to do the trick. Here’s the code:

// RiverCrossing.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <vector>
#include 	<list>
#include <hash_set>

using namespace std;

enum EDir
{
    D_North,
    D_East,
    D_South,
    D_West,
};

struct Island
{
    int m_row;
    int m_col;

    Island(int row = 0, int col = 0) : m_row(row), m_col(col) {}
};

class River
{
public:
    static int BridgeCount(int rows, int cols)
    {
        return (rows + 1) * (cols) + (rows) * (cols - 1);
    }

    River(int rows, int cols) : m_rows(rows), m_cols(cols)
    {
        int bridgeCount = BridgeCount(rows, cols);
        for(int i = 0; i < bridgeCount; i++)
        {
            m_bridges.push_back(true);
        }
        m_goodBridges = bridgeCount;
    }

    void Storm(double destroyPct)
    {
        for(auto itr = m_bridges.begin(); itr != m_bridges.end(); itr++)
        {
            double d = (double)rand() / (double)RAND_MAX;
            if(d < destroyPct)
            {
                *itr = false;
                m_goodBridges--;
            }
        }
    }

    void NthPermutation(unsigned long long perm)
    {
        // we can only handle rivers with <= 64 bridges (5 x 6)         if(m_bridges.size() > sizeof(perm) * 8)
            return;
        m_goodBridges = 0;
        for(int i = 0; i < (int)m_bridges.size(); i++)
        {
            if((perm & (1LL << i)) != 0)
            {
                m_bridges[i] = true;
                m_goodBridges++;
            }
            else
            {
                m_bridges[i] = false;
            }
        }
    }

    bool CanCross() const
    {
        // weed out some obvious cases
        if(m_goodBridges < m_rows) // not enough bridges to go straight across             return false;         if(m_goodBridges > (int)m_bridges.size() - m_cols) // not enough out bridges to block path 
            return true;

        hash_set<int> islandsVisited;

        list<Island> listToVisit;
        listToVisit.push_back(Island(-1, 0));

        while(!listToVisit.empty())
        {
            Island current = listToVisit.front();
            listToVisit.pop_front();

            list<Island> neighbors = GetNeighbors(current);

            for(auto itr = neighbors.begin(); itr != neighbors.end(); itr++)
            {
                if(islandsVisited.find(IslandIndex(*itr)) != islandsVisited.end())
                    continue;
                if(itr->m_row >= m_rows)    // we've crossed the river!
                    return true;
                islandsVisited.insert(IslandIndex(*itr));
                listToVisit.push_back(*itr);
            }
        }

        return false;
    }

    void PrintRiver()
    {
        printf("-----------------------------\n");
        for(int row = 0; row <= m_rows; row++)
        {
            for(int col = 0; col < m_cols; col++)
            {
                Island isle(row, col);
                if(CheckBridge(isle, true))
                {
                    printf(" |    ");
                }
                else
                {
                    printf("      ");
                }
            }
            printf("\n");
            if(row == m_rows)
                break;

            printf("XXX");
            for(int col = 1; col < m_cols; col++)
            {
                Island isle(row, col);
                if(CheckBridge(isle, false))
                {
                    printf("===XXX");
                }
                else
                {
                    printf("   XXX");
                }
            }
            printf("\n");
        }
        printf("-----------------------------\n");
    }

    static double TestCanCross(int rows, int cols, double destroyPct)
    {
        if(rows < 0 || cols <= 0)
            return 0;
        int bridgeCount = BridgeCount(rows, cols);
        printf("Testing river with %dx%d islands, %d bridges, %.2f%% chance of bridge failure:\n",
            rows, cols, bridgeCount, 100.0 * destroyPct);

        int trials = 0;
        int crossings = 0;

        if(bridgeCount <= 25 && destroyPct == 0.5)
        {
            // we can permutate for up to 3 x 4 if destroyPct is 50%
            trials = 1 << bridgeCount;
            crossings = 0;
            River river(rows, cols);
            for(int i = 0; i < trials; i++)
            {
                river.NthPermutation(trials - i - 1);
                if(river.CanCross())
                {
                    crossings++;
                }
            }
            printf("Crossings = %d of %d permutations (%.3f%%)\n\n", crossings, trials, (double)crossings * 100.0 / (double)trials);
        }
        else
        {
            trials = 10000;
            crossings = 0;
            for(int i = 0; i < trials; i++)
            {
                River river(rows, cols);
                river.Storm(destroyPct);
                if(river.CanCross())
                {
                    crossings++;
                }
            }
            printf("Crossings = %d of %d Monte Carlo trials (%.3f%%)\n\n", crossings, trials, (double)crossings * 100.0 / (double)trials);
        }

        return (double)crossings / (double)trials;
    }

private:
    int m_rows;        // vertical islands 
    int m_cols;        // horizontal islands 
    int m_goodBridges;

    vector<bool>    m_bridges;    // (m_rows + 1) * (m_cols) + (m_rows) * (m_cols - 1)

    int    BridgeIndex(const Island &isle, bool vertical) const
    {
        if(vertical)
        {
            if(isle.m_col < 0 || isle.m_col >= m_cols)
                return -1;
            if(isle.m_row < 0 || isle.m_row > m_rows)
                return -1;
            return isle.m_row * m_cols + isle.m_col;
        }
        else
        {
            if(isle.m_col < 1 || isle.m_col >= m_cols)
                return -1;
            if(isle.m_row < 0 || isle.m_row >= m_rows)
                return -1;
            return (m_rows + 1) * m_cols + isle.m_row * (m_cols - 1) + isle.m_col - 1;
        }
    }

    bool CheckBridge(const Island &isle, bool vertical) const
    {
        int idx = BridgeIndex(isle, vertical);
        if(idx < 0 || idx >= (int)m_bridges.size())
            return false;
        return m_bridges[idx];
    }

    int IslandIndex(const Island &isle) const 
    {
        if(isle.m_row < 0)             return -1;         if(isle.m_row >= m_rows)
            return m_rows * m_cols;
        return isle.m_row * m_cols + isle.m_col;
    }

    list<Island> GetNeighbors(const Island &isle) const
    {
        Island next;
        list<Island> neighbors;

        // special case if isle.m_row == -1, then it's north shore
        if(isle.m_row < 0)
        {
            next.m_row = 0;
            for(next.m_col = 0; next.m_col < m_cols; next.m_col++)             {                 if(CheckBridge(next, true))                 {                     neighbors.push_back(next);                 }             }             return neighbors;         }         // get north island         if(isle.m_row > 0)
        {
            // check vertical bridge to current island
            if(CheckBridge(isle, true))
            {
                next.m_row = isle.m_row - 1;
                next.m_col = isle.m_col;
                neighbors.push_back(next);
            }
        }
        // get south island
        if(isle.m_row <= m_rows)         {             next.m_row = isle.m_row + 1;             next.m_col = isle.m_col;             // check vertical bridge to south island             if(CheckBridge(next, true))                 neighbors.push_back(next);         }         // get west island         if(isle.m_col > 0)
        {
            // check horizontal bridge to current island
            if(CheckBridge(isle, false))
            {
                next.m_row = isle.m_row;
                next.m_col = isle.m_col - 1;
                neighbors.push_back(next);
            }
        }
        // get east island
        if(isle.m_col < m_cols)
        {
            next.m_col = isle.m_col + 1;
            next.m_row = isle.m_row;
            if(CheckBridge(next, false))
            {
                neighbors.push_back(next);
            }
        }

        return neighbors;
    }
};

int _tmain(int argc, _TCHAR* argv[])
{
    River::TestCanCross(1, 2, 0.5);
    River::TestCanCross(2, 3, 0.5);
    River::TestCanCross(3, 4, 0.5);
    River::TestCanCross(10, 11, 0.5);
    River::TestCanCross(20, 21, 0.5);

    return 0;
}

This code is a little more “C-plus-plusy” than the previous two problems; I made a class called River which takes parameters of how many rows and columns of islands to add. It then creates the necessary bridges. A Storm() function takes a parameter for how likely it is to destroy a bridge and then it applies that to all the bridges on the River. A CanCross function uses the breadth-first search to find a path if it exists. Finally, a static function called TestCanCross will take those parameters and do an exhaustive test to find how often a path can be found.

A note on programming and probability and permutations and Monte Carlos. If you can “do the math” to find a solution, that’s preferred. For example, if you want to calculate the odds of flopping a set in poker when you hold a pocket pair, it’s best to know that the odds are: 1 – (48 * 49 * 50) / (50 * 51 * 52) = 11.756%. Similarly if you want to calculate the odds of flopping quads, it’s best to know that the odds are: 1 / (50 * 49) but multiplied by 3-factorial (6) because the order of the cards in the flop don’t matter, so 0.244%. If you don’t know the math but know all the permutations (and they are equally likely), then simulating all the permutations is better. Permutations can give you the precise answer, but can be slow.

If you don’t know the math and the permutations are unknown or too vast or not evenly distributed, then the last refuge of the scoundrel simulator programmer is the Monte Carlo, usually with some number of trials like 10,000. Monte Carlos will have more variance with fewer trials, so you need to balance the need for speed with the need for precision. I didn’t really feel like doing the Boolean Algebra to prove this answer, so I did a combination of permutations and Monte Carlos. The number of permutations for a given river is 2^(number of bridges) assuming that the chance of bridge failure is 50%. The TestCanCross function checks to see if the failure rate is 50% and if the number of bridges is less than or equal to 25 (so I could do 3×4). If so, it uses permutations and gives an exact number. A 3×4 configuration results in 25 bridges and 33 million permutations. Anything more than 25 bridges and the program says, “lol, no, it’s Monte Carlo time” and does 10,000 trials. With that in mind, here’s the output:

Testing river with 1x2 islands, 5 bridges, 50.00% chance of bridge failure:
Crossings = 16 of 32 permutations (50.000%)
Testing river with 2x3 islands, 13 bridges, 50.00% chance of bridge failure:
Crossings = 4096 of 8192 permutations (50.000%)
Testing river with 3x4 islands, 25 bridges, 50.00% chance of bridge failure:
Crossings = 16777216 of 33554432 permutations (50.000%)
Testing river with 10x11 islands, 221 bridges, 50.00% chance of bridge failure:
Crossings = 5007 of 10000 Monte Carlo trials (50.070%)
Testing river with 20x21 islands, 841 bridges, 50.00% chance of bridge failure:
Crossings = 4963 of 10000 Monte Carlo trials (49.630%)

This was a lot of fun, although technically I didn’t prove the answer; I just showed that for all tested NxN+1 configurations the answer remains the same. I’m a little embarrassed to admit that I spent more time working on the bad programmer art than I did programming, but what the heck, it was fun. I look forward to seeing the mathematical proof of what I demonstrated.

UPDATED: UUUUUGGGGGGGHHHHHH. WordPress, why are you so awful? I can’t get the spacing right in the code section. I hand edited each line to add spaces, saved, published, and… no change. I’ll try to get it fixed this weekend, but not right now. Copy/pasta then run it through beautify if you are a coder…

Follow-up to the Neurotic Shooter

I think I got this one. Sometime after I posted yesterday about the free throw shooter with the remarkable neurosis, someone tweeted at me that they agreed with me, and that they couldn’t figure out how so many people were getting 50.5%. It seems that Ollie at 538 had tweeted this:

The speculation is that the highest bar (which Ollie confirmed represents the correct answer) is 2/3, the 2nd highest is 50.5%, and who knows what the other two answers are. I’d really like to know. (BTW, if that chart is a Harry Potter Sorting Hat, then the houses are: Ravenclaw, Slytherin, Gryffindor, and Hufflepuff, respectively. Fight me.)

I can fully understand how so many people could get 50.5%. It all comes back to that #define statement from my previous post:

#ifdef THE_WRONG_WAY
 // This code might seem correct but will not correctly "prune" the
 // tree of possibilities
 shots++;
 made++;
#else
 // now ensure that we make the "seen" shot (eg 99)
 if(SimShot(shots, made) != NS_Made)
 return NS_Invalid;
#endif

I showed that code to a colleague and it took some convincing that “throwing out” half the trials because they resulted in a missed 99th shot really was the correct method. Yes, each path is equally likely, but the fact that we know the 99th shot was made indicates that it is much more likely that the player has had more success in the past.

I can imagine people taking several paths to an answer for this one. Some percentage would immediately say “2/3, easy” and not think anything of it, and it would turn out they are right. Another group would write some sort of simulator and would get to that 99th shot and just credit it to the shooter, and they’d get 50.5%. And still another group would take the approach I’ve outlined and come up with 2/3 again.

Then there’s the other two answers, and I have no idea what happened there. Gryffindor and Hufflepuff, what can you do, amirite?

(Having said that, I still won’t be 100% sure I am right until I see a better explanation.)

The 538 Riddler: The Neurotic Shooter

What are the chances that I will write two blog posts only a week apart? Well, last week’s post was a lot of fun (for me) so I decided to do another one. In that post, I wrote some code to answer FiveThirtyEight.com’s riddle about a sadistic car salesman. As I wrote last week, I wasn’t entirely sure my solution was correct, but it turned out that it was, and so I got a shout-out in this week’s Riddler (at the bottom). There were also some comments from people who had independently came to the same answer that I did, and I really enjoyed reading them, so thanks for those.

This week’s Riddler concerns a basketball player with a remarkable neurosis:

A basketball player is in the gym practicing free throws. He makes his first shot, then misses his second. This player tends to get inside his own head a little bit, so this isn’t good news. Specifically, the probability he hits any subsequent shot is equal to the overall percentage of shots that he’s made thus far. (His neuroses are very exacting.) His coach, who knows his psychological tendency and saw the first two shots, leaves the gym and doesn’t see the next 96 shots. The coach returns, and sees the player make shot No. 99. What is the probability, from the coach’s point of view, that he makes shot No. 100?

Okay, two things right off the bat. First, according to the rule specified, any time this player made a single shot, that should determine ALL future shots. He would either make 100% or miss 100%, and it wouldn’t be possible to make the first one and miss the second one. But that would be pretty boring, so let’s ignore that.

Second, after reading the problem my immediate reaction was, “Oh, it’s 2/3, or 66.7%. That seems obvious. It even says, ‘from the coach’s point of view’ so that’s what it is. That’s too easy; it must be wrong.”

After thinking about it some more, I decided that it might be more likely that the answer was something like 50/99, or 50.5%. So I wrote a Monte Carlo program to simulate this neurotic hoopster and ran it. And the answer is… 2/3. Or at least that’s what I think it is. Here’s the code:

// NeuroticShooter.cpp : Defines the entry point
// for the console application.

#include "stdafx.h";
#include <stdlib.h>;
#include <time.h>;

enum EShotResult {
    NS_Miss,
    NS_Made,
    NS_Invalid,
};

int SimShot(int &shots, int &made)
{
    // take a random number between 0 and shots so far (eg 2)
    // if  less than the number we have currently made (eg 1)
    // then we made it
    int shotValue = rand() % shots;
    shots++;
    if(shotValue < made)
    {
        made++;
        return NS_Made;
    }
    return NS_Miss;
}

int MakesLastShot(int thirdSeen, int lastShot)
{
    // we start with the assumption that the first two
    // shots were taken and one was made
    int shots = 2;
    int made = 1;

    // now we simulate the unseen shots
    while(shots < thirdSeen)
    {
        SimShot(shots, made);
    }

#ifdef THE_WRONG_WAY
    // This code might seem correct but will not correctly
    // "prune" the tree of possibilities
    shots++;
    made++;
#else
    // now ensure that we make the "seen" shot (eg 99)
    if(SimShot(shots, made) != NS_Made)
        return NS_Invalid;
#endif

    // simulate any additional unseen shots
    // (this will be 0 in given problem)
    while(shots < lastShot)
    {
        SimShot(shots, made);
    }

    // now simulate final shot and return its success
    return SimShot(shots, made);
}

void MC_Trial(int thirdSeen, int lastShot)
{
    int trials = 10000;
    int made = 0;
    int validTrials = 0;

    for(int i = 0; i < trials; i++)
    {

        switch(MakesLastShot(thirdSeen, lastShot))
        {
        case NS_Invalid:
            break;
        case NS_Made:
            made++;
            // fall through!
        case NS_Miss:
            validTrials++;
        }
    }
    printf("Coach sees: Make, Miss, [%d unseen], Make, [%d unseen].\n  Player then makes last shot %.3f%% of time.\n\n"
        , thirdSeen - 2, lastShot - thirdSeen - 1, (double)made * 100.0 / (double)validTrials);
}

int _tmain(int argc, _TCHAR* argv[])
{
    srand((unsigned int)time(NULL));

    // shot indices are 0-based, so "2" = third shot,
    // "98" = 99th shot, "99" = 100th shot
    MC_Trial(98, 99);

    return 0;
}

And here’s the output:

Coach sees: Make, Miss, [96 unseen], Make, [0 unseen].
 Player then makes last shot 66.379% of time.

So, yeah, 2/3 (give or take some expected Monte Carlo variance). Now if you look at that code you can see a #ifdef compiler directive that makes a BIG difference in what we get as an answer.

#ifdef THE_WRONG_WAY
 // This code might seem correct but will not correctly "prune" the
 // tree of possibilities
 shots++;
 made++;
#else
 // now ensure that we make the "seen" shot (eg 99)
 if(SimShot(shots, made) != NS_Made)
 return NS_Invalid;
#endif

If we simulate 98 free throws, then “give credit” for a made shot, then calculate the odds of making that last free throw, then the answer is… 50.5%. But that’s not right. At least, I don’t think it is.

Instead, we need to simulate the 99th free throw and throw out the entire series if it is missed. We only count trials where the 99th free throw is made “naturally.” Why? Well, because we are trying to simulate what will happen by following these rules. And the universe in which the player makes the 99th free throw is different than the universe where he misses it, and our simulation needs to exist in only the universe where he makes it.

Think of it this way. Each shot that the player takes presents a fork in a big tree of nodes representing the multiverse of possibilities. And we need to prune all the universes where that 99th free throw was missed. That’s why our function returns 3 values instead of just two. Some of the simulations are marked as invalid, so we don’t count them in our total number of simulations. And when we only count the simulations where our player “naturally” makes the 99th free throw, he goes on to make the 100th free throw 2/3 of the time.

Is that the correct way to look at this? I think so, but maybe we’ll find out next Friday that I was wrong.

There are some other interesting (for certain nerdy definitions of “interesting”) things about this riddle. First, if my method is the correct one, it doesn’t matter at what point the coach walks back into the gym to observe the third shot. For example, if the coach sees two shots, leaves for 48 shots, comes back to observe a made shot, then leaves for another 48 unseen shots, the chances of the player hitting that 100th shot will still be 2/3.

This can be illustrated by calling the MC_Trial function with different values:

 MC_Trial(50, 99);
 MC_Trial(2, 99);

Output:

Coach sees: Make, Miss, [48 unseen], Make, [48 unseen].
 Player then makes last shot 68.030% of time.

Coach sees: Make, Miss, [0 unseen], Make, [96 unseen].
 Player then makes last shot 66.192% of time.

(Again, the answer is 66.7% with some Monte Carlo variance.)

Also, because of the nature of this particular player’s neurosis, when you get to the end of the 100 shots, the chances for any particular “node” on the tree, or rather the chances that the player will score any particular score between 1 made shot and 99, are exactly equal. That’s pretty counter-intuitive to think that a player has an equal chance to make 1/100 and 99/100 shots, but once again this is a remarkable player.

I wrote a function to prove this. It’s a decent example of dynamic programming, where the results of the next round of data is built using data from previous rounds. Here’s the code:

void PerformanceOdds()
{
    int shots = 2;
    double chanceN[101] = {0, 1, 0};
    // after 2 shots, 0% chance of 0, 100% chance of 1, 0% chance of 2
    for(; shots < 100; shots++)     
    {         
        for(int n = shots; n >= 1; n--)
        {
            // the chance of getting to n made shots by missing from prev state
            double missChance = chanceN[n] * (double)(shots - n) / (double) shots;
            // the chance of getting to n made shots by making it from prev state
            double hitChance = chanceN[n - 1] * (double)(n - 1) / (double)shots;
            chanceN[n] = missChance + hitChance;
         }
    }
    for(int i = 0; i <= 100; i++)
    {
        printf("Chance of shooting %d/%d = %.3f%%\n", i, 100, 100.0 * chanceN[i]);
    }
}

And the output:

Chance of shooting 0/100 = 0.000%
Chance of shooting 1/100 = 1.010%
Chance of shooting 2/100 = 1.010%
Chance of shooting 3/100 = 1.010%
Chance of shooting 4/100 = 1.010%
Chance of shooting 5/100 = 1.010%
….
Chance of shooting 95/100 = 1.010%
Chance of shooting 96/100 = 1.010%
Chance of shooting 97/100 = 1.010%
Chance of shooting 98/100 = 1.010%
Chance of shooting 99/100 = 1.010%
Chance of shooting 100/100 = 0.000%

Finally, I tweeted this to @ollie at 538:

What I meant by that is that in some ways, what we have simulated here is a pollster taking a random sample of a population. Before taking the poll, each outcome is equally likely, but once we take that sample, we expect that future responses (and actual opinion) will be consistent with that sample, within an expected margin of error. If our sample is truly random, that is what we can expect, but if not, well, our results might show 50.5% when the answer is really 66.7%. Since polls, and samples, and people’s expectations of (and reaction to) them is of great concern to FiveThirtyEight, I thought maybe this week’ Riddler might be a bit pointed.

Then again, I could be wrong!

NOTE: WordPress really sucks for displaying formatted C++ code. I’m trying to figure out how to get it to display properly but haven’t had much luck. If anyone has any tips, I’d appreciate it. I think I figured it out. Still not thrilled with it, but at least it’s readable now.