The Iterated Iterated Iterated Prisoners Dilemma

Round Four Results

1 WoodsieLordOTFT 116327 55.14%
2 WoodsieLordTFT 0.8 115396 54.70%
3 Oveyeta_mixt 113203 53.66%
4 GRIM 112275 53.22%
5 Oveyeta 108260 51.32%
6 TITFORTAT 108205 51.29%
7 nice_guy 105280 49.90%
8 TITFOR2TAT 104843 49.70%
9 Secret 104157 49.37%
10 PAVLOV 102024 48.36%
11 COOP100 97461 46.20%
12 NOISYTFT 87933 41.68%
13 COOP0 82382 39.05%
14 GLOBAL 79168 37.53%
15 COOP30 78888 37.39%
16 ALTERNATE 74424 35.28%
17 COOP70 73753 34.96%
18 COOP50 73694 34.93%

Round Three Results

1 WoodsieLordOTFT 108596 54.96%
2 WoodsieLordTFT 0.8 107155 54.23%
3 GRIM 103240 52.25%
4 DoctorWho 101060 51.14%
5 TITFORTAT 100650 50.94%
6 TITFOR2TAT 97153 49.17%
7 Secret 96759 48.97%
8 nice_guy 96673 48.92%
9 PAVLOV 94963 48.06%
10 COOP100 89760 45.43%
11 NOISYTFT 87434 44.25%
12 COOP0 79332 40.15%
13 COOP30 78610 39.78%
14 ALTERNATE 77893 39.42%
15 GLOBAL 77793 39.37%
16 COOP50 74511 37.71%
17 COOP70 70925 35.89%

Round Two Results

1 WoodsieLordTFT 0.8 114080 55.81%
2 WoodsieLordOTFT 113794 55.67%
3 GRIM 108613 53.14%
4 Secret 106414 52.06%
5 TITFORTAT 105669 51.70%
6 DoctorWho 105565 51.65%
7 TITFOR2TAT 100749 49.29%
8 PAVLOV 99303 48.58%
9 Everywheel 97618 47.76%
10 COOP100 94410 46.19%
11 NOISYTFT 92946 45.47%
12 COOP0 90916 44.48%
13 COOP30 85070 41.62%
14 ALTERNATE 83975 41.08%
15 GLOBAL 83394 40.80%
16 COOP50 79011 38.66%
17 COOP70 76016 37.19%

Round One Results

1 WoodsieLordOTFT 107561 54.77%
2 Secret 106888 54.42%
3 WoodsieLordTFT 0.8 106789 54.37%
4 GRIM 104420 53.17%
5 DoctorWho 99812 50.82%
6 PAVLOV 99445 50.63%
7 TITFORTAT 99257 50.54%
8 TITFOR2TAT 94604 48.17%
9 COOP100 88977 45.30%
10 NOISYTFT 87462 44.53%
11 Everywheel 82868 42.19%
12 COOP30 81630 41.56%
13 COOP0 80548 41.01%
14 ALTERNATE 78527 39.98%
15 GLOBAL 77105 39.26%
16 COOP50 76837 39.12%
17 COOP70 72244 36.78%

Simple algorithms included:

  • ALTERNATE: Swaps between cooperating and defecting.
  • COOPn: Randomly cooperates each play with probability n (so COOP0 always defects, COOP100 always cooperates, COOP50 is a coinflip).
  • GLOBAL: Randomly cooperates each play with probability equal to the proportion of cooperations by opponents seen in all rounds so far.
  • GRIM: Never defects first, but if the opponent ever defects, will defect on every remaining game in that round.
  • NOISYTFT: Plays Tit-for-Tat, but changes its mind 10 percent of the time.
  • PAVLOV: If it scored 3 or 5 points last round (before multiplying by the factor), make the same play; otherwise, make the opposite play.
  • TITFORTAT: Starts by cooperating, then does whatever the opponent did last time.
  • TITFORTWOTAT: Cooperates unless the previous two moves by opponent were both defection.

Selected content of the original email:

The iterated prisoners dilemma is interesting because it allows for notions of trust and reputation to be built up – adding an extra dimension to the standard PD, whereby mutual cooperation can arise instead of mutual defection as a best strategy.

The tournament we will be taking part in goes a step further – with five ’rounds’, each comprised of a 200 ‘game’ iterated prisoners dilemma. An iterated iterated prisoners dilemma, if you will. Now it’s possible to learn IPD strategies – given your behaviour in round one, I might approach you differently in round two. But you’re still going in facing an unknown field of opponents.

So; I am proposing an iterated iterated iterated prisoner’s dilemma. That is, I am willing to run a series of IIPD tournaments; once a week until the submission deadline. In this way, we can investigate the metagame – whilst strategy A might be the best response to A and B, if everyone seems to be playing C and D, maybe A isn’t the right thing to do any more.

Here’s how it will work. Each Monday evening1 I will run a tournament (5 round, 200 games, as usual) with the latest version of code sent to me by each participant. I will make the overall leaderboard available to everyone here; and for each participant, I will send them the full history of all the plays they were involved in. That is, for every other player in the tournament, you will receive 1000 data lines in csv format: your identity, their identity, the game (n), their play, your play, and the factor that applied for that game.

But wait, I hear you cry – won’t that mean that Graeme gets to see everyone’s code, and how everyone is doing against everyone else?

Absolutely. In a fair world, I would of course give everyone all of the data. Instead, I am offering a new game 🙂 Your choices are to play – and get as payoff O(n) information about your possible competitors, whilst I get O(n^2) information; or not to play, in which case we both get nothing. You may find the ultimatum game2 instructive. I claim your optimal strategy is to play.

As additional incentive, I will include various standard programs – TitFor(Two)Tat, Grim, assorted random strategies etc. so you can see how you perform against those. Since I am running a 96% average for CORF, I am at this stage more interested in an entertaining game than scoring highly, so you may consider me a relatively neutral gamesmaster3 . I expect and encourage rival leagues to form instead.

I should probably come up with some rules, and reserve the right to modify these as I see fit. For starters:

  • I will require your source code, as a java file (not a compiled class). This is because I can’t be bothered to set up a bulletproof virtual server to run arbitrary code, so this will work somewhat on the honour system. If I suspect that your program is trying to erase my holiday photos, cause an international diplomatic incident, or solve the halting problem, then I won’t run it. Consider this an exercise in good code documentation – although I don’t need to understand your prisoner’s dilemma algorithm, I need be confident it’s not going anywhere near the filesystem / a comms device / the pentagon, and will err on the side of caution here.
  • You should probably change the string your program uses to something other than your matriculation number. One, I’m not sure if I should really be harvesting those; Two, it’ll mean that if someone figures out your algorithm, they still have to spot you in the real competition before they can use their clever counterattacks. Keep it clean4, and no more than, say, ten characters long. Using someone else‘s matriculation number would be particularly sneaky, so is definitely banned.
  • Standard algorithms will be entered with reasonably descriptive names; you are of course welcome to enter one of those, since the relative popularity of an algorithm can alter how well it does.
  • You don’t have to use the algorithm you’re going to use in the real thing, and can change every week – in fact, that’s half the fun! But you should encourage others to be consistent, so that you’re chasing the right target. Tricky. I leave this to you.
  • Only one entrant per person per IIPD. If you’re playing, I’ll use the latest code you supplied by email prior to 5pm on the monday. If you decide later on not to play, that must also be communicated to me by 5pm of the appropriate week (but will persist into following weeks).
  • First tournament will be a week today (Monday January 20th); then Jan 27th, Feb 3rd, Feb 10th, with results in time for hand-in Feb 14th.
  • Above all, remember that the point is to have fun; if deliberate griefing makes this not-fun from my perspective, I’ll stop (but keep all your code).

That covers it for now. To prevent continual spam of the mailing list, I will be maintaining this page for further rule changes as I see fit, plus the tournament rankings as they arise. Individual logs will be distributed by email, so send me your code from the account you want the results to go to.

1 Or thereabouts, depending on unexpectedly good social life or unhelpfully punishing coursework deadlines.
3 If you believe this for one second, you are not yet ready for the cutthroat world of the IIIPD.
4 At least, not offensive in any languages I know how to be offensive in.

Leave a Reply

Your email address will not be published.