Game Week 3 Review: Game Theory, Guild Strikes, and Governors of Goa
Also, how to really (not) read a Mimir scorecard.
It's Thursday folks, the absolute last day one can put out an analysis post for a quiz that happened the week before, without it becoming a total sham.
Someone said some really nice things to me today about these quiz review posts, so I've suddenly got a lot of energy to write this. Let's go go go.
The data from Game Week 3 threw up some interesting stuff about seat distribution, and this gives me a chance to do a quick digression.
Update: Having written it out, it is not quick at all, it is quite indulgent. If you have little interest in reading a Mimir scorecard and are only here for the quads and muskets, jump ahead to Stats from the Week.
The Blind Non-Science of Seat Balancing
This is what a typical Mimir scorecard looks like.
This one isn't from a league game, it's from a friendly set played as part of the B612 Friendlies Community [1].
A quick explainer of the scorecard:
Pts - Points scored in the quiz
BAs - Bonus Attempts, or number of times the player attempted to answer a passed question.
Pos - Position, or rank in the quiz
Own - The number of 'direct' questions answered correctly. A typical Mimir quiz has 15 direct questions for each seat [2].
BPts - Bonus Points, or the points accrued from passed questions. So this can never be more than BAs.
Xs - Questions that went unanswered in the game
rPOX - A summary of the ongoing round: Points, Owns and Xs
pPOX - A summary of the previous round: Points, Owns and Xs
If you've been doing Mimir quizzes for a while, you've probably heard people treat the Xs column as a rough indicator of how balanced the set is between different seats. People take their quiz results quite seriously, and some get extremely annoyed by any indication of having been subjected to 'the tough seat'. Some go further and use this data as a reason to be rude to setters, readers and their fellow players, but that's a larger problem that spans the world of quizzing. Dicks abound, and we will ignore them for now.
The use of average Xs from a large sample set of games to check seat balancing may have some merit, but data from a single game can very frequently be completely useless or even misleading. Let's see why. Back to our scorecard.
First up, some disclaimers:
This game was a 'friendly'. You can tell from the low Bonus ratios, with Arnold and Nayan scoring only 2/9 and 3/7, and Utkarsh going absolutely berserk with 20 guesses on passed questions. These numbers are unlikely to show up in a league game. These guys were playing for fun, without necessarily thinking too much about winning or losing.
Since it's a friendly, the group itself might be unbalanced too. While they're all great quizzers in their own topics, in an average general quiz there's a chance that Arnold and Utkarsh might do a little better than Vivek and Nayan. Again, they're playing for fun here. In a league game, games tend to be more competitive especially later in the season, since groups are made on the basis of past performances and aren't completely random.
Finally, this is a topical set. It's Ankit Vohra's India-themed set. Very much in Utkarsh's wheelhouse, and not really in anyone else's. Naturally, this affects the scores.
First, let's look at the Xs. Utkarsh's seat 4 saw 9 unanswered questions, while Vivek and Arnold had only 5 each. On the surface that makes it seem as if Seat 4 is the tough seat, while Seat 1 and 2 are easier.
In fact, in this game, Utkarsh almost scored a 'no-hitter' as I've heard some people describe it (a baseball reference I don't fully understand). This means he answered 5 of his directs, and of the remaining 10 that he missed, 9 went unanswered! So there was only 1 direct question among Seat 4's directs that Utkarsh couldn’t answer, but someone else in the group could.
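If you like seeing the bookkeeping spelled out, here is a minimal sketch of that arithmetic. It assumes every direct ends up in exactly one of three buckets (answered by the seat's own player, answered by someone else on the pass, or left unanswered), and the function name `taken_by_others` is made up for illustration; it is not anything Mimir itself exposes.

```python
# Assumption: 15 directs per seat, as in the scorecard explainer above.
DIRECTS_PER_SEAT = 15


def taken_by_others(own: int, xs: int, directs: int = DIRECTS_PER_SEAT) -> int:
    """Directs the seat's player missed but another player answered on the pass."""
    stolen = directs - own - xs
    if stolen < 0:
        raise ValueError("own + xs cannot exceed the number of directs")
    return stolen


# Seat 4 in the scorecard discussed above: 5 Owns and 9 Xs.
print(taken_by_others(own=5, xs=9))  # 1
```

The smaller that number is relative to the Xs, the closer the seat is to the 'no-hitter' pattern described above.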
In reality, a game like this is heavily affected by the presence of someone who is, well, really really good. If you were in this game, your own confidence in your guesses would be swayed if Utkarsh didn't get the answer right. Self-doubt creeps in, and you might pass on something you might otherwise have taken a shot at.
For this reason, if someone's got a sizeable lead in a Mimir, their Xs are likely to be higher too, since other players are reluctant to waste a BA on something that's "too tough for the stalwart". This effect would be even more pronounced in a league game, where players are incentivized to play for 2nd or 3rd positions.
It doesn't end there. Utkarsh's expertise in Indian topics doesn't just push up his own Xs, it also actively pulls down everyone else's. Plenty of questions that would've gone unanswered in a group consisting of Vivek, Arnold, Nayan and, say, me, were quickly swept up by Utkarsh at the end of the passing order. He's got 14 Bonus Points for a reason, those are all questions that would've added to the Xs for other seats if he wasn't playing.
This is also why you've probably sometimes thought during a quiz that your seat was harder than the others ("I know everyone's directs but not my own"), but were annoyed to find that the Xs are still balanced.
Okay, so does that mean that Xs are an indicator of who's doing particularly well on the quiz? Well, no, it's only a trend at best. A dominant performance only tends to rack up Xs and snatch them from others, but this Stalwart Effect is only half the story. If a particularly weak player were to replace Utkarsh in Seat 4, you'd likely see a higher number of Xs there too. Why? Because Mimir quizzes punish BAs by pushing the players who take them to the back of the passing order. So a high-risk answer can sometimes only be attempted by the player that it is directed to. If that player passes, all other players would rather pass as well than risk another BA. You can try this out for yourself. Run a Mimir quiz for 3 people and keep one seat empty (happens in league games when someone ditches at the last second, we've all been there). Watch as the Xs rack up.
Alright Harman, so what are you saying, Xs are high if the player is having a good day, and also if the player is having a bad day. So Xs are just always high?
Of course not. Xs are high in those two cases, and as an observer, you're never really going to know which one of the two it is. It's a guess at best, and if it's based on one game, then it's probably wrong.
Here's another scorecard, FOR THE SAME QUIZ.
Suddenly Seat 4 is the easiest one, and Seats 1 and 2 are way harder? Nah. The only real insight from this scorecard is that Subrat is a f*cking tank. Everything else is bull.
What about Owns, are those indicative of seat difficulty?
Unfortunately, no. Like Xs, Owns of each seat are affected by the strength of the player occupying it, but unlike Xs, they aren't affected by other players. So Owns are a great way to measure the objective strength of players... assuming all seats are balanced. But that's a variable too! Now what?
In league games, or at least in later-stage league games where groups usually contain players of roughly equal strength, Owns can give you a rough idea of seat difficulty.
But the number of other factors affecting this is still frankly scandalous. International players are hit by 2 India quads per week, and everyone benefits from the occasional "quad on a topic I like". Many a game later in a season can still play very lopsided, and that doesn't just mean the seat balancing was terrible. Every quizzer has a bad day. In fact, let's be honest, most quizzers have 3+ bad days a week, or at least bad by their definition. The topic distribution for that week, the Wikipedia pages you browsed in the days leading up to the quiz, and the fight you had with your spouse just before joining the Zoom call - all of these things have a bearing on the Owns and Xs. Good luck sorting that out.
In summary:
Don't conclude anything from the stats of a single game. You are wrong.
Don't conclude anything from the stats of games held early in a season. You are wrong.
Don't conclude too much from Xs or Owns, even as averages of many games. You might still be wrong.
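For anyone who does want to look at averages, here is a rough sketch of the per-seat calculation. The Xs below are entirely made up for illustration (they are not taken from either scorecard), and `average_xs_per_seat` is a hypothetical helper, not part of any Mimir tool.

```python
def average_xs_per_seat(games: list[list[int]]) -> list[float]:
    """Average the Xs for each seat across games.

    Each game is a list of Xs with one entry per seat, Seat 1 first.
    """
    # zip(*games) regroups the numbers seat by seat instead of game by game.
    return [sum(seat_xs) / len(seat_xs) for seat_xs in zip(*games)]


# Hypothetical Xs from two games of the same set, NOT real scorecard data.
games = [
    [5, 5, 7, 9],  # game 1: Seat 4 looks like the 'tough seat'
    [9, 8, 6, 4],  # game 2: now Seat 4 looks like the easy one
]
print(average_xs_per_seat(games))  # [7.0, 6.5, 6.5, 6.5]
```

Two games that each look lopsided on their own can average out to something nearly flat, which is why a single scorecard tells you so little. Even then, as the third point says, the average might still be wrong.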
Quiz league setters and editors need to do this balancing dance every week, and it always amazes me that they're able to get it 'right' so often. The first game of the week normally gets a lot of setters spectating, because we learn a lot from watching people attempt our questions in the context of an actual quiz. There's no knowing how the ordering or phrasing or reading aloud will affect things, and we want to know whether we should make any last-second changes. If you've written a friendly and see an unbalanced X ratio after one game, I urge you not to make any hasty edits. Let it be. They're quizzers, they'll deal with it.
Stats from the Week
Phew. All that for what? Well, something very interesting happened this week. The average Owns and Xs per seat were unremarkable, but holy shit look at that Musket count.
A musket is when a player answers all 4 questions on a topic, i.e. their own direct, as well as 3 passed questions. It's a rare achievement for most quizzers, since it's a combination of both skill and luck. You need to know the 4 answers, but you also need the questions to pass to you so you can answer them.
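As a small sketch of that definition: if you record who answered each question in a quad, a musket is simply the case where all four entries are the same player. The data layout and the name `musket_scorer` are assumptions made for this example, and the player names are placeholders.

```python
from typing import Optional


def musket_scorer(quad_answers: list[Optional[str]]) -> Optional[str]:
    """Return the player who answered every question in the quad, if any.

    quad_answers[i] is the name of whoever answered question i correctly,
    or None if the question went unanswered (an X).
    """
    if not quad_answers or None in quad_answers:
        return None  # an X anywhere in the quad rules out a musket
    first = quad_answers[0]
    return first if all(name == first for name in quad_answers) else None


# Placeholder names, not real players.
print(musket_scorer(["P1", "P1", "P1", "P1"]))  # P1
print(musket_scorer(["P1", "P2", "P1", "P1"]))  # None
```

The second call shows the luck part: one other player answering their own direct is enough to rule the musket out, however much P1 knew.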
So how the hell did Seat 1 throw up 6 muskets?
Muskets are affected by more than just seat balancing. Often the only thing stopping you from scoring one is when another player gets a particularly easy direct question on the topic. So, for the highest chance of scoring a musket, you need to be on the seat with the easiest question in that quad.
3 of the 6 muskets on Seat 1 were for the same quad: Iraqi Governorates. And Seat 1 had the L1 question for that quad too, Babylon. This probably contributed to the high count, although it only accounts for 3 of the 6. Even if the ratio had been 3:0:1:1, that would still be high, and none of the other musket quads on that seat had easy L1s. Oh well, data can only take you so far.
Enough numbers, let's look at these quads.
1. Sensitising Roald Dahl
That's right, Roald Dahl turned out to be a monster. Oompa Loompas were originally described as the black natives of a country in Africa, from where they were shipped by their gracious master "for their own good".
The L2 question played slightly easier, but it was also the last-appearing question in the quad, at which point you're probably trying to recall all the Roald Dahl characters you know and Oompa Loompas fit better than ever. But yes we made the surname mistake again. Dahl isn't an L1 just cos he's Dahl, he should've been at L2.
I'm not entirely sure why Kipling/Conrad played as easy as it did. I certainly wouldn't have been able to answer it, but 44% of players did.
🎯 Priyamvadha Shivaji scores a musket, or a perfect 4/4 score in this quad!
2. The Science of Sleeping
After that whole spiel about how all data is useless, you'd think I'd place less weight on these numbers too, but I still love a smooth(ish) gradient when I see it.
3. Yeh Tara, Woh Tara...
A message from setter Utkarsh Rastogi:
My daughter, Tara, turns 1 this week. I played a B612 quiz on the day she was born. Now that I am one of the setters, I thought of celebrating her birthday by having a quad on Tara (meaning 'star'). Here it goes. Only this year, next year she will be a participant.
By a country mile the cutest quad we've ever had, but that didn't keep us from some major slip-ups in levelling. Guanyin/Kannon was far and away the hardest question, but we were expecting Tumbi to play even harder. In the end, accepting the alternate answer (Tumba) might have affected this, since the extra 'coastal village sending things to the stars' hint points quite directly to Thumba.
Which single-stringed (Ek Tara in Hindi) folk instrument from Punjab region is a staple of Bhangra in Western music spheres? Featuring in Missy Elliott's Get Ur Freak On, its most popular usage is at the start of this UK Bhangra hit. Its name is phonetically similar to a coastal village in South India known for sending things to the stars (literally *tara*).
Answer: Tumbi, or Tumba
There was no reason at all for the sound clip in this question to be 30 seconds long, but we weren't going to miss an opportunity to play this in the middle of a quiz. If you read a quiz this week, you got to watch some people break into half-conscious head-bobbing. Always fun.
I thought the L3 was a fantastic question too.
In Irish history and mythology, the Hill of Tara (Irish: Cnoc na Teamhrach) is identified as the inauguration place and the seat of the High Kings of Ireland. This probably explains the choice of the name Tara as the home of which fictional family of Irish descent in a 1936 novel?
Answer: O’Hara
Here's an extra question for you:
Tara Deshpande was an Indian actress who acted in a few indie films such as Is Raat Ki Subah Nahin (1996) and __(6) Boys (1998) where the first word is the earlier name of an Indian city. While this latter film flew under the radar, what became highly popular was the song __(7) written and sung by Javed Jaffrey. What is the name of the song which was a play on the new name of the same city and the protagonist's brush with an underworld don there?
Answer: MUMBHAI from Bombay Boys
🎂 Happy Birthday Tara! Can’t wait for you to grow up and read this and be like “Sheh, Papa is so lame, my god”
4. ASI's Mohenjo-Daro Excavation
I loved this quad, even though I scored 1/4 in it myself, simply because it gave me names for the figures I'd seen in my school textbooks years and years ago. Dancing Girl and Priest-King are fairly memorable names with good clues, so I hope this'll come in handy someday.
Fun fact: The Priest-King sculpture was returned to Pakistan only in 1972 after the Shimla Agreement. According to one story, Indira Gandhi outright refused to return both the Dancing Girl and Priest-King, and asked Pakistan President Zulfiqar Ali Bhutto to choose! He picked the larger one, and the Dancing Girl is still on display in the National Museum in Delhi.
@B612ers in Delhi, if you visit the National Museum any time soon, could you send us a selfie? Thanks.
5. Iraqi Governorates X History
Another smooth gradient, this time in a fairly challenging quad. Babylon was straightforward thanks to the mention of Babil in the question, and Saladin is a low-risk guess. But Karbala and Ninawa are just names you'd need to know, so well done for pulling those out.
🎯 Three players picked up muskets in this quad. Congratulations to Pranjal Agrawal, Akshay Gurumoorthi, and Eric Mukherjee!
6. Pavarotti & Friends
It's always fun to throw some music into these quizzes. Makes for a nice break for the readers, and the players usually enjoy themselves too.
The other questions played alright, and I thought Pavarotti made huge improvements to whatever the Spice Girls were singing. But we were caught completely off-guard by how many valid guesses exist for ID-ing Bryan Adams' voice. I heard Steven Tyler, Bono, Bruce Springsteen, and many others, all of which were really quite reasonable. Like all good questions though, this one becomes extremely obvious once you've been shown the answer, and it seems hard to believe that anyone can miss Bryan Adams' voice. The mistake was simply that the sample set was a little too large, and a few extra hints in the question would certainly have helped.
Okay now watch him again.
🎯 Abel scored the only musket in this quad, recognising all 4 of the voices!
7. Viceroys & Governors of Portuguese India
Oof, time for a major levelling problem again. Our history-loving setting team thought Afonso de Albuquerque would be a familiar name to most players, but it was only answered by a quarter of those who attempted it.
In contrast, the quad intro to Portuguese Goa and the distinction of Bernardo Peres da Silva being the only native Viceroy was answered correctly by practically everybody.
At least the L4 was on point.
8. TV Shows Affected by the Writers Guild Strike
A better gradient, but still not smooth!
FUBAR pulled the old question-setter trick of dropping the expanded form of the acronym answer into the question, but it was far easier for people to add 20 to That '70s Show, even those who'd never heard of the sequel.
Sandman is a good guess for folks who were able to recognise Neil Gaiman, but Good Omens was another and that probably pulled the correct percentage down a bit.
I thought Daredevil: Born Again was a pretty solid quiz question, and it played well for an L4. The comic character is famously religious and the new show is a reboot, so the subtitle works in many ways.
9. Parquet Flooring Styles
A fascinatingly specific quad topic, and it played alright in terms of difficulty.
Most people hadn't heard of the phrase Underwater Basket Weaving, so that hint actively dissuaded some from guessing Basket.
10. Barbiecore Aesthetic
If you're sick of seeing Barbie pop up again and again in league quizzes, I'm with you. Hopefully, this is the last one we (and other leagues) will do, but you never really know.
A smooth difficulty gradient meant that most people got Dopamine Dressing (nothing works as well as the ol' 'alliterative' clue), Pink Drink had enough clues in the question, and Harajuku Barbie was a nice inversion of the old "Whose alter ego" chestnut for Nicki Minaj. Valentino played tough, since Louboutin was a valid guess too.
11. Busting Myths about Athlete Logos
An entire quad was born from discussions between the setters about logos that people think were based on players (I say discussion but you know it came about when one person corrected another).
AB de Villiers was MUCH harder than we expected, and the "hit shots to every bit of the ground's circumference" hint just wasn't enough to make people think of Mr 360.
It should come as no surprise that Harmon Killebrew Jr was the toughest question this week. Any non-Americans who answered this can officially make a living off of quizzing.
12. The Four Components of Fines Herbes
Can you imagine the excitement of the question setter, aimlessly browsing the web, coming across the French fines herbes and realising that there are exactly four of them? Four is a sacred number amongst Mimir setters and naturally, this resulted in a quad.
I'm not completely sure why Tarragon played as hard as it did, given how similar the name is to the word dragon, which appeared in the question. I can only assume tarragon isn't as widely known here in India.
Incidentally, I enjoyed this question so much I ended up asking it in a pub quiz I was conducting in Amsterdam last Tuesday. The question was simplified to "Which herb gets its name because the leaves resemble the scales of a dragon?". One of the teams came over to the bar to ask if they could put down the answer in Dutch, which I allowed because it only seemed fair. They did, and I ran Google Translate to check.
Well.
That's right, the Dutch word for tarragon is incidentally just dragon, although it is pronounced differently. Dutch folks pronounce the letter g like they're trying to assault someone, so it's closer to dra-khon.
🎯 Aishwarya Subramanian and Vinay both answered all 4 of these questions, a perfect musket!
13. A Primer on Game Theory
We're finally entering non-ascending territory, which in a week like this one means I can mostly just ignore the Correct rate.
Payoff Matrix needed better cluing in the question to get the players there, especially given that it was the first question of the quad. All other questions played evenly.
14. Drinking Establishments in the USA
I loved this quad because all 4 of them made for very interesting fundae.
Sharing two of my favourites:
What term is given to drinking establishments scattered around America that are characterised by cheap $2 drinks, minimal decor, local clientele, and typically cash-only service with minimal food? The use of such a term for disreputable places comes from the fact that they were typically in basements, requiring the customers to "[blank] below".
Answer: Dive Bar
What musical term is given to a bar that plays country music for the patrons' enjoyment while downing a drink, typically in the Southern US belt? The name comes from a variety of tack piano that is used to produce the music, typically out-of-tune and producing the titular sound.
Answer: Honky Tonk
🎯 Of course Santosh JS scored a musket in the pub quad. Of course.
15. Arabic Influence on Spanish
Another very interesting topic, and it played evenly aside from Trafalgar, which could've used more hints.
The first question is probably my favourite from the whole set:
Arabic influence on Spanish, seen in several loanwords and derivations, mostly comes from the Muslim rule in the Iberian Peninsula between 711 and 1492. What in-the-news Spanish last name is probably derived from the Arabic for "the cherry", or through a municipality named for cherry trees, as evident from the Arabic prefix for "the"? This makes it etymologically related to the word 'cerise'.
Answer: Alcaraz
And that’s all we’re doing in this week’s round-up. See you next week!
16 if you're playing FLQL or ZQL.
Swashbuckling as ever!