Guidelines for Writing Good Judge Questions

Questions about rules and policy are a staple of the judge program. But what really goes into writing them? Here are some good practices to keep in mind. These are just general guidelines; they apply to both casual practice questions and serious test questions.

Use player names that start with "A" for the active player and "N" for the nonactive player. Knowing whose turn it is isn't relevant for every question, but it often is, and being consistent with this naming convention helps people to remember for those questions where it does matter. Do keep in mind that not everyone is familiar with this convention, so it's a good idea to state it explicitly if the questions are intended for players or less experienced judge candidates. (This convention can be generalized: "J" names for a judge, "S" names for a spectator, and so on.)

Don't use player names that are also card names. Yes, I know it's "cute" to call your players "Ajani" and "Nissa", but the people reading the question won't think so. Doing this adds cognitive load for someone trying to parse and answer the question. The first time they see the "player" name, they'll add it to their mental model of the battlefield, only to have to remove it when they realize it's not actually a card. Subsequently, every time they see that player name, it'll cause a jolt of confusion, since they recognize it as a card name but it's not a part of their mental model. This is a fairly minor effect of course, but it has zero benefit and nonzero downside. Real players at an event are not named Ajani, so it's not something that judges need to get used to.

Ask about information people need to know. Not every question you can ask that's tangentially related to Magic is actually helpful. For example, if you're writing practice questions about layers, the question

"What rule tells us that an object receives a new timestamp when it enters a zone?"

is unhelpful. Should judges know that objects get a timestamp when they enter a zone? Yes, absolutely. Do they need to know the rule number that says that? No, they do not. (It's 613.7d, and I'm going to forget that as soon as I move on to the next paragraph.)

This is an especially common problem when the questions directly follow having taught that content, as it's quite tempting to just ask questions about the teaching method rather than the content itself. For example, while teaching the steps to cast a spell, someone might use the mnemonic "All Crazy Teens Have Tried Magic Pills". Then it's easy to write a question where the four options are different mnemonics and the judge is asked which one was used in the lesson. This is bad design. If someone uses a different method to remember the steps to casting a spell, this question is going to punish them for not knowing completely irrelevant information, despite the fact that they do know how to cast a spell.

One exception to this is if you're trying to test someone's ability to search the official documents. In that case, asking which rule says that an object receives a new timestamp when it enters a zone isn't testing whether the test-taker has it memorized, but testing whether they can efficiently search the CR for that rule. This is a reasonable thing to test for.

Don't use interactions that judges might have specifically memorized. There are some card interactions that get bandied about as common examples of a certain area of the rules. Lightning Bolt + Tarmogoyf is a great example of the order of state-based actions and spell resolution timing, but it's used so frequently that a lot of judges will simply have memorized the answer of "Tarmogoyf lives" and may not understand why. This makes it a bad evaluation question, since a correct answer won't necessarily indicate understanding of the material. Instead, ask about the same interaction using different cards. "A 1/1 Lord of Extinction is targeted by Instill Infection. Does it die?" is pretty much the exact same question, but doesn't allow judges to skip the "figure out what's actually being asked" step. (Of course some judges might recognize that this is the same as the Lightning Bolt + Tarmogoyf interaction and jump to that answer, but they've still demonstrated the ability to consider the scenario and identify that it's a similar interaction.)

This is also a common problem when the question comes after having taught that topic. The teaching will probably have involved examples, and it's quick and easy to ask about those examples in the following questions. But this encourages people to just memorize the examples that were given earlier rather than understand the concepts. Don't reuse questions.

Ask open-ended questions. Don't guide people to an answer. Now, how exactly you ask the question is going to depend on the usage. If it's a test, you probably need to provide multiple-choice answers. If it's a practice question that isn't getting scored, you have more freedom. But regardless, you should avoid artificially narrowing down the potential answers when possible. If the question is about the amount of damage dealt, just ask "how much damage is dealt" rather than "Is NAP dealt 3 or 5 damage?". (Even if it's a test, you can just have a number input rather than a few select choices.)

If possible, you should also ask for an explanation of why the answer is what it is. On a test this usually won't be feasible, but for a study group or mentoring session, it's great. First, this helps make sure the mentee actually understands the underlying rules rather than just having made a lucky guess. Second, it tests their ability to explain the interaction to a curious player. Even if you know the rules, if you aren't able to explain them clearly to a player who wants to know, you're going to have more trouble with your calls.

Be clear and concise, but avoid being too technical. The exact tone you want to strike in your questions will depend on your audience. Regardless of the audience, however, it's important that all relevant information is provided without bogging down the question with extra verbiage or overly technical wording. Consider the following question:

"Palace Jailer's trigger gets stifled. Will the targeted creature be exiled?"

This question is missing a lot of important information. Who was the monarch before Palace Jailer was played? Which trigger was stifled? Was it stifled before or after the other trigger resolved? What does "stifled" even mean to someone who doesn't play Legacy? Not all of these pieces of information are needed in order to answer correctly, but many of them are. And even the ones that aren't needed still seem like they could be, so leaving them out gives the answerer a strong hint that they wouldn't get in a real-world situation. OK, let's make this more precise:

"NAP is currently the monarch and controls Grizzly Bears and no other creatures. In AP's precombat main phase, they cast Palace Jailer. After it enters the battlefield and after state-based actions have been checked, AP chooses to target Grizzly Bears with Palace Jailer's second trigger of 'When Palace Jailer enters the battlefield, exile target creature an opponent controls until an opponent becomes the monarch.' AP chooses to have the order of the triggers on the stack be 'When Palace Jailer enters the battlefield, you become the monarch.' on top and 'When Palace Jailer enters the battlefield, exile target creature an opponent controls until an opponent becomes the monarch.' second from the top. After this, AP passes priority. When NAP receives priority, NAP casts Stifle targeting Palace Jailer's first trigger and then passes priority. After this, both players continue to pass priority without taking any actions until all spells and abilities have resolved and state-based actions have been checked after the last resolution. At that point, is Grizzly Bears in exile or on the battlefield?"

Well that's just miserable. Nobody wants to read that. I feel disgusted to have even written it. A good middle ground for a casual environment would be something like this:

"I'm the monarch and I control Grizzly Bears. You cast Palace Jailer and target Grizzly Bears. I counter the trigger that's trying to make you the monarch. After everything has resolved, is my Grizzly Bears in exile?"

While a reasonable wording for a more technical environment might be more like this:

"Nancy is the monarch and controls Grizzly Bears. Arnold casts Palace Jailer and puts its first trigger onto the stack above the second trigger, which is targeting Grizzly Bears. Nancy casts Stifle targeting Palace Jailer's first trigger. After all spells and abilities have resolved and state-based actions have been checked, is Grizzly Bears in exile?"

Regardless of the environment though, use technical terms rather than slang. Even if everyone understands what you mean when you say that Delver of Secrets "flips", it's best to help people get into the habit of using technically correct terms for those situations where it does matter.

Keep questions as simple as is appropriate for the difficulty level. If a vanilla creature is all that's needed, use a vanilla or french vanilla creature rather than a creature with long abilities that aren't relevant to the question. If there's a Standard-legal or otherwise well-known card that serves your purpose, use that rather than an older, more obscure version of it. It's OK to vary things a little bit (which helps not give away information when you suddenly use a creature with an ability), but having to hold what an unfamiliar card does in your head is an additional cognitive load that we should avoid imposing when possible. (The guideline above about avoiding memorized interactions is an exception to this, for the case where the full interaction might be memorized. Having memorized what a card does isn't a bad thing; it's only when that memorization extends to whole interactions that it becomes a problem.) If your goal is to evaluate a more advanced judge's ability to parse a complicated board state, then maybe adding extraneous cards or text in some questions is appropriate. But in general, don't add unnecessary distractions.

This doesn't mean that you can't make the question more complex in order to test more knowledge. If you have a question involving several cards that requires knowing about both layers and replacement effects, you don't have to split that apart into two separate questions, since all of its components are still doing something meaningful. Just remember that more complicated questions should be saved for more advanced judges, and make sure every card is there for a reason.

Try to avoid having the players do something strategically poor. Our job as judges is just to evaluate the legality of a play, not whether it was a good idea. However, it's easier to intuitively understand what's being asked when the sequence of events makes sense. If the question has a player killing their own creature for no reason, it can make it harder to grok the scenario. If the interaction in question is hard to make happen with players who make good decisions, then it's OK to make an exception, but that may also be a sign that the interaction isn't all that relevant to tournament play and doesn't need to be tested for.

Make sure the answer that's marked as "correct" isn't just "the answer you were thinking of". Nothing is more demoralizing when answering a question than giving an answer that's correct but being told it's wrong simply because the person writing the question hadn't considered it. Similarly, make sure your answers aren't so vague that even someone with a good understanding of the topic won't know which one is intended to be the "right answer". Consider this question:

"At a Competitive Rules Enforcement Level tournament, a player accidentally shuffles their hand into their library. What is the appropriate response?"

With the following multiple choice options:
- Have the opponent pick that many cards from the library and that becomes the player's new hand.
- Reconstruct the hand as best as you can from the information available to you.
- Issue the player a game loss.
- Ask the players to wait and go ask another judge for assistance.

The world's best judge couldn't pick one "correct" answer out of those. All four answers are potentially correct, depending on details of the scenario that weren't specified in the question. Or how about this one:

"What is the most important skill for a level 3 judge to possess?"
- Investigation skills.
- Teamwork, diplomacy, and maturity.
- Logistics knowledge.
- The ability to head judge a large event.
- Interpersonal skills.

This one is even worse. Firstly, "most important" isn't a factual statement. It's just a matter of opinion, and different judges are going to have different opinions about the one that's the most important. Secondly, these answers overlap. Answer 2 is a subset of answer 5; being good at "interpersonal skills" is strictly better than being good at only "teamwork, diplomacy, and maturity", since "interpersonal skills" covers more areas. Similarly, "the ability to head judge a large event" requires all of the other skills. So even if "teamwork, diplomacy, and maturity" were unambiguously the most important skill out of the eight L3 qualities, it would be impossible to tell whether answer #2, #4, or #5 would be the one marked as "correct".

Questions like these leave the reader bewildered, because rather than being able to focus on figuring out the right answer, they're left trying to guess what the writer was thinking when they wrote the question. This happens most often with multiple-choice, but also applies to freeform questions. Just because an answer isn't what you were expecting doesn't make it wrong.

Solicit feedback. Everyone can make mistakes, and question-writing is no different. Ask another judge to look over your questions before you're done. If they're a judge who knows the content but still has trouble picking out the "correct" answer, that's a sign that your questions are probably too vague. If they have trouble parsing what's going on in the question, that might mean that it's too complicated. Listen to the feedback you get and act on it accordingly.

There are many other good practices that are more specific to the exact type of question and where it's being used (for example, trick questions and workshop questions), but these are a good starting point. Just remember that the goal of a question is to evaluate how well a judge would handle a similar scenario on the floor of an event, not to evaluate some arbitrary "test-taking ability". Of course it'll never be exactly the same, but try to provide the same sort of information they'll have available to them in a real call, and don't try to coach them or trick them.