DEq: The Math#

To define DEq, we will ask “For a given card \(c\), what is the average marginal win rate attributable to choosing the card relative to picking it into the trash”. The last part of that condition might seem a bit strange. Why not “compared to the next alternative”? The reason is that the alternative depends on the context in which the card was observed to be chosen, whereas when we are choosing between two cards, we want a consistent baseline that enables us to make that comparison. When cards were observed to be chosen highly, a greater cost in fresh packs was paid compared to cards that were chosen later, and the core idea of DEq is to account for that difference, by crediting that cost, as measured by ATA, back to the imputed win rate.

The base formula for DEq is as follows, which I will then explain and derive in detail for those who have the appetite:

\[ \text{DEq Base} = \text{GP %}(c) \cdot \left( \text{GP WR}(c) - \mu + 3\% \cdot(1 - (\text{ATA}(c) - 1) / 14 )^2 \right). \]

I then further apply some relatively minor adjustments which I will detail below and are provisional anyway. DEq Base is perfectly suitable on its own as a card quality metric.

Pick Equity#

Let

\[\mu = P[\text{W}]\]

be the theoretical mean game win rate for a given player cohort (e.g. me, you, or 17Lands “Top” players) for a metagame (e.g. a particular set on a particular day for a particular rank), and let

\[\mu_k = P[\text{W} | N_k]\]

be the probability of winning conditioned on a “null pick” \(N_k\), meaning, picking a card straight into the trash, or replacing it with a basic land, for one \(k\)-th pick in one pack (i.e. averaged over the three packs). Then let

\[ r_k = \mu - \mu_k \]

be the “pick equity” of a \(k\)-th pick, that is, the marginal win rate attributable to making that pick, on average. When you choose a card with a \(k\)-th pick, on average, you are giving up \(r_k\) of winrate in exchange for adding that particular card to your pool.

A proper estimate of \(r_k\) is a piece of analysis that I intend to do thoroughly down the road, but for now I am relying on estimates based on previous analysis and a bit of eye test on the final results. I’ve found that a quadratic fit with P1 being worth 3% gives good results. Note that DEq is most useful in pack one, where your early picks are generally more valuable.

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np

_, ax = plt.subplots()
x = np.arange(1, 15)
y = 0.03 * (1 - (x - 1) / 14) ** 2
ax.yaxis.set_major_formatter(mtick.PercentFormatter(1.0, decimals=1))
ax.plot(x, y)
ax.set_title("Model Pick Equity by Pick Num")
ax.set_ylabel("Pick Equity")
ax.set_xlabel("Pick Num")
ax.set_xticks(x)
plt.show()
_images/73c10b5c2dec2900bc1d7ca3eceac22d1bd13e36403d28f8b8837fa1cedd13f1.png

In practice, we apply this model to the observed ATA (average taken at), since that is the number we have available. While there is a bit of convexity, this shouldn’t trouble us too much. For a given card, in a given metagame, we identify this quantity as \(r_{\text{ATA}(c)}.\)

DEq Base#

We’re going to get technical because it will allow us to specify exactly the assumptions that create bias in the final model. The notation is a bit complicated, and could be improved, but sometimes that’s just the way it is. The final formula is relatively intuitive so you can skip to that. The essence of the below is that DEq is GP WR plus Pick Equity modified by GP %. We will have to define a bunch of events and rearrange some conditional probabilities to get there.

Recall that \(P\) represents a particular player/metagame probability model, and a conditional like \(P[X| A]\) effectively takes a weighted average of \(X\) restricted to the (theoretical) observations of \(A\) within that same model. \(P\) is “omniscient” in that it gives the true rate of outcomes all possible events. Finally, \(P\) is a measure on “pick events”, that is, observations of individual draft picks coupled with the resulting outcome.

Given a card \(c\), let \(C\) be the set of draft pick events \(\omega\) where \(c\) was selected.

Now when we define our own probability space, we can make the event space as detailed as we like, so we couple the elements \(\omega\) with all the information needed to construct counterfactual scenarios, and we will rely on \(P\) to average them correctly. So let \(\tilde{W}\) represent the counterfactual outcome under the null pick, and let

\[ q(c) = \frac{P[W(\omega) - \tilde{W}(\omega), \omega \in C]}{P[C]}. \]

This is the formal expression of our idea about marginal win rate. We properly ask, at the operative moment, “What happens if we don’t take the card instead?” So \(q(c)\) is what I call the “draft equity” of \(c\), and \(DEq\) will be the metric estimating \(q\).

It will be more convenient to express this using the null pick notation above, so we “split out” the counterfactual outcomes to new elements \(\omega\) to create events \(N_C\) representing the same scenario up to the pick event, which is then replaced with the null pick and followed to the counterfactual outcome. Then

\[ q(c) = P[W | C] - P[W | N_C]. \]

\(N_C\) may have zero measure under \(P\), but we can just stipulate that events are weighted according to \(C\) in the conditional measure and that should handle it.

Now the available metric that most nearly estimates \(P[W | C]\) is GP WR, “games-played win rate”, under the further condition that \(c\) is played in our deck. Let \(D = D(\omega)\) be the event that the card resulting from the specific pick event \(\omega\) is played in the deck, and write

\[\begin{split} \begin{align} P[W | C] &= P[DW | C] + P[D^c W | C] \\ &= P[ D | C] \cdot P[ W | C, D] + (1 - P[D]) \cdot P[ W | C, D^c],\\ \end{align} \end{split}\]

Now

\[\begin{split} \begin{align} q(c) = &P[D | C] \cdot (P[W | C, D] - P[W | N_C]) \\ &+ (1 - P[D | C]) \cdot (P[W | C, D^c] - P[W | N_C ]). \end{align} \end{split}\]

To break this down we need some assumptions, which we will write down.

  • Assumption 1: The quality (winrate) of \(\omega \in N_C\) is independent of the card \(c\), and only depends on the pick equity lost to the null pick. We should have good reason to believe this is not the case. We will look at a first attempt to adjust for the bias caused by this assumption in the next section.

  • Assumption 2: The quality of \(\omega \in C \cap D^c\) is independent of \(c\), and depends on the pick equity given up by not playing \(c\). Again, this shouldn’t be strictly true. We can examine the effect of this assumption by analyzing the as-picked win rate, which is available in the public data sets.

For fixed \(k\), let \(C_k \subset C\) be those k-th picks where \(c\) was chosen, and using assumption 1 we have

\[\begin{split} \begin{align} P[W | N_{C_k}] &= \frac{P[W, C_k| N_k]}{P[C_k| N_k]} \\ &= \frac{P[W | N_k] P[C_k | N_k ]}{P[C_k | N_k]} \\ &= P[W | N_k], \end{align} \end{split}\]

Therefore,

\[\begin{split} \begin{align} P[W | N_C] &= \frac{\sum_k P[C_k] \cdot P[ W | N_C, C_k]}{P[C]} \\ &= \frac{\sum_k P[C_k] \cdot P[ W | N_k]}{P[C]} \\ &= \frac{\sum_k P[C_k]\mu_k }{P[C]} \\ &\sim \mu_{\sum_k P[k C_k| C]} \\ &= \mu_{\text{ATA}(c)}, \end{align} \end{split}\]

Jensen’s inequality standing in for the \(\sim\), which should be de minimis and is an assumed curve anyway. By the same calculation, \(P[W | C, D^c] = \mu_{\text{ATA}(c)}\) given assumption 2 and the second term of \(q(c)\) is zero.

Now, by design, we have estimators \(\text{GP %}(c)\) for \(P[D | C]\), and \(\text{GP WR(c)}\) for \(P[W | C, D]\), along with \(\text{ATA}(c)\) for \(P[k C_k | C]\). There should be no problems there, except to note that in general 17Lands stats come with a particular distribution of player preference and metagame sample, and we should keep in mind how that variation affects the representation of our particular situation. To start with, I recommend taking the “Top” player sample and the narrowest practical range of dates with a decent sample size. We’ll look at one attempt to account for some of the remaining misrepresentation bias below.

Performing the substitutions, our formula is

\[\begin{split} \begin{align} \text{DEq Base}(c) &= P[D | C] \cdot (P[W | C, D] - P[W | N_C]) \\ &= \text{GP %}(c) \cdot \left( \text{GP WR}(c) - \mu_{\text{ATA}(c)}\right) \\ &= \text{GP %}(c) \cdot \left( \text{GP WR}(c) - \mu + 3\% \cdot(1 - (\text{ATA}(c) - 1) / 14 )^2 \right). \end{align} \end{split}\]

This is the base formula I use in practice, so I call this quantity “DEq Base”, and apply a few adjustments to provide the final estimate of DEq.

Bias Adjustment#

… WIP …

Metagame Adjustment#

In practice, the data available to us is drawn from a different metagame than the one we wish to apply it to. Most applicably, the data on 17Lands.com is drawn from some set of days (which we can control) prior to the current date, the date on which we are intending to make good draft picks. The metagame evolves predictably in certain ways as outperforming strategies tend to regress to the mean in win rate over time. I won’t get into all the details of how I arrived at these parameters, but I do a correction to DEq based on some assumptions about metagame regression and how that affects the DEq of individual cards.

In particular, assume that the marginal win rate of each archetype (as defined above) tends to degrade by some factor \(\lambda\) each day, and that the number of games played in the overall data set degrades by a factor of \(\gamma\) each day. Let \(T\) be the number of days that the data set spans. Given these three parameters and the observed marginal win rate for the archetype, we can estimate how much the win rate for the archetype has degraded from the observed data to the present day. In particular, if \(\nu_0\) represents the initial marginal win rate of the strategy, and \(\bar{\nu}\) represents the observed value, we assume

\[\bar{\nu} = \frac{\sum_{j=0}^{T-1} { \gamma ^ j\lambda ^ j\nu_0 }}{\sum_{j=0}^{T-1}{\gamma ^j}},\]

and we can use the analytical value of the sum,

\[ \sum_{j=0}^{T-1} x^j = \frac{1 - x^T}{1 - x}, \]

to solve for \(v_0\), obtaining

\[ \nu_0 = \bar{\nu} \cdot \frac{1 - \gamma}{1 - \gamma^T} \cdot \frac{1 - ( \gamma\lambda)^T}{1 - \gamma\lambda}. \]

Then our assumption for the marginal win rate lost by the archetype relative to the average measured value is

\[ \bar{\nu} - \lambda ^ T \cdot \nu_0 = \bar{\nu} \cdot \left(1 - \lambda ^ T \cdot \frac{1 - \gamma}{1 - \gamma^T} \cdot \frac{1 - ( \gamma\lambda)^T}{1 - \gamma\lambda}\right).\]

Inherent in the modeling of DEq is an assumption of trade-off between pick order and win rate. As a card is picked higher, it will shed some win rate due to the cost of its higher placement. In an otherwise static environment, we should expect the effect to be neutral on DEq, but in a changing environment that is not the case, as the value of drafting a card is affected by the quality of the deck you expect to play it in. Based on previous observations, we expect a card to lose about 60% of the DEq of the card attributable to the archetype win rate, which we model as the archetype marginal win rate less the bias adjustment, since that is already subtracted off. In reality, a higher value should be used for more linear cards where the additive value of the card itself is dependent on the potency of a typical deck in the archtype. For example, Harvestrite Host was heavily overrated by DEq midformat in BLB (to my personal detriment). So the final archetype bias adjustment is given by

\[\begin{split} \begin{align} \text{Meta Adj}(c) = &0.6 \cdot \left(\text{GP WR}(\text{color}(c)) - \mu + \text{Bias Adj}(c) \right) \\ &\cdot \left(\lambda ^ T \cdot \frac{1 - \gamma}{1 - \gamma^T} \cdot \frac{1 - ( \gamma\lambda)^T}{1 - \gamma\lambda}- 1\right), \end{align}\end{split}\]

and we finally have

\[ \text{DEq}(c) = \text{DEq Base}(c) + \text{GP %}(c) \cdot(\text{Bias Adj}(c) + \text{Meta Adj}(c)).\]

You can also project \(t\) days into the future by using a factor of \((1 - \lambda ^ {T + t})\) in the formula replacing \((1 - \lambda^T)\). In practice I use \(\lambda = 0.95\) and \(\gamma = 0.95\). As a final note, in our Spells analysis of DEq, we are generally applying the metric in a backward-looking fashion and therefore do not use the metagame adjustment.

Final Thoughts#

DEq is a work in progress, and as a result the formulas above will change over time. The core of DEq is the adjustment of observed card win rates (GP WR or as-picked WR) based on a notion of pick equity, which will necessarily be a function of ATA. Any metric I call DEq will always have that core formulation, but the estimate of pick equity, the bias adjustment, and metagame adjustment, as well as additional factors, could all change. Additionally, it will always be a constraint that DEq be derivable from daily data accessible on the web, however that availability evolves (hopefully for the better). And of course, the point of the whole thing is to provide a useful ranking of card quality to use for draft picks. I hope this write up gives you greater confidence that DEq is suitable for that purpose.

Changelog#

The history of this document is available on my github.

  • 2025-01-16 Initial publication

  • 2025-01-17 Corrected assumptions text and refactored explanation, added formula

  • 2025-01-22 Replaced “misrep” chart with entropy loss

  • 2025-08-21 WIP Updating Docs