## 3B Remark 2: Guessing cards

Given a permutation $\pi \in S_n$, the associated shuffle of a deck of $n$ cards, numbered from $1$ up to $n$, sends the card in position $k$ of the deck to position $\pi(k)$. For example, the shuffles in Exercise 4 which always fix the top card, correspond to permutations $\pi$ such that $\pi(1) = 1$.

Let $P : S_n \rightarrow \mathbf{R}$ be a probability distribution. Imagine the following game: our opponent takes a fresh deck of cards (in which card number $k$ appears in position $k$) and chooses a permutation $\pi$ using the distribution $P$. He then shuffles the deck according to $\pi$, and turns over the cards one-by-one. We have to guess the number of each card, before it is shown.

A deterministic guessing strategy will, if we have seen cards $a_1, \ldots, a_{k-1}$, always makes a fixed guess $b \not \in \{a_1, \ldots, a_{k-1}\}$ for the card in position $k$. Of the $(n-k+1)!$ permutations $\pi$ satisfying $\pi(1) = a_1, \ldots, \pi(k-1) = a_{k-1}$, exactly $(n-k)!$ also satisfy $\pi(k) = b$. Therefore such strategies guess correctly on position $k$ in a proportion $1/(n-k+1)$ of shuffles. It follows that, if shuffles are selected uniformly at random, then the expected number of correct guesses is

$1/n + 1/(n-1) + \cdots + 1/3 + 1/2 + 1 = H_n$,

the $n$-th harmonic number. Note this is independent of our strategy; since we expect uniform shuffling to be unexploitable, this seems reasonable.

Let $f(\pi)$ be the number of correct guesses made if the cards are shuffled according to the permutation $\pi$. If $U$ is the uniform probability distribution on $S_n$, then the argument above shows that

$U(f) = \sum_{\pi \in S_n}U(\pi)f(\pi) = H_n$.

When we play the game above, our expected number of correct guesses is

$P(f) = \sum_{\pi \in S_n} P(\pi) f(\pi)$.

The difference $P(f) - U(f)$ measures the degree to which our strategy exploits the opponent’s departure from uniform shuffling. Let $A \subseteq S_n$ be the set of permutations which are preferred by $P$, i.e.

$A = \{ \pi \in S_n : P(\pi) > U(\pi) \}.$

Then

$P(f) - U(f) \le \sum_{\pi \in A} f(\pi) (P(\pi) - U(\pi))$.

Using the crude bound $f(\pi) \le n$ we get

$P(f) - U(f) \le n \sum_{\pi \in A} (P(\pi) - U(\pi)) = n (P(A) - U(A)).$

The variation distance $||P-U||$ is defined by

$||P-U|| = \mathrm{max}_{A \subseteq S_n} |P(A) - U(A)|.$

Hence $P(f) - U(f) \le n ||P - U||$, as claimed in the remark. So the variation distance gives a bound on the degree to which we can exploit the depature of $P$ from uniform randomness.

If we allow strategies that have a random component then the same bound holds, and the argument above goes through, provided we replace $f(\pi)$ with the expected number of correct guesses when the shuffle is $\pi$. It is worth noting that this expectation is taken with respect to the `internal’ randomness of our guessing algorithm. It has nothing to do with the probability distribution $P$. It would be interesting to know if there are cases where a random strategy can do better, on average, than a deterministic strategy.

One final example. Suppose that our opponent (wrongly) believes that a good shuffle should not fix any card. If $P$ is the probability distribution assigning equal probability to each derangement in $S_n$ then $||P-U|| \approx 1 - \mathrm{e}^{-1}$. It seems highly unlikely that any strategy can hope to average anywhere near to $(1-\mathrm{e}^{-1})n$ correct guesses against this opponent, so in this case at least, the bound we have proved is probably rather weak.