3D Exercise 13: Shuffling by swaps

Suppose that we shuffle a pack of n cards by choosing uniformly at random two numbers from \{1,2,\ldots,n\}. If the numbers are the same, we do nothing; otherwise we swap the cards in the indicated positions. The corresponding probability distribution on the symmetric group S_n is defined by P(1_{S_n}) = 1/n and P(t) = 2/n^2 if t is a transposition. If we perform the shuffle k times, then the probability that the cards are permuted according to the permutation \sigma \in S_n is P^{\star k}(\sigma). Here P^{\star k} is the k-th convolution of P, as defined by Q^{\star 1} = Q and

P^{\star k}(\sigma) = \sum_{\pi \in S_n} P^{\star (k-1)}(\sigma\pi^{-1})  P(\pi)

for each k \ge 2.

The first part of Exercise 13 in Diaconis’ book outlines a proof that if b > 0 and k = \frac{n}{2} \log n - bn where b > 0 then the total variation distance between P^{\star k} and the uniform distribution U is non-negligible. The proof depends on looking at the number of fixed points F of a permutation. Since F-1 is the character of an irreducible representation of S_n, it is possible to compute its expected value and variance under P^{\star k} and U using representation theory.

For a full solution to the exercise, see here. In outline, one shows that

\mathbf{E}_{P^{\star k}}(F)  = 1 + (n-1)\Bigl( 1 - \frac{2}{n} \Bigr)^k

and that

\mathbf{Var}_{P^{\star k}}(F) = 1 + (n-1)\Bigl(1-\frac{2}{n}\Bigr)^k   - \frac{(n+1)(n-2)}{2} \Bigl(1-\frac{2}{n}\Bigr)^{2k}  \\  + \frac{(n-1)(n-2)}{2} \Bigl(1-\frac{4}{n}\Bigr)^k.

It follows that if k = \frac{n}{2} \log n - bn where b > 0, then \mathbf{E}_{P^{\star k}}(F) \rightarrow 1 + \mathrm{e}^{2b} and \mathbf{Var}_{P^{\star k}}(F) \rightarrow 1 + \mathrm{e}^{2b} as n \rightarrow \infty. One can then use Chebychev’s inequality to show that, when k and n are in this relationship, the number of fixed points of a permutation selected according to P^{\star k} is likely to be rather large: of order \mathrm{e}^{2b}. Under the uniform distribution U the number of fixed points of a permutation is approximately distributed according to the Poisson distribution with parameter 1, so in particular, permutations with many fixed points are very rare. This gives a bound on the variation distance ||P^{\star k}  - U||.

More precisely, if M = \mathrm{e}^{2b}/2 then

||P^{\star k} - U|| \ge \mathbf{P}_{P^{\star k}}(F > M) - \mathbf{P}_{U}(F > M) \ge 1 - \frac{6}{\mathrm{e}^{2b}-2}.

This shows that \frac{n}{2} \log n - bn steps do not suffice to give a good shuffle. Therefore the immediately preceding result in the book, that \frac{n}{2} \log n + bn steps are sufficient to make ||P^{\star k} - U|| small, is sharp.

One interesting feature is that if the probability of choosing the identity is reduced from 1/n then it can take more steps to make ||P^{\star k} - U|| small. The reason is that if after k steps there is a good chance that the identity has never been chosen, then the parity of the shuffle must be even if k is even, and odd if k is odd. In these cases the bound on the variation distance given by parity is stronger than the bound coming from fixed points.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: