Tuesday, 28 December 2010

pr.probability - most general way to generate pairwise independent random variables?

I'm sure that Gil's answer is wise and that it is a good idea to look at Alon and Spencer's book. Here also is a quick summary of what is going on.



Suppose that $X_1,ldots,X_n$ are random variables, and suppose for simplicity that they take finitely many values. Suppose that you prescribe the distribution of each $X_i$, and suppose also that you want the random variables to be pairwise independent or $k$-wise independent. Then the constraints on the joint distribution are a finite list of equalities and inequalities. The solution set is a polytope whose dimension is fairly predictable, and the fully independent distribution is always in the interior of this polytope. If you are interested in $k$-wise independent distributions that are far from $k+1$-wise independent, then it can be difficult to determine what is achievable because the polytope is complicated. (The vertices are a particularly interesting and non-trivial class: $k$-wise independent distributions with small support. These are called "weighted orthogonal arrays".) However, if you're just intersted in examples, it is much easier to write down a small deviation of the fully independent distribution. The deviation just satisfies linear equations.



For example, suppose that $X,Y,Z$ are three unbiased Bernoulli random variables (coin flips) that take values $0$ and $1$. Then there are 8 probabilities $p_{ijk}$, one for each outcome $(X,Y,Z) = (i,j,k)$. Then you can set
$$p_{ijk} = frac18 + (-1)^{ijk}epsilon. qquadqquadqquad text{(1)}$$
to get a pairwise independent but not independent distribution. In this simple example, there is a 1-dimensional space of deviations and it is easy to compute how far you can vary the independent solution. (Up to $|epsilon| = frac18$.) In larger cases, the variations can be multidimensional and the polytope of deviations can be more complicated.



Addendum: If I have not made a mistake, all deviations for any finite list of discrete random variables are linear combinations of those of the form (1). More precisely, given discrete random variables $X_1,ldots,X_n$, let $f_i$ be some function of the value of $X_i$ which is 1 for one value, $-1$ for another value, and $0$ otherwise. Then you can make deviations proportional to $prod f_{i_j}$ as long as there are at least $k+1$ factors. It looks like all deviations are a linear combination of those of this form.

No comments:

Post a Comment