Tuesday, 12 October 2010

it.information-theory - Sanov's Theorem and Chernoff bound

You don't need to restrict yourself to finite $E$ -- you get it for any set $E$ of distributions. In hypothesis testing people often take $E$ to be some convex set of distributions.



[added after comment]:
I think the question is confusingly stated. The probability being calculated is not the probability of a hypothesis per se. Cover and Thomas simplify the notation a bit. Let $X^n = (X_1, \ldots, X_n)$ be drawn i.i.d. according to $Q$ on a finite set $\mathcal{X}$. Let $E$ be any set of distributions. Let $T_{X^n}$ be the empirical distribution of $X_1, \ldots, X_n$. Sanov's Theorem says:



$$\mathbb{P}\left(X^n : T_{X^n} \in E\right) \le (n+1)^{|\mathcal{X}|} \exp\left(-n \cdot \min_{P \in E} D(P \| Q)\right)$$
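Here $D(P \| Q)$ is the relative entropy (KL divergence); spelling it out for distributions on the finite set $\mathcal{X}$ (a recap of mine, with natural logarithms so that the bound above is written with $\exp$):

$$D(P \| Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}$$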



This bound holds for any $n$, but it is only optimal asymptotically as $n \to \infty$. That is, the exponent cannot be better than this divergence. For smaller $n$ you can get better bounds, especially if you can get rid of that polynomial factor. The looseness in Sanov's theorem as stated is mostly in this polynomial factor -- if $E$ is a singleton, inspecting the proof shows that you don't need that factor.
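Concretely (a sketch of the step I mean, following the method of types with natural logarithms): the probability of a single type class already comes without the polynomial factor, i.e. for any type $P$ with denominator $n$,

$$\mathbb{P}\left(T_{X^n} = P\right) \le \exp\left(-n\, D(P \| Q)\right),$$

and the $(n+1)^{|\mathcal{X}|}$ in Sanov's bound is just a union bound over the number of possible types.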



To take an example from coin tossing, suppose $Q$ corresponds to a coin with bias $1/2$ (a fair coin) and let $E = \{P : P(\text{heads}) \le 1/4\}$. This is the event "the sample mean of heads is at most $1/4$" (if I understand your question). Your complaint would seem to be that, as stated, Sanov makes you account for coins with bias much less than $1/4$, when in fact you can "get away" with considering only $P(\text{heads}) = 1/4$.
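To put numbers on this (a quick sketch of mine, not part of the original question; natural logarithms, $|\mathcal{X}| = 2$, and the minimum over $E$ is attained at $P(\text{heads}) = 1/4$, so the Sanov exponent is $D(1/4 \| 1/2)$), one can compare the exact tail probability with the Chernoff-style bound $\exp(-n D(1/4\|1/2))$ and the Sanov bound $(n+1)^2 \exp(-n D(1/4\|1/2))$:

    # Compare the exact probability P(#heads <= n/4) for a fair coin with
    # the Chernoff-style bound exp(-n*D) and the Sanov bound (n+1)^2 * exp(-n*D),
    # where D = D(1/4 || 1/2) in nats and |X| = 2.
    from math import log, exp, comb

    def kl_bernoulli(p, q):
        """Relative entropy D(Bern(p) || Bern(q)) in nats."""
        d = 0.0
        if p > 0:
            d += p * log(p / q)
        if p < 1:
            d += (1 - p) * log((1 - p) / (1 - q))
        return d

    D = kl_bernoulli(0.25, 0.5)  # ~0.1308 nats: the exponent in both bounds

    for n in (8, 20, 100, 400):
        k_max = n // 4                      # "fraction of heads <= 1/4"
        exact = sum(comb(n, k) for k in range(k_max + 1)) * 0.5 ** n
        chernoff = exp(-n * D)              # no polynomial factor
        sanov = (n + 1) ** 2 * exp(-n * D)  # with the (n+1)^{|X|} factor
        print(f"n={n:4d}  exact={exact:.3e}  chernoff={chernoff:.3e}  sanov={sanov:.3e}")

For small $n$ the Sanov bound is larger than $1$ (vacuous) while the Chernoff-style bound is already informative; as $n$ grows, all three decay with the same exponent $D(1/4\|1/2)$ per sample.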



So if I understand your question now, the answer is: Sanov's theorem is loose because a union is being taken over all distributions in $E$, but this looseness matters mostly for small $n$. The bound holds for all $n$, but for smaller $n$ you can get better bounds (e.g. Chernoff) by re-inspecting the proof of Sanov or through other methods. However, as $n \to \infty$ these bounds will not be better than Sanov's in terms of the exponent multiplying $n$.
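To make the comparison concrete for the coin example: writing $X_1, \ldots, X_n$ for the i.i.d. Bernoulli($1/2$) indicators of heads, the Chernoff(-Hoeffding) bound reads

$$\mathbb{P}\left(\frac{1}{n}\sum_{i=1}^{n} X_i \le \frac{1}{4}\right) \le \exp\left(-n\, D\!\left(\tfrac{1}{4} \,\middle\|\, \tfrac{1}{2}\right)\right),$$

which has exactly the Sanov exponent but without the $(n+1)^{|\mathcal{X}|}$ factor.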



Hope that answers it!
