You don't need to restrict yourself to finite $E$ -- you get it for any set $E$ of distributions. In hypothesis testing people often take $E$ to be some convex set of distributions.
[added after comment]:
I think the question is confusingly stated. The probability being calculated is not the probability of a hypothesis per se. Cover and Thomas simplify the notation a bit. Let $X^n = (X_1, \ldots, X_n)$ be drawn i.i.d. according to $Q$ on a finite set $\mathcal{X}$. Let $E$ be any set of distributions. Let $T_{X^n}$ be the empirical distribution of $X_1, \ldots, X_n$. Sanov's Theorem says:
$$\mathbb{P}\left(X^n : T_{X^n} \in E\right) \le (n+1)^{|\mathcal{X}|} \exp\left(-n \cdot \min_{P \in E} D(P \| Q)\right)$$
This bound holds for any $n$, but is only optimal asymptotically as $n \to \infty$. That is, the exponent won't be better than this divergence. For smaller $n$ you can get better bounds, especially if you can get rid of that polynomial factor. The looseness in Sanov's theorem as stated is mostly in this polynomial factor -- if $E$ is a singleton, then inspecting the proof shows you don't need that factor.
To take an example from coin tossing, suppose $Q$ corresponds to a coin with bias $1/2$ (a fair coin) and let $E = \{P : P(\text{heads}) \le 1/4\}$. This is of the form "sample mean is less than $1/4$" (if I understand your question). Your complaint would seem to be that, as stated, Sanov makes you account for coins with bias much less than $1/4$, when in fact you can "get away" with only considering $P(\text{heads}) = 1/4$.
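As a quick numerical sanity check (my own illustration, not part of the original answer), here is a minimal Python sketch of this coin example: it computes $D(1/4 \| 1/2)$ in nats, Sanov's bound with $|\mathcal{X}| = 2$, and the exact binomial tail probability. The helper names are just illustrative.

```python
import math

def kl(p, q):
    # KL divergence D(Ber(p) || Ber(q)) in nats.
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def sanov_bound(n, p_star=0.25, q=0.5, alphabet_size=2):
    # Sanov: (n+1)^|X| * exp(-n * min_{P in E} D(P || Q)).
    # For E = {P : P(heads) <= 1/4} and a fair coin, the minimizer
    # is the boundary point P(heads) = 1/4.
    return (n + 1) ** alphabet_size * math.exp(-n * kl(p_star, q))

def exact_tail(n, q=0.5):
    # Exact P(sample mean <= 1/4) = P(Binomial(n, q) <= floor(n/4)).
    k_max = n // 4
    return sum(math.comb(n, k) * q ** k * (1 - q) ** (n - k)
               for k in range(k_max + 1))

for n in (10, 50, 100, 500):
    print(f"n={n}: exact={exact_tail(n):.3e}, sanov={sanov_bound(n):.3e}")
```

For $n = 10$ the bound is vacuous (about $32.7 > 1$); by $n = 500$ it is tiny but still orders of magnitude above the exact tail, largely because of the $(n+1)^2$ factor.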
So if I understand your question now, the answer is: Sanov's theorem is loose because a union is being taken over all distributions in $E$, but this looseness mostly matters for small $n$. The bound holds for all $n$, but for smaller $n$ you can get better bounds (e.g. Chernoff) by re-inspecting the proof of Sanov or through other methods. However, as $n \to \infty$ these bounds will not be better than Sanov's in terms of the exponent multiplying $-n$.
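To make the Chernoff comparison concrete (again my own sketch, with the same caveats as above): for a Bernoulli mean, the Chernoff-Hoeffding bound $\mathbb{P}(\bar{X}_n \le a) \le \exp(-n\,D(a \| q))$ for $a < q$ carries exactly Sanov's exponent but no polynomial factor.

```python
import math

def kl(p, q):
    # KL divergence D(Ber(p) || Ber(q)) in nats.
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def chernoff_bound(n, a=0.25, q=0.5):
    # Chernoff-Hoeffding: P(sample mean <= a) <= exp(-n * D(a || q)) for a < q.
    # Same exponent as Sanov's bound, but no (n+1)^|X| factor.
    return math.exp(-n * kl(a, q))

for n in (10, 50, 100, 500):
    print(f"n={n}: chernoff={chernoff_bound(n):.3e}")
```

At $n = 10$ this gives about $0.27$, where the Sanov bound above is already vacuous; the two decay at the same exponential rate, so the advantage washes out in the exponent as $n \to \infty$.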
Hope that answers it!