There are several ways to assign probability measures to the set of natural numbers. Consider the probability measure $P_s$ on the positive integers which assigns "probability" $n^{-s}/\zeta(s)$ to the integer $n$. (Here $s$ is a constant real number greater than 1.)
Then under this measure, being a multiple of $a$ and being a multiple of $b$ are independent events, in the probabilistic sense, if $a$ and $b$ have no common factor greater than 1. You can show this starting from the fact that the measure assigned to the set of multiples of $k$, for some positive integer $k$, is
$$\frac{1}{\zeta(s)} \sum_{n=1}^{\infty} \frac{1}{(kn)^s} = \frac{1}{\zeta(s)} \cdot \frac{1}{k^s} \cdot \zeta(s) = \frac{1}{k^s}.$$
That is, the probability that a random positive integer is divisible by $k$ is $k^{-s}$. Of course you really want all integers to be equally likely, which should correspond to $s=1$; but $\zeta(s)$ diverges there, so the best you can do is let $s$ approach 1 from above.
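As a rough numerical sanity check of the $k^{-s}$ claim, here is a minimal Python sketch that approximates $P_s$ by truncating the infinite sums. The function names and the cutoff `terms` are my own choices for illustration, not anything from Rota, and because of the truncation the result only agrees with $k^{-s}$ approximately.

```python
def zeta_partial(s, terms=200_000):
    """Truncated sum 1^-s + 2^-s + ... + terms^-s (approximates zeta(s) for s > 1)."""
    return sum(n ** -s for n in range(1, terms + 1))


def prob_multiple_of(k, s, terms=200_000):
    """Approximate P_s(multiples of k), keeping only integers up to `terms`."""
    mass = sum((k * n) ** -s for n in range(1, terms // k + 1))
    return mass / zeta_partial(s, terms)


if __name__ == "__main__":
    s = 2.0
    for k in (2, 3, 5):
        # The two printed values should be close, not identical.
        print(k, prob_multiple_of(k, s), k ** -s)
```

The truncated sums converge slowly when $s$ is close to 1, so this check is only practical for $s$ comfortably larger than 1.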
(I learned this from Gian-Carlo Rota, Combinatorial Snapshots. Link goes to SpringerLink; sorry if you don't have access.)
Under "suitable conditions" (I don't know what they are, because Rota doesn't say), the density of a set of natural numbers $A$ is the limit $\lim_{s \to 1^+} P_s(A)$.
In particular it might be reasonable to define correlation between sets of natural numbers in the same way. Let $A$ and $B$ be two sets of natural numbers. Let $X$ and $Y$ be the indicator random variables of the sets $A$ and $B$ in the measure $P_s$. The Pearson correlation coefficient between $X$ and $Y$ is
$$\frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y}$$
where $E$ is expectation and $\sigma$ is standard deviation. Of course this can be simplified in the case where $X$ and $Y$ are indicators (and thus only take the values 0 or 1) -- in particular it simplifies to
$$\frac{P_s(A \cap B) - P_s(A) P_s(B)}{\sqrt{P_s(A) P_s(B) (1 - P_s(A)) (1 - P_s(B))}}$$
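To spell out that simplification (a sketch using only standard facts about indicator variables): since $X$ and $Y$ take only the values 0 and 1, $XY$ is the indicator of $A \cap B$ and $X^2 = X$, so

$$E(XY) = P_s(A \cap B), \qquad E(X) = P_s(A), \qquad \sigma_X^2 = E(X^2) - E(X)^2 = P_s(A)\,(1 - P_s(A)),$$

and likewise for $Y$ and $B$. Substituting these into the Pearson formula gives the expression above.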
We could then define the correlation between $A$ and $B$ to be the limit of this as $s \to 1^+$.
In the case where $A$ is the event "divisible by 2", for example, and $B$ is the event "divisible by 3", then $A \cap B$ is the event "divisible by 6". So $P_s(A \cap B) = 6^{-s}$, $P_s(A) = 2^{-s}$, and $P_s(B) = 3^{-s}$, so the numerator here is 0 and so the correlation is zero.
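The same check works for any pair of divisors. Below is a small Python sketch, assuming the closed form $P_s(\text{divisible by } k) = k^{-s}$ derived earlier; the helper names `lcm` and `divisibility_independent` and the parameter names `a` and `b` are mine. It relies on the fact that "divisible by $a$ and by $b$" is the same event as "divisible by $\mathrm{lcm}(a,b)$".

```python
from math import gcd, isclose


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def divisibility_independent(a, b, s):
    """True if 'divisible by a' and 'divisible by b' are independent under P_s.

    Uses P_s(divisible by k) = k**-s, so the joint probability is lcm(a, b)**-s.
    """
    p_both = lcm(a, b) ** -s
    return isclose(p_both, a ** -s * b ** -s, rel_tol=1e-9)


if __name__ == "__main__":
    print(divisibility_independent(2, 3, 1.5))  # coprime pair
    print(divisibility_independent(4, 6, 1.5))  # shares the factor 2
```

Since $\mathrm{lcm}(a,b) = ab$ exactly when $\gcd(a,b) = 1$, the numerator vanishes for coprime pairs and for no others.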
But in the case where $A$ is the event "divisible by 4" and $B$ is the event "divisible by 6", then $A \cap B$ is the event "divisible by 12". So the correlation with respect to $P_s$ is
$$\frac{12^{-s} - 24^{-s}}{\sqrt{4^{-s} \, 6^{-s} (1 - 4^{-s}) (1 - 6^{-s})}}$$
which has the limit $1/\sqrt{15}$ as $s \to 1^+$; more generally the correlation between being divisible by $a$ and being divisible by $b$ is
$$\frac{ab - \mathrm{lcm}(a,b)}{\mathrm{lcm}(a,b) \sqrt{(a-1)(b-1)}}$$
and this may or may not be what you want.
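For anyone who wants to play with this, here is a minimal Python sketch comparing the correlation under $P_s$ for $s$ just above 1 with the closed-form limit. The function names `corr_s` and `corr_limit` are my own, both assume $a, b \ge 2$ (otherwise a standard deviation is zero), and both use $P_s(\text{divisible by } k) = k^{-s}$.

```python
from math import gcd, sqrt


def corr_s(a, b, s):
    """Correlation of 'divisible by a' and 'divisible by b' under P_s (a, b >= 2)."""
    l = a * b // gcd(a, b)  # lcm(a, b): the joint event is 'divisible by l'
    pa, pb, pab = a ** -s, b ** -s, l ** -s
    return (pab - pa * pb) / sqrt(pa * pb * (1 - pa) * (1 - pb))


def corr_limit(a, b):
    """Closed-form limit of corr_s as s -> 1+ (a, b >= 2)."""
    l = a * b // gcd(a, b)
    return (a * b - l) / (l * sqrt((a - 1) * (b - 1)))


if __name__ == "__main__":
    # The three printed values should nearly agree.
    print(corr_s(4, 6, 1.000001), corr_limit(4, 6), 1 / sqrt(15))
```

Evaluating `corr_s` at values of `s` approaching 1 is just a numerical stand-in for the limit; it does not prove anything about convergence.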