Wednesday 6 June 2012

st.statistics - Population Spearman Rank Correlation Coefficient

Let p(x,y) be the joint probability density function of the random variables X and Y, and let P_x(x) and P_y(y) be the respective marginal cumulative distribution functions. The key observation is that the normalized rank of an observation x_i (i.e., its rank divided by the number of observations, R(x_i)/n) is the empirical distribution function evaluated at x_i, so for large n it behaves like a sample of the random variable P_x(X). Thus, it is not hard to convince oneself that the statistic:
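As a quick sanity check of the key observation, here is a minimal sketch that compares R(x_i)/n with the true CDF value P_x(x_i). The Exp(1) marginal, the sample size, and the function names are my own assumptions for illustration and are not part of the argument above.

```python
import math
import random


def normalized_ranks(values):
    """Return R(x_i)/n for each observation (ranks start at 1, no ties assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / n for r in ranks]


def max_rank_cdf_gap(n=2000, seed=0):
    """Largest gap between R(x_i)/n and P_x(x_i) for an assumed Exp(1) sample."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]
    cdf = [1.0 - math.exp(-x) for x in xs]  # true CDF of Exp(1)
    return max(abs(u - p) for u, p in zip(normalized_ranks(xs), cdf))


if __name__ == "__main__":
    # The gap should shrink roughly like 1/sqrt(n) as n grows.
    print(max_rank_cdf_gap())
```

The gap is essentially the Kolmogorov-Smirnov distance between the empirical and true distribution functions, which is why it is expected to shrink as the sample grows.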



Rho = 1 - 6 (P_x(X) - P_y(Y))^2 is an estimator of the Spearman rank correlation, and its population mean, the population Spearman rank correlation coefficient, is given by:



rho = 1 - 6 int int (P_x(x) - P_y(y))^2 p(x,y) dx dy
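The integral can be checked numerically in a case where the answer is known. The sketch below assumes a standard bivariate normal pair with Pearson correlation r (my choice of example, not something used above), for which the population Spearman coefficient has the classical closed form (6/pi) asin(r/2). It estimates the integral by Monte Carlo using the exact normal CDF; the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def population_rho_mc(r, n=100_000, seed=0):
    """Monte Carlo estimate of 1 - 6 E[(P_x(X) - P_y(Y))^2] for a bivariate normal."""
    rng = random.Random(seed)
    phi = NormalDist().cdf           # marginal CDF of a standard normal
    s = math.sqrt(1.0 - r * r)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = r * x + s * rng.gauss(0.0, 1.0)  # Y has correlation r with X
        total += (phi(x) - phi(y)) ** 2
    return 1.0 - 6.0 * total / n


def gaussian_spearman(r):
    """Closed-form population Spearman coefficient of a bivariate normal."""
    return 6.0 / math.pi * math.asin(r / 2.0)


if __name__ == "__main__":
    r = 0.5
    print(population_rho_mc(r), gaussian_spearman(r))
```

The two printed numbers should agree up to Monte Carlo noise, which is of order 1/sqrt(n).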



The following article performs the same calculation for a weighted version of Spearman's correlation coefficient:



http://www.ine.pt/revstat/pdf/rs060301.pdf



I think that the sample Spearman coefficient is unbiased because of the normalization by n*(n-1)*(n+1), but I still don't know how to prove that.
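For reference, the factor n*(n-1)*(n+1) is the denominator of the usual no-ties formula for the sample coefficient, 1 - 6 sum(d_i^2) / (n (n-1) (n+1)), where d_i is the difference between the two ranks of observation i. Here is a minimal sketch of that formula; the function names are mine, and the code says nothing either way about unbiasedness.

```python
def ranks(values):
    """Ranks starting at 1, assuming there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def sample_spearman(xs, ys):
    """Sample Spearman coefficient via the no-ties rank-difference formula."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n - 1) * (n + 1))


if __name__ == "__main__":
    print(sample_spearman([1, 2, 3, 4], [10, 20, 30, 40]))  # monotone increasing
    print(sample_spearman([1, 2, 3, 4], [40, 30, 20, 10]))  # monotone decreasing
```

The normalization is what makes the coefficient equal +1 for a perfectly increasing relation and -1 for a perfectly decreasing one.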



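Note that the population mean of the statistic (the population Spearman correlation coefficient) is zero when the random variables are independent, i.e., p(x,y) = p_x(x)*p_y(y), where p_x and p_y are the marginal densities. In that case P_x(X) and P_y(Y) are independent and uniformly distributed on [0,1], so the expected squared difference is 1/6 and rho = 1 - 6*(1/6) = 0.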
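The independent case is easy to check by simulation. The sketch below draws X and Y independently from two unrelated distributions, Exp(1) and Uniform(-2, 3), which are arbitrary choices of mine, and estimates the population mean of the statistic using their known CDFs.

```python
import math
import random


def rho_independent_mc(n=100_000, seed=1):
    """Monte Carlo estimate of 1 - 6 E[(P_x(X) - P_y(Y))^2] for independent X, Y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(1.0)      # X ~ Exp(1), an assumed marginal
        y = rng.uniform(-2.0, 3.0)    # Y ~ Uniform(-2, 3), drawn independently of X
        u = 1.0 - math.exp(-x)        # P_x(x)
        v = (y + 2.0) / 5.0           # P_y(y)
        total += (u - v) ** 2
    return 1.0 - 6.0 * total / n


if __name__ == "__main__":
    # Should be close to zero, up to Monte Carlo noise of order 1/sqrt(n).
    print(rho_independent_mc())
```

The marginals play no role in the result, since only the uniform variables P_x(X) and P_y(Y) enter the statistic.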
