Sunday, 14 March 2010

mathematical modeling - Resultant probability distribution when taking the cosine of gaussian distributed variable

Given a normal distribution with mean mu and variance sigma2, X=mathcalN(mu,sigma2), if you pass it through trigonometric functions, you can approximate the result with the new normal distributions below



1) normal distribution passed through Cosine function:



Xcos=mathcalN(cos(mu),sigma2sin2(mu))



so the new average is cos(mu) and the new standard deviation is |sigmasin(mu)|.



2) normal distribution passed through a Sine function:



Xsin=mathcalN(sin(mu),sigma2cos2(mu))



so the new average is sin(mu) and the new standard deviation is |sigmacos(mu)|.



The Matlab script that I used to find these relations is below.



%% Cody Martin
% 9/2/2010
% m-file used to discover the mean and variance of a normal distribution
% passed through cosine and sine functions...results:
% - N(mu,sig^2) -> cos(N(mu,sig^2)) = N(cos(mu),sig^2*sin^2(mu))
% - N(mu,sig^2) -> sin(N(mu,sig^2)) = N(sin(mu),sig^2*cos^2(mu))

%% distribution of cosine and sine of a normal distribution?
cresults = zeros(0,5);
sresults = zeros(0,5);
% loop from an average angle -90 degrees to +90 degrees
for theta = -pi/2:pi/180:pi/2
theta1sig = pi/36; % standard deviation of orinigal normal distribution
vtheta = theta + theta1sig*randn(99999,1); % create 99999 points using this avg and std
vctheta = cos(vtheta); % take the cosine of those points
vstheta = sin(vtheta); % take the sine of those points
theta_ = min(vtheta):0.01:max(vtheta); % for plotting ideal distributions
ctheta_ = min(vctheta):0.01:max(vctheta); % for plotting
stheta_ = min(vstheta):0.01:max(vstheta); % for plotting

figure(1); clf;
subplot(211); hold on;
plot(theta_,cdf('normal',theta_,theta,theta1sig),':'); % plot cdf of normal distribution with avg and std
plot(sort(vtheta),[1:length(vtheta)]/length(vtheta)); % plot cdf of 99999 points
plot(sort(vctheta),[1:length(vctheta)]/length(vctheta),'k','LineWidth',2); % plot cdf of cos(99999 points)
plot(ctheta_,cdf('normal',ctheta_,cos(theta),... % plot cdf of norm dist with new avg and std after being passed through cos()
sqrt(theta1sig^2*sin(theta)^2)),'r:');
plot(cos(theta)*[1 1],[0 1],'k:'); % vertical line @ cos(theta) - shows new average matches cos(old avg)
title('Cosine of a Normal Distribution (for Different Initial Averages)');
legend('Norm CDF Theory','Norm CDF 99999','Cos(Norm CDF 99999)','Cos(Norm CDF) Theory');
axis([-pi/2 pi/2 0 1])

subplot(212); hold on;
plot(theta_,cdf('normal',theta_,theta,theta1sig),':');
plot(sort(vtheta),[1:length(vtheta)]/length(vtheta));
plot(sort(vstheta),[1:length(vstheta)]/length(vstheta),'k','LineWidth',2);
plot(stheta_,cdf('normal',stheta_,sin(theta),...
sqrt(theta1sig^2*cos(theta)^2)),'r:');
plot(sin(theta)*[1 1],[0 1],'k:');
title('Sine of a Normal Distribution (for Different Initial Averages)');
legend('Norm CDF Theory','Norm CDF 99999','Sin(Norm CDF 99999)','Sin(Norm CDF) Theory');
axis([-pi/2 pi/2 0 1])

% fprintf('theta: %3.0ftstd: %5.3ftsin(theta): %5.3ftavg: %5.3ftstd: %5.3fn',theta*180/pi,theta1sig,sin(theta),mean(vstheta),std(vstheta));
cresults = [cresults; theta theta1sig cos(theta) mean(vctheta) std(vctheta)];
sresults = [sresults; theta theta1sig sin(theta) mean(vstheta) std(vstheta)];
end

figure(2); clf;
subplot(211); hold on;
plot(cresults(:,1),cresults(:,end));
plot(cresults(:,1),abs(theta1sig*sresults(:,3)),'r:');
title('Standard Deviation of Cosine of a Normal Distribution as a Function of the Original Average');
legend('From 99999 Points','Fit: std = |sigmasin(mu)|');
ylabel('std(cos(theta_{vector})) [rad]');
xlabel('theta [rad]');

subplot(212); hold on;
plot(sresults(:,1),sresults(:,end));
plot(sresults(:,1),abs(theta1sig*cresults(:,3)),'r:');
title('Standard Deviation of Sine of a Normal Distribution as a Function of the Original Average');
legend('From 99999 Points','Fit: std = |sigmacos(mu)|');
ylabel('std(sin(theta_{vector})) [rad]');
xlabel('theta [rad]');

figure(3); clf;
subplot(211); hold on;
plot(cresults(:,1),abs(theta1sig*sresults(:,3))-cresults(:,end));
title('Error Between sigma^2sin^2(mu) and std of 99999 Draws of cos(theta)')
ylabel('Residual [rad]');
xlabel('theta [rad]');


subplot(212); hold on;
plot(sresults(:,1),abs(theta1sig*cresults(:,3))-sresults(:,end));
title('Error Between sigma^2cos^2(mu) and std of 99999 Draws of cos(theta)')
ylabel('Residual [rad]');
xlabel('theta [rad]');




As others have pointed out, this fails where cos(mu) and sin(mu) are near 0. Residuals between my proposed solution and the empirical results from 99999 draws are shown below.



enter image description here

No comments:

Post a Comment