mathematical modeling - Resultant probability distribution when taking the cosine of gaussian distributed variable

Sunday, 14 March 2010

mathematical modeling - Resultant probability distribution when taking the cosine of gaussian distributed variable

Given a normal distribution with mean $mu$ and variance $sigma^2$, $X = mathcal{N}(mu,sigma^2)$, if you pass it through trigonometric functions, you can approximate the result with the new normal distributions below

1) normal distribution passed through Cosine function:

$X_{cos} = mathcal{N}(cos(mu),sigma^2sin^2(mu))$

so the new average is $cos(mu)$ and the new standard deviation is $|sigmasin(mu)|$.

2) normal distribution passed through a Sine function:

$X_{sin} = mathcal{N}(sin(mu),sigma^2cos^2(mu))$

so the new average is $sin(mu)$ and the new standard deviation is $|sigmacos(mu)|$.

The Matlab script that I used to find these relations is below.

%% Cody Martin
% 9/2/2010
% m-file used to discover the mean and variance of a normal distribution
% passed through cosine and sine functions...results:
%   - N(mu,sig^2) -> cos(N(mu,sig^2)) = N(cos(mu),sig^2*sin^2(mu))
%   - N(mu,sig^2) -> sin(N(mu,sig^2)) = N(sin(mu),sig^2*cos^2(mu))

%% distribution of cosine and sine of a normal distribution?
cresults = zeros(0,5);
sresults = zeros(0,5); 
% loop from an average angle -90 degrees to +90 degrees
for theta = -pi/2:pi/180:pi/2
    theta1sig = pi/36;                          % standard deviation of orinigal normal distribution
    vtheta = theta + theta1sig*randn(99999,1);  % create 99999 points using this avg and std
    vctheta = cos(vtheta);                      % take the cosine of those points
    vstheta = sin(vtheta);                      % take the sine of those points
    theta_ = min(vtheta):0.01:max(vtheta);      % for plotting ideal distributions
    ctheta_ = min(vctheta):0.01:max(vctheta);   % for plotting
    stheta_ = min(vstheta):0.01:max(vstheta);   % for plotting

    figure(1); clf;
    subplot(211); hold on;
    plot(theta_,cdf('normal',theta_,theta,theta1sig),':');  % plot cdf of normal distribution with avg and std
    plot(sort(vtheta),[1:length(vtheta)]/length(vtheta));   % plot cdf of 99999 points
    plot(sort(vctheta),[1:length(vctheta)]/length(vctheta),'k','LineWidth',2); % plot cdf of cos(99999 points)
    plot(ctheta_,cdf('normal',ctheta_,cos(theta),...        % plot cdf of norm dist with new avg and std after being passed through cos()
         sqrt(theta1sig^2*sin(theta)^2)),'r:');
    plot(cos(theta)*[1 1],[0 1],'k:');                      % vertical line @ cos(theta) - shows new average matches cos(old avg)
    title('Cosine of a Normal Distribution (for Different Initial Averages)');
    legend('Norm CDF Theory','Norm CDF 99999','Cos(Norm CDF 99999)','Cos(Norm CDF) Theory');
    axis([-pi/2 pi/2 0 1])

    subplot(212); hold on;
    plot(theta_,cdf('normal',theta_,theta,theta1sig),':');
    plot(sort(vtheta),[1:length(vtheta)]/length(vtheta));
    plot(sort(vstheta),[1:length(vstheta)]/length(vstheta),'k','LineWidth',2);
    plot(stheta_,cdf('normal',stheta_,sin(theta),...
         sqrt(theta1sig^2*cos(theta)^2)),'r:');
    plot(sin(theta)*[1 1],[0 1],'k:');
    title('Sine of a Normal Distribution (for Different Initial Averages)');
    legend('Norm CDF Theory','Norm CDF 99999','Sin(Norm CDF 99999)','Sin(Norm CDF) Theory');
    axis([-pi/2 pi/2 0 1])

%   fprintf('theta: %3.0ftstd: %5.3ftsin(theta): %5.3ftavg: %5.3ftstd: %5.3fn',theta*180/pi,theta1sig,sin(theta),mean(vstheta),std(vstheta));
    cresults = [cresults; theta theta1sig cos(theta) mean(vctheta) std(vctheta)];
    sresults = [sresults; theta theta1sig sin(theta) mean(vstheta) std(vstheta)];
end

figure(2); clf;
subplot(211); hold on;
plot(cresults(:,1),cresults(:,end));
plot(cresults(:,1),abs(theta1sig*sresults(:,3)),'r:');
title('Standard Deviation of Cosine of a Normal Distribution as a Function of the Original Average');
legend('From 99999 Points','Fit: std = |sigmasin(mu)|');
ylabel('std(cos(theta_{vector})) [rad]');
xlabel('theta [rad]');

subplot(212); hold on;
plot(sresults(:,1),sresults(:,end));
plot(sresults(:,1),abs(theta1sig*cresults(:,3)),'r:');
title('Standard Deviation of Sine of a Normal Distribution as a Function of the Original Average');
legend('From 99999 Points','Fit: std = |sigmacos(mu)|');
ylabel('std(sin(theta_{vector})) [rad]');
xlabel('theta [rad]');

figure(3); clf;
subplot(211); hold on;
plot(cresults(:,1),abs(theta1sig*sresults(:,3))-cresults(:,end));
title('Error Between sigma^2sin^2(mu) and std of 99999 Draws of cos(theta)')
ylabel('Residual [rad]');
xlabel('theta [rad]');


subplot(212); hold on;
plot(sresults(:,1),abs(theta1sig*cresults(:,3))-sresults(:,end));
title('Error Between sigma^2cos^2(mu) and std of 99999 Draws of cos(theta)')
ylabel('Residual [rad]');
xlabel('theta [rad]');

As others have pointed out, this fails where $cos(mu)$ and $sin(mu)$ are near 0. Residuals between my proposed solution and the empirical results from 99999 draws are shown below.

enter image description here

Answer Desk

Sunday, 14 March 2010

mathematical modeling - Resultant probability distribution when taking the cosine of gaussian distributed variable

No comments:

Post a Comment