@Program name: mean_ci.prg @ @Purpose: develop bootstrap confidence intervals around a population mean@ @Used for: 1996 course in non-parametric inference, Essex Summer School @ @Author: Chris Mooney @ @Date: June 20, 1996 @ new; t1=hsec;@for seting the timer@ @rndseed 234;@ @Sets the random seed @ output file = mean_ci.out; @Set output filename@ output reset; @Output file will be overwritten each run@ library pgraph; graphset; @Parameters to be set for the Bootstrap run@ n=100; @sample size@ tcal=1.98; @t-score for CI's. Note: adjust when n is changed @ b=1000; @Number of bootstrap re-samples@ chi_df = 1; @parameter for chi-square X distribution@ print"n =" n; print "b=" b; print "tcal = " tcal; print; i=1 ; j=1; @Set indices for re-sample, and jackknife for BCa@ prop=0; @Set proportion base for z0 of BCa@ x_bar_bt = zeros(B,1); @Define empty vector for x-bar*'s@ sd_xb_bt = zeros(B,1); @Define empty vector for SD's of x-bar*'s@ t_boot = zeros(B,1); @Define empty vector for standardized x-bar*'s@ mn_jack = zeros(n,1); @Define empty vector for jackknifed means@ @Generation of data with various distributions@ uniform = rndu(n,1); normal = rndn(n,1); c = rndn(n,chi_df); c = c^2 ; c = c' ; chi = sumc(c) ; student = normal ./ sqrt(chi/chi_df); x = normal; @Defining the distribution of x @ @Calculating the sample mean of x for the full sample@ x_bar = meanc(x); sd_x = stdc(x); sd_x_bar = sd_x/sqrt(n); @ Resampling @ do while i<=b; index=ceil(rndu(n,1)*n); @Set up re-sampling index@ x_b=x[index,.]; @Defines re-sample b vector of Xi*'s@ x_bar_b = meanc(x_b); @Calculating X-bar* for a re-sample@ sd_xb_b = stdc(x_b)/sqrt(n); @Calculating the SD of x-bar for a re-sample@ x_bar_bt[i,1] = x_bar_b; @Setting up vector of bootstrapped x-bar*'s@ sd_xb_bt[i,1] = sd_xb_b; @Setting up vector of SD's of x-bar*'s@ @Setting up for BCa zit@ if x_bar_b <= x_bar; prop=prop+1/b; endif; i=i+1; endo; @end of resampling loop@ @Constructing parametric confidence interval @ pmlcl = x_bar - (tcal*sd_x_bar); pmucl = x_bar + (tcal*sd_x_bar); @Constructing normal approximation confidence interval @ nalcl = x_bar - (tcal * stdc(x_bar_bt)); naucl = x_bar + (tcal * stdc(x_bar_bt)); @Constructing percentile confidence interval @ x_bar_bt = x_bar_bt~sd_xb_bt; @Attach x-bar*'s and their sd'd for sorting together (needed for percentile-t)@ x_bar_bt = sorthc(x_bar_bt,1); @Sort x_bar*'s vector @ @Defining the CI percentile points for alpha = 0.05@ ll = round((b*0.05)/2); ul= round((b-ll)+1); @Defining lower and upper percentile CI endpoints@ perlcl = x_bar_bt[ll,1]; perucl = x_bar_bt[ul,1]; @Constructing percentile-t confidence interval@ @Standardize the x-bar*'s using the SD from each re-sample and the full sample mean@ t_boot = (x_bar_bt[.,1] - x_bar) ./ x_bar_bt[.,2]; t_boot = sorthc(t_boot,1); @Sort the vector of t_boot@ @Define percentile-t endpoints. NOTE: subtract t_boot conversion for opposite end from both limits@ prtlcl = x_bar - (t_boot[ul] * sd_x_bar); prtucl = x_bar - (t_boot[ll] * sd_x_bar); @Constructing BCa confidence interval@ @Step #1: Calculate the acceleration factor, a @ @Jackknifing for BCa @ do while j<=n; xj=submat(x,j,0); @Defines xj as vector with only case j of x @ xjack=setdif(x,xj,1); @Defines xjack as those cases in x, but not in xj (so all but case j)@ xjck_bar = meanc(xjack); @Calculate the mean of xjack)@ mn_jack[j,1] = xjck_bar; @Placing the x-bar(i) into the jackknifed x-bar vector@ j=j+1; @Increasing jackknife index@ endo; @end jackknifing loop@ @Calculating acceleration factor, a, from the jackknifed vector of x-bar's@ mxjck = meanc(mn_jack); @Mean of jackknifed x-bar`s@ a_num = sumc((mn_jack-mxjck)^3); @Numerator for acceleration factor@ @Calculate the denominator for a, setting format for small numbers@ a_den = (6*(sumc((mn_jack-mxjck)^2))^1.5); format /rd 2,9; if a_den == 0; @To assure division by 0 doesn`t occur@ a=0; else; a = a_num/a_den; @Calculate a @ endif; @Step #2: Calculate z0 @ @Making sure prop =/ 0 or 1 (so zit =/ infinity)@ if prop == 0; prop = .0001; elseif prop == 1; prop = .9999; else; prop = prop; endif; zp=0;zit=-3.00; @Set starting points for z0 calculations@ do until zp > prop; @Loop to increment zit = z0 until it is the z-score for the proportion of x-bar*'s below the full sample x-bar@ zp=cdfn(zit); @zp is the z-score associated with zit@ zit=zit+.001; @Increment zit @ endo; zit = zit-0.001; @Delete the last zit increment @ @Step #3: Construct the BCa confidence interval@ @Define the lower and upper BCa .05 percentile points@ bc_l_per = cdfn(zit+((zit-1.96)/(1-(a*(zit-1.96))))); bc_u_per = cdfn(zit+((zit+1.96)/(1-(a*(zit+1.96))))); @Calculate the bootstrap vector case number for the lower and upper BCa bounds @ l_bca = round(b*bc_l_per); @Lower endpoint case number@ if l_bca == 0; @To avoid case #0 being choosen@ l_bca = 1; else; l_bca = l_bca; endif; u_bca = round(b*bc_u_per); @Upper endpoint case number@ if u_bca > b; @To avoid case #(B+1) being choosen@ u_bca = b; else; u_bca = u_bca; endif; bcalcl = x_bar_bt[l_bca,1]; @Lower BCa CI endpoint@ bcaucl = x_bar_bt[u_bca,1]; @Upper BCa CI endpoint@ @Conducting test of normality on x @ {b1,b2,wb,rw}=jarqbera(x,n); @Conducting test of normality on bootstrapped x-bar* vector@ {sk,ku,w,p_w}=jarqbera(x_bar_bt[.,1],B); print; print "Full sample x-bar value = " x_bar; format /rd 2,5; print "Full sample SD(x-bar) = " sd_x_bar; format /rd 2,5; print; @Leaves space between output lines@ print "Bootstrap estimate of x-bar =" meanc(x_bar_bt[.,1]); print "Bootstrap estimate of SD(beta) = " stdc(x_bar_bt[.,1]); print; print "Parametric CI = " pmlcl pmucl; print "Normal approx CI = " nalcl naucl; print "Percentile CI =" perlcl perucl; print "Percentile-t CI =" prtlcl prtucl; print "BCa CI=" bcalcl bcaucl; print; print "Normality tests on x "; print "Skew of x = " b1; print "Kurtosis of x = " b2; print "Jarque-Bera omnibus test statistic = " wb; print "Probability of normality = " rw; print; print "Normality tests on vector of x-bar*'s:"; print "Skew of x-bar* =" sk; print "Kurtosis of x-bar* =" ku; print "Jarque-Bera omnibus test statistic =" w; print "Probablity of normality =" p_w; @Defining the proc for Jarque-Bera normality testing@ proc(4) = jarqbera(vartest,n1) ; local m5, m2, m3, xbar,sskew, skew, y, b2skew, w, var,m1, m4, kurt, prob_w; skew = 1; kurt = 1; m1 = (vartest)-meanc(vartest); m3 = sumc(m1^3)/n1; m4 = sumc(m1^4)/n1; var=(stdc(m1))^2; sskew = (m3/(var^1.5)) ; skew=sskew^2; kurt = m4/((var)^2) ; @Calculating the Bera-Jarque W, omnibus test for normality@ w = n1*(((skew)/6)+(((kurt-3)^2)/24)); prob_w = cdfchic(w,2); retp (sskew,kurt,w,prob_w) ; endp ;