Power Calculations for Tests on a Vector of Binary Outcomes (MULTBINPOW)

MULTBINPOW is a SAS simulation program that enables study planners to choose the most powerful among existing tests (see the list below) when the outcome of interest is a composite binary-event endpoint.  Mascha and Imrey (2010) compare the relative powers of standard and multivariate (GEE common effect and distinct effect) statistical tests for such situations. The relative power of the tests depends on several factors, including the size and variability of incidences and treatment effects across components, the within-subject correlation, and whether the larger- or smaller-incidence components are more affected by treatment. The average relative effect test (#5 below) is particularly useful for preventing the overall treatment effect estimate from being driven by the components with the highest incidence, which matters most when both incidences and treatment effects vary across components (e.g., Figs 1 and 2). The program allows the researcher to customize power calculations by choosing a combination of these factors for a particular study and then comparing power across the relevant statistical tests.  Address questions or suggestions to maschae@ccf.org.

[Figures: fig4a_final.tif, fig4b_final.tif]
Fig 1.  Comparative power when a smaller-frequency component is more affected by treatment.           Fig 2.  Comparative power when a larger-frequency component is more affected by treatment.

Empirical power is calculated via simulation for 8 distinct statistical tests, each comparing 2 vectors of binary events (treatment versus control) under user-specified ranges of within-subject correlations, covariance structures, sample sizes, and incidences.

Power is computed for the following tests:

1.  Collapsed composite of any-versus-none of the events
2.  Count of events within individual (Mann-Whitney and t-test)
3.  Minimum P-value test
4.  GEE common effect test (also called the “global odds ratio” test)
5.  GEE average relative effect (distinct effects test; removes the influence of high-incidence components)
6.  GEE K-df distinct effects test (analogous to Hotelling’s T2; not sensitive to direction)
7.  GEE covariance-weighted distinct effects test (usually very similar in power to the GEE common effect test)
8.  GEE treatment-component interaction distinct effects test (whether the treatment effect varies across components)
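
For readers less familiar with the GEE models behind tests 4-8, the generic sketch below (not part of MULTBINPOW) shows the data layout and model such tests operate on: one record per subject per component, component-specific intercepts, and either a common treatment effect (test 4) or component-specific effects obtained by adding a treatment-by-component interaction (tests 5-8).  The dataset and variable names (sim_long, id, component, treat, y) are hypothetical.

     /* Generic illustration of a GEE common-effect model for multiple binary outcomes.       */
     /* Data are in long format: one record per subject (id) per outcome component, with y    */
     /* the binary event indicator and treat the treatment indicator.                         */
     proc genmod data=sim_long descending;
       class id component treat;
       model y = treat component / dist=bin link=logit;   /* common treatment effect, component-specific intercepts */
       repeated subject=id / type=un;                      /* GEE with unstructured working correlation              */
     run;

     /* Adding the treat*component interaction yields component-specific (distinct)           */
     /* treatment effects, the basis of tests 5-8.                                            */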


Sample call
For example, power calculations might be desired with the following specifications:

  •  Compare 2 vectors of 3 proportions each, with population proportions of
     Treatment: (0.18 0.16 0.09) and Control: (0.20 0.20 0.15), at alpha=0.05.
  •  Weight the outcomes for each group as 0.2, 0.4, and 0.4, respectively.
  •  Assess power at within-subject correlations of 0.10, 0.30, 0.50 and N/group of 200, 400, 600.
  •  Produce 1000 simulations for each combination of within-subject correlation and N/group.
  •  Assume an underlying exchangeable correlation for the simulations, but use an unstructured working correlation to analyze the data.
  •  Summarize power in tables and a graph; save individual simulation results in a SAS dataset to be analyzed or appended to later.

     MULTBINPOW would then be called as follows:
     %let mypath=your directory to store simulation dataset results;
     %inc "/your directory/multbinpow.sas"; 

     %multbinpow(path=&mypath, reset=0, sims=1, summary=1, outdata=results, imlwt=0, startsim=1,
     numsims=100, pr_t=.18 .16 .09, pr_c=.20 .20 .15, rmin=10, rmax=50, rby=20, nmin=200, nmax=600, nby=200,
     rmat=2, use_rmat=3, wt=.2 .4 .4, spar_ck=0, where=, simseed=1234, alpha=.05, listres=0, clearlog=1);
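
     Note how the options map to the specifications above: pr_t and pr_c hold the treatment and control proportion vectors and wt the component weights; nmin, nmax, and nby span the per-group sample sizes 200 to 600 by 200; and rmin=10, rmax=50, rby=20 appear to specify the within-subject correlations 0.10 to 0.50 as percentages. The rmat and use_rmat codes presumably select the covariance structures used for simulation and analysis (here exchangeable and unstructured, per the specifications above); see the program header for the exact coding of each option.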

Additional options: increase the number of simulations without starting over; turn on the LISTRES option to see all analysis results for one or more simulations; or assess the potential range of within-subject correlations for a given set of underlying proportions before running simulations.
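
As an illustration only, a follow-up call along the lines below could add simulations to an earlier run and list the individual analysis results. The option names (RESET, STARTSIM, NUMSIMS, LISTRES) are taken from the sample call above, but their exact coding is defined in the program header, so the specific values shown here are placeholder assumptions rather than a prescription.

     * Hypothetical follow-up call -- LISTRES=1 requests the analysis results for each simulation, ;
     * while RESET and STARTSIM are the options that control continuing an earlier run without     ;
     * starting over. Check the program header for their exact meanings before copying this call.  ;
     %multbinpow(path=&mypath, reset=0, sims=1, summary=1, outdata=results, imlwt=0, startsim=101,
     numsims=200, pr_t=.18 .16 .09, pr_c=.20 .20 .15, rmin=10, rmax=50, rby=20, nmin=200, nmax=600, nby=200,
     rmat=2, use_rmat=3, wt=.2 .4 .4, spar_ck=0, where=, simseed=1234, alpha=.05, listres=1, clearlog=1);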

For an explanation of all options, see the header of the MULTBINPOW program file or the README file.

Output:

The above call generates the following datasets, figure and output.

  1. Main results for each simulation run are saved to a SAS dataset with the user-specified name, &outdata. This dataset contains one observation per simulation run for each specified combination of within-subject correlation and per-group sample size.  Data for each simulation include the observed vectors of response proportions for treatment, control, and overall, all parameter estimates and test results, and a simulation-run counter (SIMI).  Results are also saved to separate files particular to each specified combination of within-subject correlation and per-group sample size. (A short SAS sketch for inspecting these datasets follows this output list.)

  2. A summary dataset named &outdata._sum is also created when summary=1 and makesum=1 are used. It has one record for each specified combination of NPER and RHO, with summary statistics for power, all relevant parameter estimates and standard errors, correlations, outcome vectors, etc.

  3. Tabulated results.  The generated output (.lst file) gives the power for each test and summary statistics for the estimated vectors of proportions, the working correlation matrix (partial results), and the estimated treatment effects and standard errors for the main tests across the simulations.  Tabulations are made using the dataset &outdata._sum. Default output for the above macro call is as follows:

 Power calculations for multivariate binary outcomes based on 500 simulations               
   PT= .18 .16 .09, PC= .20  .20  .15   Component weights= .2 .4 .4  SASprogram =temp11
             Covariance: simulation=exchangeable, analysis=unstructured  

                                                       N/Group
                                       200               400               600
                                       Rho               Rho               Rho
                               0.100 0.300 0.500 0.100 0.300 0.500 0.100 0.300 0.500

                                Mean  Mean  Mean  Mean  Mean  Mean  Mean  Mean  Mean

Number of components            3.00  3.00  3.00  3.00  3.00  3.00  3.00  3.00  3.00
Pc1                             0.20  0.20  0.20  0.20  0.20  0.20  0.20  0.20  0.20
Pc2                             0.20  0.20  0.20  0.20  0.20  0.20  0.20  0.20  0.20
Pc3                             0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15
Pt1                             0.18  0.18  0.18  0.18  0.18  0.18  0.18  0.18  0.18
Pt2                             0.16  0.16  0.16  0.16  0.16  0.16  0.16  0.16  0.16
Pt3                             0.09  0.09  0.09  0.09  0.09  0.09  0.09  0.09  0.09
GEE working corr(1,2)           0.08  0.25  0.41  0.08  0.25  0.41  0.08  0.25  0.42
GEE working corr(1,3)           0.06  0.19  0.32  0.06  0.19  0.32  0.06  0.19  0.32
Beta collapsed composite       -0.32 -0.26 -0.23 -0.32 -0.28 -0.24 -0.32 -0.27 -0.23
SE(B) collapsed composite       0.21  0.21  0.22  0.15  0.15  0.16  0.12  0.12  0.13
Beta common effect             -0.34 -0.32 -0.33 -0.33 -0.35 -0.34 -0.34 -0.34 -0.34
SE(B) common effect             0.18  0.20  0.22  0.13  0.14  0.16  0.10  0.12  0.13
Beta average relative effect   -0.37 -0.36 -0.37 -0.36 -0.38 -0.37 -0.37 -0.37 -0.37
SE(B) avg rel effect            0.19  0.21  0.22  0.13  0.14  0.16  0.11  0.12  0.13
Pow collapsed comp (Wald)       0.35  0.22  0.18  0.60  0.48  0.37  0.76  0.59  0.49
Pow Count (MW)                  0.38  0.28  0.20  0.64  0.54  0.43  0.85  0.68  0.57
Pow Count (pooled t)            0.42  0.29  0.25  0.67  0.62  0.51  0.87  0.77  0.67
Pow common effect (Wald)        0.49  0.36  0.31  0.74  0.70  0.60  0.92  0.83  0.76
Pow avg rel effect (Wald)       0.53  0.40  0.36  0.78  0.76  0.66  0.94  0.87  0.81
Pow avg rel effect(Score)       0.54  0.40  0.36  0.78  0.76  0.66  0.94  0.87  0.81
Pow K-df distinct(Wald)         0.40  0.30  0.33  0.67  0.65  0.62  0.86  0.83  0.82
Pow min bootstrap P-val         0.39  0.32  0.34  0.65  0.67  0.64  0.84  0.84  0.84
Pow TX-component interaction
(Score)                         0.17  0.20  0.25  0.30  0.35  0.39  0.39  0.51  0.59

  4. Graphs of power versus within-subject correlation, by test, are saved to a user-specified file, as below.
    Graphs are made using the dataset &figdata, with variables nper, r, meanvar (power), and test.
W:\composite\simulations\production\results\temp11\powerfigure.jpg
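
The saved datasets can be examined with ordinary SAS steps. The sketch below is not part of MULTBINPOW; it assumes the names implied by outdata=results in the sample call (results and results_sum) and that the datasets are stored as permanent SAS files under &mypath, both of which should be verified (for example with PROC CONTENTS) before use.

     * Minimal sketch for inspecting the saved datasets -- assumes outdata=results as in the     ;
     * sample call and permanent SAS datasets stored under &mypath (adjust the libname if not).  ;
     libname sim "&mypath";

     proc contents data=sim.results;     run;   * one observation per simulation run             ;
     proc contents data=sim.results_sum; run;   * one observation per NPER x RHO combination     ;

     proc print data=sim.results_sum noobs;
       title "MULTBINPOW summary results by N/group and within-subject correlation";
     run;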

Download program.  Download the MULTBINPOW program here.  Runtimes increase with the specified sample sizes and the number of scenarios chosen.  PROC IML is invoked only for test #7 listed above, which substantially increases runtime; to exclude that test, use the option IMLWT=0.

References

Mascha EJ, Imrey PB: Factors affecting power of tests for multiple binary outcomes. Statistics in Medicine 2010; 29:2890-2904.

Oman SD: Easily simulated multivariate binary distributions with given positive and negative correlations. Computational Statistics & Data Analysis 2009; 53(4):999-1005.

Zeger SL, Liang KY: Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986; 42(1):121-130.

Lefkopoulou M, Ryan L: Global tests for multiple binary outcomes. Biometrics 1993; 49(4):975-988.

Legler JM, Lefkopoulou M, Ryan LM: Efficiency and power of tests for multiple binary outcomes. Journal of the American Statistical Association 1995; 90(430):680-693.