ROCPOWER.SAS is a SAS macro for estimating the power of statistical tests involving the area under ROC curves (1 reader; 1 or 2 ROC curves). In the 1-curve case, the macro determines the power for comparing a single area (AUC) to a null value. In the 2-curve case, it determines the power of a test comparing either paired (correlated) or unpaired (uncorrelated) curves. The macro can handle either continuous data or rating data and can look at both 1-tailed and 2-tailed tests. Asymptotic z-tests are used for the comparison of ROC areas.

This macro was written by Rick Zepp, now at Kimberly-Clark (rzepp@kcc.com).

MACRO CALL
**********

   %ROCPOWER(T1, T2, T0, NA, NN, N, PERCENT, R, ALPHA, TAILS, ORDINAL, I, J)

In the one-sample case, the null hypothesis is Ho: T1=T0.
In the two-sample case, the null hypothesis is Ho: T1=T2=T0.

PARAMETER DEFINITIONS
*********************

I refers to the number of curves (1 or 2) you are interested in. It is set to 2.

J refers to the number of readers in your study. It is set to 1.

T1, T2, and T0 are hypothesized areas under ROC curves (AUCs). T0 is the AUC under the null hypothesis. T1 and T2 are the AUCs under the alternative hypothesis. For the 1-curve case, only T1 and T0 are specified.

NOTE: Of the following 4 parameters, only 2 (either NN and NA *or* N and PERCENT) need to be specified.

NA is the number of abnormal patients (i.e. "events") being tested by each modality in the study. It is assumed that the number of abnormals tested by each modality is the same.

NN is the number of normal patients (i.e. "non-events") being tested by each modality in the study. It is assumed that the number of normals tested by each modality is the same.

N is the total sample size (i.e. both normals and abnormals) being tested by each modality in the study. It is assumed that the number of cases tested by each modality is the same.

PERCENT is the percentage of abnormal patients in the total sample size, N. It is entered as a proportion (e.g. .50 for 50%, as in Example 3).

R is the correlation between T1 and T2 when the same patients are examined by both modalities. THIS IS NOT THE SAME AS THE CORRELATION BETWEEN RATINGS. Hanley and McNeil (1983) produced a lookup table in which values of R can be obtained based on the average area under the 2 ROC curves and the average correlation between ratings from the 2 modalities. Selected values are presented below:

                                       Average Area
                                       ************
   Avg Corr Between Ratings   .70   .75   .80   .85   .90   .95   .975
   ************************
             0.10             .09   .09   .08   .08   .07   .06   .04
             0.20             .18   .18   .17   .16   .15   .12   .10
             0.30             .27   .27   .26   .25   .23   .19   .16
             0.40             .37   .36   .35   .34   .32   .28   .24
             0.50             .47   .46   .45   .43   .41   .37   .32
             0.60             .57   .56   .55   .53   .51   .47   .42
             0.70             .67   .66   .65   .64   .62   .58   .54
             0.80             .77   .77   .76   .75   .73   .70   .67
             0.90             .88   .88   .87   .87   .86   .84   .82

ALPHA is the type I error rate. The default has been set to 0.05.

TAILS refers to whether a 1-tailed or 2-tailed test is desired. The default has been set to 2.

ORDINAL refers to whether continuous (0) or ordinal (1) data have been used. The default has been set to continuous (0).

EXAMPLES
********

1) Blood glucose as a predictor of diabetes (1-curve situation)

Suppose a new test based on measured amounts of blood glucose (a continuous measure) has been developed. The makers of the test claim that it is 92% accurate in diagnosing diabetes. You want to test whether the diagnostic accuracy is better than 75%. You plan to administer the test to 35 known diabetics and 50 known non-diabetics. You plan on using a .01 significance level. This situation describes a 1-tailed hypothesis test with

   Ho: T1 <= T0 = .75
   Ha: T1 > .75

What power does the test have with the proposed sample size?

The macro call would be:

   %ROCPOWER(T1=.92, T0=.75, NN=50, NA=35, ALPHA=.01, TAILS=1);

with the corresponding output:

   Hypothesis tested:
      HO: T1 = .75
      HA: T1 > .75
   With:
      T1 hypothesized at .92
      Alpha = .01
      Number of Abnormal Cases = 35
      Number of Normal Cases = 50
   Standard Error(s) estimated using the Hanley-McNeil method
   The estimated Power of the test is 0.893

2) Comparing MRI and CT for detecting stenosis in the aorta (2 unpaired curves)

Both MRI and CT can be used to detect stenosis in the aorta and you want to determine which imaging modality has a higher diagnostic accuracy. You expect the accuracy of both tests to be at least 70%; a difference in accuracy of 20% is clinically important. You plan to randomize a total of 200 patients to either MRI or CT (100 to each modality). The prevalence of stenosis in the population you are examining is 50%. Therefore we will assume that 50 stenotic and 50 non-stenotic patients will receive each test. Thus, NN and NA are both set to 50 in the macro call. A 1-5 rating scale is used.

The macro call would be:

   %ROCPOWER(T1=.90, T2=.70, T0=.70, NN=50, NA=50, ORDINAL=1);

with the corresponding output:

   Hypothesis tested:
      HO: T1 = T2 = .70
      HA: T1 < T2 or T1 > T2
   With:
      T1 hypothesized at .90
      T2 hypothesized at .70
      Alpha = .05
      Number of Abnormal Cases = 50
      Number of Normal Cases = 50
   Standard Error(s) estimated using the Obuchowski method
   The estimated Power of the test is 0.781

3) Comparing MRI and CT for detecting stenosis in the aorta (2 paired curves)

Suppose that the study described in Example 2 was performed using the same set of 100 cases for both MRI and CT and that the correlation between the 2 areas was estimated at .41.

The macro call would be:

   %ROCPOWER(T1=.90, T2=.70, T0=.70, R=.41, N=100, PERCENT=.50, ORDINAL=1);

with the corresponding output:

   Hypothesis tested:
      HO: T1 = T2 = .70
      HA: T1 < T2 or T1 > T2
   With:
      T1 estimated at .90
      T2 estimated at .70
      Alpha = .05
      Percent of Abnormal Patients = 50%
      Total Sample Size = 100
      Correlation between T1 and T2 = .41
   Standard Error(s) estimated using the Obuchowski method
   The estimated Power of the test is 0.935
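
A NOTE ON THE 1-CURVE CALCULATION
*********************************

To show where a power figure such as the one in Example 1 comes from, the DATA step below is a minimal sketch of an asymptotic z-test power calculation for a single AUC. It uses the published Hanley-McNeil (1982) variance approximation. It is not the macro's source code. The variable names (V0, V1, ZCRIT, POWER) are illustrative, and the choice to use the null-hypothesis variance for the critical value and the alternative-hypothesis variance for the power is an assumption, so the result may differ slightly from what %ROCPOWER prints.

```sas
data _null_;
   /* Inputs taken from Example 1 */
   t1 = 0.92;  t0 = 0.75;      /* AUC under HA and under H0           */
   na = 35;    nn = 50;        /* abnormal and normal cases           */
   alpha = 0.01;  tails = 1;   /* 1-tailed test at the .01 level      */

   /* Hanley-McNeil (1982) variance of an AUC, evaluated at T0.       */
   /* Q1 = t/(2-t) and Q2 = 2*t**2/(1+t) are their approximations.    */
   v0 = ( t0*(1 - t0)
        + (na - 1)*(t0/(2 - t0)       - t0**2)
        + (nn - 1)*(2*t0**2/(1 + t0)  - t0**2) ) / (na*nn);

   /* Same variance formula, evaluated at T1 */
   v1 = ( t1*(1 - t1)
        + (na - 1)*(t1/(2 - t1)       - t1**2)
        + (nn - 1)*(2*t1**2/(1 + t1)  - t1**2) ) / (na*nn);

   /* Critical value of the standard normal for the chosen test */
   zcrit = probit(1 - alpha/tails);

   /* Assumed power formula: critical value scaled by the H0 standard */
   /* error, distance standardized by the HA standard error. For a    */
   /* 2-tailed test this ignores the negligible opposite tail.        */
   power = probnorm( (abs(t1 - t0) - zcrit*sqrt(v0)) / sqrt(v1) );

   put power= 5.3;
run;
```

With the Example 1 inputs this sketch gives a power of roughly 0.89, in the neighborhood of the macro's reported 0.893. It covers only the 1-curve, continuous-data case. The 2-curve cases also involve the correlation R and, for ordinal data, the Obuchowski standard errors named in the macro output, neither of which is reproduced here.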