Journal of Counseling Psychology 1983, Vol. 30, No. 3, 459-463
Copyright 1983 by the American Psychological Association, Inc.
Statistical Significance, Power, and Effect Size: A Response to the Reexamination of Reviewer Bias
Bruce E. Wampold
Department of Educational Psychology, University of Utah
Michael J. Furlong and Donald R. Atkinson
Graduate School of Education, University of California, Santa Barbara
In responding to our study of the influence that statistical significance has on reviewers' recommendations for the acceptance or rejection of a manuscript for publication (Atkinson, Furlong, & Wampold, 1982), Fagley and McKinney (1983) argue that reviewers were justified in rejecting the bogus study when nonsignificant…

To detect a small experimental effect in the bogus study, for example, we would have had to increase the sample size from 81 to 1,206, or 134 subjects [...]

[...] argument is that because the average effect size for published research was equivalent to that of a medium effect, the reviewer's decision to reject the bogus manuscript under the nonsignificant condition was "reasonable." Further examination of the Haase et al. (1982) article and our own analysis of published research, however, demonstrate that the power of the bogus study was great enough to detect effect sizes that are typical of research published in JCP, which was our intention when we designed the bogus study.

First, although the median effect size (η²) for all univariate statistical tests, significant and nonsignificant, reported by Haase et al. (1982) was .083, this index was steadily increasing at a rate of approximately .5% per year, so that the projected median η² in 1981 (the year our study was completed) would be .13. Importantly, an η² of .13 corresponds to an effect size (f) of .39, which Cohen (1977) designates as a large effect.

A further examination of the Haase et al. (1982) data also lends support to our argument. Their analysis examined the strength of association for 11,044 univariate statistical tests derived from only 701 manuscripts; thus, each manuscript reported an average of more than 15 statistical tests. Since statistically significant and
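
The conversion between the strength-of-association index (η², eta squared) and Cohen's effect size f that this argument relies on can be checked with a short calculation. The sketch below assumes the standard relation f = sqrt(η² / (1 − η²)) and Cohen's conventional benchmarks for f (.10 small, .25 medium, .40 large); the function name is illustrative and does not come from the original article.

```python
import math


def eta_squared_to_f(eta_sq: float) -> float:
    """Convert eta squared (proportion of variance explained) to Cohen's f.

    Assumes the standard relation f = sqrt(eta_sq / (1 - eta_sq)).
    """
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta_sq must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Values quoted in the text.
projected_f = eta_squared_to_f(0.13)   # projected 1981 median eta squared
reported_f = eta_squared_to_f(0.083)   # median reported by Haase et al. (1982)

# Average number of univariate tests per manuscript (11,044 tests, 701 manuscripts).
tests_per_manuscript = 11044 / 701

print(f"f for eta^2 = .13:  {projected_f:.2f}")
print(f"f for eta^2 = .083: {reported_f:.2f}")
print(f"tests per manuscript: {tests_per_manuscript:.2f}")
```

Under these assumptions, an η² of .13 gives f of about .39, which sits just under Cohen's .40 benchmark for a large effect, and 11,044 tests across 701 manuscripts works out to between 15 and 16 tests per manuscript.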

