Low statistical power continues to undermine the replicability of research findings. Although the contemporary methodological literature clearly outlines the negative consequences of neglecting power and offers numerous guidelines for performing power analyses, underpowered studies remain prevalent. This thesis investigates current power analysis practices across methodological subfields of cognitive science. Specifically, it examines how frequently power analyses are conducted, how transparently and consistently they are reported, and whether a rationale for the chosen parameters is provided – especially in relation to conventional parameter thresholds.
The method of hypothesis testing was first introduced by Fisher in the early 20th century, but it was Neyman and Pearson who developed a systematic framework incorporating both Type I and Type II errors, making it possible to calculate statistical power. Central to their approach is not only controlling the probability of false positives (Type I error) but also minimising the probability of false negatives (Type II error), thereby maximising statistical power. Despite extensive warnings in the methodological literature about the consequences of low power and its neglect – most notably beginning with Cohen’s work in the 1960s – contemporary empirical research continues to suffer from low power. This persistence is puzzling given the widespread availability of power analysis tools and accessible guidelines.
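To illustrate what such a calculation involves, the sketch below shows a minimal a priori power analysis in Python using the statsmodels library; the independent-samples t-test design, the effect size of d = 0.5, and the other parameter values are illustrative assumptions, not values drawn from this thesis.

    # Minimal sketch of an a priori power analysis for an independent-samples
    # t-test, assuming conventional illustrative values.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Solve for the sample size per group needed to reach the desired power,
    # given an expected effect size (Cohen's d) and a Type I error rate (alpha).
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                       alternative='two-sided')
    print(f"Required n per group: {n_per_group:.1f}")  # ~64 participants per group

    # Conversely, the achieved power (1 minus the Type II error rate) can be
    # computed for a fixed sample size that has already been collected.
    achieved_power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=30,
                                          alternative='two-sided')
    print(f"Power with n = 30 per group: {achieved_power:.2f}")  # ~0.48

The same routine can be solved for any single unknown among effect size, sample size, alpha, and power, which is what makes the justification of the remaining parameters central to the analysis.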
In this thesis, we suggest that one reason for the low power of contemporary research is the non-utilisation of power analysis. To explore this, we first review contemporary methodological guidelines and teaching materials for performing power analysis. In the second part, we investigate whether these recommendations are followed in practice by conducting a systematic review of contemporary empirical research. We gathered studies from subfields of cognitive science that differ in their methodological approaches and categorised them into three groups – behavioural, neuroimaging, and neuropsychological. We assessed each study using a coding procedure designed to capture the presence and type of power analysis, the frequency of parameter reporting, and the rationale provided for parameter selection.
We found that of a total sample of N = 355 studies, only 18.31% (N = 65) reported the use of power analysis. Differences between methodological groups were substantial: 30.72% of behavioural studies reported conducting a power analysis, compared with only 8.42% of neuropsychological and 6.31% of neuroimaging studies. Encouragingly, researchers who did conduct power analyses also frequently reported the specific parameters used – especially in the behavioural category. However, explicit justifications for these choices were rare, apart from effect size, which was justified in 90.00% of cases. On further exploration, we found that power and effect size choices were often guided by convention: 64.15% of power values adopted the conventional .80 value, and 42.31% of effect size justifications referred to Cohen’s qualitative effect size benchmarks.
These findings suggest that power analysis remains substantially underutilised, supporting our proposal that low statistical power in contemporary research may stem from the non-utilisation of power analysis – particularly in the neuropsychological and neuroimaging fields. Moreover, although power analysis parameters were frequently specified, the common use of conventional values and the lack of further justification suggest a continued reliance on arbitrary benchmarks. To address this, we propose .95 power as a substitute for the traditional .80 standard when no other rationale for setting power is available, with the aim of improving the reliability and informativeness of scientific findings.
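The practical cost of this recommendation can be illustrated with a short calculation. The sketch below, again in Python with statsmodels, compares the required group sizes at .80 and .95 power for a hypothetical independent-samples t-test; the medium effect size of d = 0.5 and alpha of .05 are illustrative assumptions, not values taken from the reviewed studies.

    # Illustrative comparison of required sample sizes at .80 versus .95 power,
    # assuming an independent-samples t-test with d = 0.5 and alpha = .05.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for target_power in (0.80, 0.95):
        n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=target_power)
        print(f"power = {target_power:.2f} -> n per group = {n:.0f}")
    # power = 0.80 -> n per group = 64
    # power = 0.95 -> n per group = 105

Under these assumptions, raising power from .80 to .95 increases the required sample size by roughly two thirds, which is the trade-off researchers must weigh when no other rationale for the power parameter is available.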
This master’s thesis identifies a gap between methodological recommendations and actual research practice, with implications for enhancing the robustness and replicability of scientific findings. Overall, it contributes to the broader conversation on research quality and reform by documenting ongoing deficiencies in power analysis practices and by offering recommendations that align with current methodological guidelines. By improving the quality and rigour of study planning and reporting, the field can move toward a more transparent and trustworthy cumulative science.