Public trust in psychological science is essential for researchers and society. For example, researchers depend on public funding, while societal actors (e.g., practitioners, policymakers) rely on research outcomes as a basis for their decisions (see Wingen et al., 2020). People generally trust science (Smith et al., 2021; Wissenschaft im Dialog, 2021). Yet, failed replications can jeopardize this trust.
In the 2010s, systematic and multi-site replications indicated low replicability of psychological studies (also referred to as a “replication crisis”; for a review, see Nosek et al., 2022). Low replicability can damage public trust: When reading that only 39 out of 100 studies could be replicated, participants trusted the psychological science community much less than those reading about an 83% replication rate (Studies 2–3; Wingen et al., 2020; see also Anvari & Lakens, 2018).
At the same time, the replication crisis has fueled debates on good research practices, initiated reforms and policy changes in the field (e.g., open science principles for publishing and funding), and contributed to a changing research culture (Nosek et al., 2022). Do these changes (hereafter, referred to as reforms) increase public trust in psychological science and its community? Here, we address this question.
Trust Repair Mechanisms
From a theoretical point of view, initiated reforms could repair damaged trust. We refer to trust as the “intention to accept vulnerability” based on positive expectations of the intentions and behavior of the scientific community (Rousseau et al., 1998, p. 395). For example, the public funds research and bases decisions on findings because it expects the scientific community to report replicable findings. Research on trust repair has identified several strategies and mechanisms that restore trust, including transparency, sense-making, regulation, and ethical culture (see Bachmann et al., 2015). Interestingly, these strategies are part of several reforms for replicability.
Transparency refers to the clear and accurate disclosure of relevant information about procedures and findings (Schnackenberg & Tomlinson, 2016). It enables monitoring and evaluation of findings and researchers’ trustworthiness. Moreover, transparency can signal that researchers have nothing to hide (Bachmann et al., 2015). Indeed, the open science movement aims at increasing transparency by promoting preregistration, open data, materials, code, and access (e.g., Miguel et al., 2014).
Sense-making refers to establishing an accepted account of why something happened (Tomlinson & Mayer, 2009). It should be noted that explanations, especially if they suggest wrongdoing worse than expected, can also weaken trust. Thus, sense-making is not always sufficient to repair trust, but it is essential to understand what needs to be changed (Bachmann et al., 2015). Accordingly, explanations of low replicability (e.g., questionable research practices) may not repair trust. However, they are highly relevant to evaluate whether reforms can prevent future trust violations.
Regulations such as adjusted policies, rules, codes of conduct, or incentives explicitly show what behavior is acceptable (Dirks et al., 2009; Gillespie & Dietz, 2009). For instance, a company that had bribed governmental officials for decades implemented measures against corruption; actions such as the presence of an ombudsperson or guidelines on gifts helped to restore stakeholders’ trust (Eberl et al., 2015). In psychology, several funders and journals have included open practices in their guidelines and offer new incentives (e.g., badges for open practices, registered reports; Nosek et al., 2022).
Ethical culture refers to social norms that provide informal standards for desirable behaviors (Bachmann et al., 2015). In the last decade, debates on (non-)replicability have influenced norms for research practices. Examples of changed behaviors include increasing sample sizes (e.g., in registered reports; Soderberg et al., 2021), preregistering and sharing data/material (Christensen et al., 2019; Nosek et al., 2022), and conducting replications and large-scale collaborations (e.g., ManyBabies; Byers-Heinlein et al., 2020).
Together, the scientific community has implemented wide-ranging reforms for replicability. These reforms involve actions commonly used to repair trust, namely, increasing transparency, adjusting regulations, and shifting norms.
Good Research Practices and Public Trust
An open question is whether the public perceives reforms in psychology as trust-building. Empirical research on this question is still scarce and has yielded unexpected results. In studies by Wingen et al. (2020) and by Anvari and Lakens (2018), participants learned about the low replicability of psychological studies, its underlying reasons, and transparent research practices (see Table 1). Even though transparency is a commonly used mechanism to repair trust, these participants did not report more trust than those who only learned about the low replicability. These results contrast with conclusions from trust repair research (see above). Moreover, they differ from studies on trust in individual researchers: self-corrections increased trust in the respective person, both for individual researchers (Altenmüller et al., 2021; Ebersole et al., 2016; Fetterman & Sassenberg, 2015) and for politicians, who are generally distrusted (Methner et al., 2020).
Table 1
Overview of Studies on the Influence of Replication Crisis Reforms on Public Trust
| Variable | Anvari and Lakens (2018) | Wingen et al. (2020; Studies 3–5) | The present study |
|---|---|---|---|
| *Sample* | | | |
| Group size: low replicability groupᵃ | 277 | 94–100 | 197 |
| Group size: trust repair groupᵃ | 285 | 91–96 | 193 |
| Setting | Online (Prolific Academic) | Online (MTurk) | Lab (n = 181), field (n = 142), online (n = 67) |
| *Trust repair intervention* | | | |
| Repair mechanismᵇ | Sense-making, transparency | Transparency (Study 3); sense-making (Study 4); sense-making, transparency, relational/reparation (Study 5) | Sense-making, transparency, regulation, ethical culture |
| Content | Text briefly described the low replicability of psychological studies, explained it by poor and un-transparent research practices, and introduced transparency in research practices and registered reports (no selective reporting, publishing non-significant findings). | Text described the low replicability of psychological studies (Studies 3–5) and, depending on the study, aspects of the open science movement (preregistration, open material, open data; Study 3), hidden moderators or QRPs (Study 4), or explanations, the open science movement, and recovered replicability (Study 5). | Video showed the basics of psychological research, explained the term “replication crisis” and some causes (hidden moderators, publication bias, QRPs), and presented changes in research culture: the open science movement (preregistration, registered reports, open data, open material) and the embrace of its propositions in psychological societies, funding policies, job postings, and teaching. |
| Style | Text | Text | Video |
| Intensity | 160 words | 180–307 words | 10:57 min |
| *Outcome* | | | |
| Main variable(s) | Trust in past and future research in psychological science | — | Trust in past and current psychological findings |
| | — | Trust in psychological science community | Trust in psychological science community |
Note. QRPs = Questionable research practices.
ᵃ Terms vary across studies. ᵇ Mechanisms are based on Bachmann et al. (2015).
One key reason for the unexpected findings of previous work (Table 1) may lie in the materials used. These studies relied on brief texts and focused on sense-making and transparency as trust repair mechanisms (omitting adjusted policies and norms). We argue that the materials did not provide sufficient information to convey a comprehensive picture of reforms in psychology. In the current work, we address this limitation.
The Present Study
A pre-registered1 experiment tested whether comprehensive information about reforms increases public trust in psychology, compared to a control condition that only informed about the replication crisis and its causes. We created animated videos to comprehensively inform participants about the replication crisis and to explain how reforms intertwine with the crisis’ causes. A video (compared to a text) may also be more attractive and accessible to a public audience.
To measure public trust, we assessed two variables: trust in the psychological science community (i.e., researchers) and trust in psychological findings. We expect information about reforms to increase trust in researchers (H1) and in current (vs. past) research findings (H2), compared with information about the replication crisis and its causes only.
We report all conducted studies, variables, materials, preregistrations, and conditions either here or in the Supplementary Materials (data, syntax, registration, and material can be openly accessed via the Supplementary Materials). All participants who completed our study were included in the analyses unless they met preregistered exclusion criteria.
Method
Participants and Design
We based our sample size on the smallest effect size of interest (SESOI). We considered that a practically meaningful effect could be smaller than medium-sized (Cohen’s d = 0.50; cf. Wingen et al., 2020, p. 457). As the present study is an initial test of whether one particular video changes participants’ trust, we set the SESOI at Cohen’s d = 0.30, in line with the SESOI in the work of Anvari and Lakens (2018).
An a-priori power analysis indicated that 382 participants would be required to detect an effect of d = 0.30 in a t-test (one-tailed) with α = .05 and a power of 1 − β = .90 (G*Power 3.1; Faul et al., 2009). Considering potential exclusions, we aimed to collect at least 400 participants and achieved a sample size of 419. We used different approaches (laboratory in a city center: n = 199; field: n = 143; online in a one-on-one meeting: n = 77) and incentives to recruit participants (for details, see Supplementary Materials).
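For readers without access to G*Power, the calculation can be approximated in R (the language of our analysis syntax) with the base function power.t.test. This is a minimal sketch of the calculation, not our preregistered script, and small rounding differences from G*Power are possible.

```r
# A-priori power analysis for an independent-samples t-test (one-tailed),
# targeting the SESOI of d = 0.30 with alpha = .05 and power = .90.
pa <- power.t.test(delta = 0.30,            # smallest effect size of interest (Cohen's d)
                   sd = 1,                  # standardized units, so SD = 1
                   sig.level = 0.05,        # alpha
                   power = 0.90,            # 1 - beta
                   type = "two.sample",
                   alternative = "one.sided")
ceiling(pa$n)      # participants per group (~191)
ceiling(pa$n) * 2  # total sample (~382, matching the G*Power result)
```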
Participants were randomly assigned to one of two conditions: replication crisis group or reform group. We included 390 participants in our analyses. Participants were excluded as preregistered: participants in the replication crisis condition who indicated previous knowledge of the replication crisis and wrote down any reform (n = 0)2; participants who indicated that they did not seriously fill out the questionnaire (n = 0); participants who spent less time on the treatment page than the respective video lasted (n = 16); and participants who failed the attention check (more than two incorrect answers; see below, n = 13). Our sample consisted of 235 women, 152 men, and two non-binary persons (one non-response on gender; Mage = 38.7 years, SD = 17.4, range = 18–85). Most participants (88%) had not heard anything about the replication crisis before.
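Applied to the raw data, these preregistered exclusions could look like the following sketch. The column names (e.g., prior_knowledge, named_reform, serious, time_on_page, attention_errors) are hypothetical placeholders; the actual syntax is available in the Supplementary Materials.

```r
library(dplyr)

# Hypothetical sketch of the preregistered exclusion criteria (column names assumed).
analysed <- raw %>%
  filter(!(condition == "replication_crisis" &
             prior_knowledge == "yes" & named_reform),   # knew about reforms (n = 0)
         serious,                                        # serious participation (n = 0)
         time_on_page >= video_length,                   # watched the full video (n = 16)
         attention_errors <= 2)                          # passed the attention check (n = 13)
```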
Procedure and Materials
After giving informed consent, participants watched a video depending on their condition.
Videos
In the replication crisis condition, the video informed participants about the replication crisis (length: 7:27 min; see Supplementary Materials). It presented how psychological research is conducted, what is meant by replication crisis, and which causes explain low replicability (i.e., hidden moderators, publication bias, questionable research practices).
In the reform condition, the video consisted of the same video used in the replication crisis condition and additionally presented how several reforms address the causes of the replication crisis (length: 10:57 min; see Supplementary Materials). It included information on the open science movement (preregistration, registered reports, open data, open material) and the embrace of its propositions in psychological societies, funding policies, job postings, and teaching.
Measures
After watching the videos, participants indicated their previous knowledge of the replication crisis (yes, no; if yes, they wrote down what they knew). Next, they rated their trust in past research findings, their trust in researchers, and their trust in current research findings. Then, participants filled out an attention check. For exploratory purposes, we assessed the evaluation of reforms, reform suggestions, and perceived causes of the replication crisis (for details, see Supplementary Materials).
Trust in Researchers
To assess trust in researchers, we asked participants for their agreement with 11 items on a 7-point scale (1 = totally disagree, 7 = totally agree; Cronbach’s α = .90). Items were adapted from Anvari and Lakens (2018), Benson-Greenwald et al. (2023; Study 4), Nisbet et al. (2015), and Wissenschaft im Dialog (2021); e.g., “I trust researchers in psychology to provide societally relevant knowledge”. For a full list, see preregistration in the Supplementary Materials.
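As an illustration, the reliability and scale score for this measure could be computed as follows. Here, items_trust is a hypothetical data frame holding the 11 item responses, and psych::alpha is one common way to obtain Cronbach’s α; this is a sketch, not our analysis syntax.

```r
library(psych)

# items_trust: data frame with the 11 trust-in-researchers items (hypothetical name),
# each rated from 1 (totally disagree) to 7 (totally agree).
rel <- psych::alpha(items_trust)
rel$total$raw_alpha                                   # reported reliability: alpha = .90
analysed$trust_researchers <- rowMeans(items_trust)   # scale score per participant
```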
Trust in Psychological Findings
We measured trust in current and past findings with one item each using a 7-point scale (1 = very low, 7 = very high; adapted from Anvari & Lakens, 2018): “How much trust do you put in psychological findings from the time before the replication crisis?” and “How much trust do you put in current psychological findings?”.
Attention Check
To check participants’ attention to and understanding of the video, we asked them to evaluate statements (1 = true, 2 = false, 3 = was not addressed in the video). We used five statements, for example, “Hidden moderators are one explanation for the low replicability of psychological studies” (‘true’ is correct for both groups; adapted from Wingen et al., 2020, Study 4). For a complete list, see Supplementary Materials.
Finally, participants provided demographic information: age, gender, education, profession, German language proficiency, and political orientation (ranging from 1 = left to 7 = right). Participants were thanked and debriefed.
Results
Preregistered Analysis Plan
We tested whether any of the potential control variables differed between the experimental groups using ANOVAs (age, political orientation) and Chi-squared tests for independence (gender, educational level, previous knowledge of the replication crisis3). We found no significant differences between the two groups, Fs < 0.43, ps > .51, and χ²s < 8.19, ps > .14.
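In R, these randomization checks amount to a handful of standard calls. The sketch below assumes the hypothetical analysed data frame and column names introduced above.

```r
# Randomization checks: continuous covariates via one-way ANOVAs...
summary(aov(age ~ condition, data = analysed))
summary(aov(political_orientation ~ condition, data = analysed))

# ...and categorical covariates via Chi-squared tests of independence.
chisq.test(table(analysed$gender, analysed$condition))
chisq.test(table(analysed$education, analysed$condition))
chisq.test(table(analysed$prior_knowledge, analysed$condition))
```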
For hypothesis testing, we used a Welch’s t-test for trust in researchers (H1; one-tailed; α = .05; for a recommendation, see Delacre et al., 2017) and a mixed ANOVA with the experimental group (replication crisis; reform) as the between-subjects factor and trust in psychological findings (past; current) as the within-subjects factor (H2; α = .05).
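A minimal sketch of both preregistered tests, assuming the data structures named above; for H2, trust_long is a hypothetical long-format data frame with one row per participant and time point, and afex::aov_ez is one convenient interface for the mixed ANOVA.

```r
library(afex)

# H1: Welch's t-test (one-tailed); t.test() uses unequal variances by default.
# With alphabetical factor levels, "reform" precedes "replication_crisis",
# so alternative = "greater" tests reform > replication crisis.
t.test(trust_researchers ~ condition, data = analysed,
       alternative = "greater")

# H2: 2 (condition: between) x 2 (findings: past vs. current, within) mixed ANOVA.
aov_ez(id = "id", dv = "trust_findings", data = trust_long,
       between = "condition", within = "time")
```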
To test the absence of the SESOI, we used the two one-sided tests (TOST) procedure (Lakens et al., 2018) using the TOSTER package in R (for Welch’s t-test; Lakens, 2017). Based on the SESOI (d = 0.30), we used an equivalence range of d = −0.3 and d = 0.3 (α = .05). Concerning H2, we only compared differences in trust in current research findings between the two groups since the TOSTER package does not yet include equivalence tests for mixed ANOVAs (Campbell & Lakens, 2021).
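The equivalence test can be reproduced from the rounded summary statistics in Table 2, so the output will deviate slightly from the exact values reported below. TOSTtwo is the Welch-based function described by Lakens (2017); newer TOSTER versions offer tsum_TOST as a successor.

```r
library(TOSTER)

# TOST equivalence test for trust in researchers against the SESOI bounds d = +/-0.30,
# using rounded group summaries from Table 2 (reform: M = 5.2, SD = 0.9, n = 193;
# replication crisis: M = 5.0, SD = 0.9, n = 197).
TOSTtwo(m1 = 5.2, sd1 = 0.9, n1 = 193,
        m2 = 5.0, sd2 = 0.9, n2 = 197,
        low_eqbound_d = -0.30, high_eqbound_d = 0.30,
        alpha = 0.05, var.equal = FALSE)
```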
Preregistered Analyses
Table 2 shows means, standard deviations, and correlations among the three trust measures (see also Figure 1).
Table 2
Descriptive Statistics and Correlations for Dependent Variables
| Variable | Replication crisis (n = 197), M (SD) | Reform (n = 193), M (SD) | Total (N = 390), M (SD) | Correlation with 1 | Correlation with 2 |
|---|---|---|---|---|---|
| 1. Trust in researchers | 5.0 (0.9) | 5.2 (0.9) | 5.1 (0.9) | — | |
| 2. Trust in past findings | 4.4 (1.3) | 3.7 (1.4) | 4.1 (1.1) | .18* [.08, .27] | — |
| 3. Trust in current findings | 5.0 (1.0) | 5.3 (1.2) | 5.1 (1.1) | .69* [.63, .74] | .20* [.11, .30] |
Note. For correlations, we included the 95% confidence interval in brackets.
*p < .001.
Figure 1
Mean Values for Trust Measures by Condition
Note. Error bars represent standard errors of means.
Trust in Researchers
Supporting H1, the reform group reported more trust in researchers than the replication crisis group, t(387.84) = 2.55, p = .006, Cohen’s d = 0.26, 95% CI [0.06, 0.46].
The equivalence test was not significant, t(387.84) = 0.78, SE = 0.09, p = .219, Hedges’ g = 0.26, 90% CI [0.09, 0.43].
Trust in Psychological Findings
Supporting H2, the reform group reported more trust in current (vs. past) research findings, compared with the replication crisis group, interaction effect, F(1, 388) = 35.18, p < .001, ηp² = .08.
Additionally, we found a main effect of trust in current (vs. past) research, F(1, 388) = 193.23, p < .001, ηp² = .33, indicating that participants reported more trust in current than in past findings, and a main effect of condition, F(1, 388) = 5.48, p = .020, ηp² = .01, indicating that the reform group reported less trust in research findings overall than the replication crisis group.
The equivalence test was not significant, t(382.43) = 0.72, SE = 0.11, p = .236, Hedges’ g = 0.20, 90% CI [0.03, 0.37].
Exploratory Analysis: Trust in Psychological Findings
As a post-hoc test, we ran a two-sided Welch’s t-test for trust in current findings. We expected results similar to those for trust in researchers. Indeed, the reform group reported more trust in current findings than the replication crisis group, t(382.85) = 2.06, p = .040, Cohen’s d = 0.21, 95% CI [0.01, 0.41].
Discussion
Low replicability can violate public trust in psychological science. In response to the replication crisis, reforms have addressed transparency, norms of research practices (e.g., usage of power analysis, replication efforts), and formal regulations (e.g., journal guidelines, funders’ policies). These actions can be considered examples of trust repair strategies (see Bachmann et al., 2015, for a general framework). Yet, initial findings cast doubt on the idea that reforms of research practices can rebuild public trust (Anvari & Lakens, 2018; Wingen et al., 2020). Are reforms within the psychological science community indeed not trust-building in the eyes of the public? We tackled this question by using animated videos to explain comprehensively how reforms intertwine with the causes of the replication crisis (see Table 1).
In line with conclusions based on trust repair research (Bachmann et al., 2015), our results suggest that watching a video on reforms can increase trust in researchers and trust in current (vs. past) findings, compared to watching a video that only informs about the replication crisis and its causes. More importantly, our study shows, for the first time, that the public positively acknowledges reforms in the psychological community—at least in the short term and under certain circumstances. Notably, our study has limitations regarding internal validity and generalizability.
Limitations and Future Research
First, our two videos differed not only in their content (information about reforms: yes vs. no) but also in length, tone, and additional images. Thus, with the present data, we cannot test which characteristic of the reform video drove the observed effects. Nevertheless, one finding indicates that the information about reforms was crucial: Participants in the reform group indicated more trust in current findings than participants in the replication crisis group but also less trust in past findings. This suggests that participants who learned about reforms adopted a more differentiated perspective on psychological research, perhaps by becoming more cautious about past and more optimistic about current findings. If, in contrast, the reform video had merely evoked positive and affirming affect, participants should have reported more trust in both current and past findings. Yet, these interpretations are speculative and need further investigation.
A second limitation is the generalizability of our two videos, which we created in a particular style. Future studies should test whether the observed effects generalize to other video formats (e.g., interviews between journalists and researchers) and communication channels (e.g., articles, podcasts with multiple episodes). Still, the current videos have ecological validity as they are in the style of explainer and science videos often found on popular video platforms.
A third limitation concerns the durability of the observed effects. The present study investigated an immediate response after the video. Further research is needed to understand long-term effects and other consequences (e.g., attitude towards science, support of science-based guidelines; see Sulik et al., 2021, for the relationship between trust and support for pandemic measures as an example).
Fourth, our sample is not representative of the German population because we used opportunity sampling. By design, our participants were willing to take part in a scientific study, which indicates a science-interested attitude and may limit generalizability. However, most Germans trust science (Wissenschaft im Dialog, 2021). Moreover, science-interested people likely engage with topics such as the replication crisis in their everyday life. Thus, our promising results are relevant for a key target group, namely, science-interested people. Consequently, the present study is valuable for understanding how to communicate the replication crisis.
Concluding Implications for Communicating the Replication Crisis to the Public
The present study broadens our understanding of public trust in the context of the replication crisis. Previous interventions in the style of very brief news reports (Anvari & Lakens, 2018; Wingen et al., 2020) suggest that communicating replication challenges can damage public trust in science, despite information on adjusted research practices. Our video-based intervention with in-depth explanations indicates a trust-building effect (at least, short-term). This is a promising signal for transparent communication of pitfalls and self-corrections, as has been observed in other contexts (e.g., for politicians; Methner et al., 2020). Nevertheless, taking the present and previous work together, communicating the replication crisis to the public is challenging. Communicators should address recipients in an engaging and comprehensible way when revealing the complex interplay between the replication crisis and the changing research culture in psychology.