Introduction

Recent commentaries argue that researchers bear an obligation to report individual genetic research findings to study participants.1, 2, 3, 4, 5, 6 Supporters of this obligation believe that disclosure honors the principle of respect for persons3, 4, 5 and the reciprocal nature of a research relationship.7, 8 Invoking the principle of beneficence, many argue further that it is in participants’ best interest to learn this information;4 the common suggestion that results should meet some test of clinical significance (varying from clinical through personal utility)4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 to warrant reporting supports this point. Proponents find confirmation for their position in empirical research with research participants, many of whom believe that such information is owed3, 9, 17, 18, 19, 20, 21, 22, 23, 24 and/or will have meaning in their lives.19, 23, 25, 26, 27, 28

Others contend that although the principles of respect for persons, reciprocity, and beneficence should be upheld in the context of research, they may neither be well served if results are disclosed nor denied if they are not disclosed.7, 29, 30, 31, 32, 33 Before unpacking the debates about whether or not participants can understand this information and make meaningful decisions about its receipt,29, 30, 31, 32 careful consideration is owed to the conditions under which the obligation to report arises in the first place.

Informed by qualitative research with investigators engaged in autism genetics research,33 this study assumed that decisions about exactly when genetic research results achieve some standard of clinical significance and warrant reporting involve a series of complex judgments that are neither limited to narrow interpretations of clinical utility and validity nor, to date, well understood.7, 33 To better understand a range of factors that might influence how researchers establish clinical significance and reportability, we conducted an international survey of researchers engaged in studies exploring the genetic basis of two distinct conditions: cystic fibrosis (CF), where current research seeks to understand modifiers of the CFTR gene, long known to cause CF,34 and autism spectrum disorders (ASD), where, in contrast, current research seeks to identify genetic variants that may contribute to the onset of this heterogeneous set of conditions.35, 36 While this research does not reflect the full spectrum of current genetic research, this approach enabled us to explore how different communities of researchers orient to research results generated in different disease contexts.

Methods

Sample, recruitment and data collection

To identify a population of investigators sufficiently engaged in CF and ASD genetic research so that they could provide an informed perspective on reporting research findings to participants, we generated a list of authors of scientific publications using keyword searches in Ovid Medline and EMBASE (2005–2008) (eg, gene*, or genom* and mutation, mutat*, cystic fibrosis, and autis*, or autism) and built on author lists of known research consortia.35, 36 We included only English language papers reporting or reviewing original scientific research in humans, relevant to CF or ASD genetics. To confirm relevance of the publications, we reviewed titles and abstracts and, where necessary, full papers. From the final set of articles, we generated a complete list of all of the authors listed on these articles (1223 ASD and 964 CF authors). We further limited this set of authors to those for whom we could identify publicly available email and postal addresses, generating a final sample of 877 eligible participants (418 ASD, 459 CF).

Following Dillman's tailored design method for mixed-mode surveys,37 we contacted potential participants five times over an 8-week period in spring 2009. Most contacts were by email with a link to the online questionnaire; the fourth contact was by mail and included a paper copy. We offered participants an executive summary of findings as a non-financial incentive.

Survey instrument

The survey instrument was developed by the study team drawing on previous research and a review of the literature,7, 22, 33, 38, 39 and was pre-tested with 10 eligible respondents (who were excluded from the final sample). It included a non-experimental component consisting of 22 items: (i) three demographic questions about the respondent's primary role in research, professional training and gender, (ii) seven practice questions about barriers to care in the respondent's jurisdiction, and the role of the respondent's research team in providing information to participants, and (iii) 12 attitude questions about the respondent's perceived responsibilities toward research participants, beliefs about the potential for harm from provisional scientific information, and beliefs about the role of genes in ASD or CF.

The instrument also included a quasi-experimental component, using vignettes with a factorial survey design. This enabled us to maximize external validity by presenting respondents with true-to-life vignettes, and exploit the principle of random assignment to understand the independent effect of contextual and demographic characteristics, often difficult to achieve with non-experimental designs because of collinearity among these factors.40, 41, 42 Each vignette contained some combination of the attributes that appeared relevant to the adjudication of research results.7, 29, 30, 31, 32, 33 Each vignette told the story of a genetic research team that was considering how to manage a research finding and presented some combination of the two to six versions (or levels) of each of the 10 attributes. These attributes included features of the science (eg, replication status, robustness of the finding), the research team (eg, capacity of team to explain results) and the research environment (eg, availability of clinical services, research ethics guidance). Of the 10 attributes, seven had two levels, one had three levels, and one had six levels for a total of 2304 possible unique vignettes (2^7 × 3 × 6 = 2304). Using the fractional factorial technique to reduce the total possible number of vignettes to a number suited to our anticipated sample size and interest in main effects, we developed an orthogonal and balanced matrix, with 48 vignettes specific to CF and 48 specific to ASD.43 (Supplementary Tables 1 and 2 present a sample vignette and attribute list).
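The vignette count can be checked with a short enumeration. The following is a minimal sketch in Python, assuming only the level counts stated above (seven two-level, one three-level and one six-level attribute); the names `LEVELS` and `full_factorial` are illustrative and not part of the study materials. It enumerates the full factorial only and does not reproduce the orthogonal, balanced fraction of 48 vignettes per disease context.

```python
from itertools import product
from math import prod

# Levels per attribute as stated in the text:
# seven 2-level attributes, one 3-level attribute, one 6-level attribute.
LEVELS = [2] * 7 + [3, 6]

def full_factorial(levels):
    """Return every combination of 0-based level indices across the attributes."""
    return list(product(*(range(n) for n in levels)))

vignettes = full_factorial(LEVELS)   # one tuple per possible unique vignette
total = prod(LEVELS)                 # 2**7 * 3 * 6

print(len(vignettes), total)
```

Each tuple holds one level index per attribute, so a fractional design amounts to choosing a subset of these tuples in which levels remain balanced and uncorrelated across attributes.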

Each respondent received two vignettes: one that represented the disease context (CF or ASD) with which they were more familiar and one that represented the other disease context. A series of random numbers was generated and assigned to each vignette to establish the order and pairings of vignettes for each respondent. Each of the unique vignettes was assigned to a respondent once before any one vignette was assigned again.
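One way to picture this assignment rule is as a shuffled deck per disease context that is reshuffled only once it has been exhausted. The sketch below is an illustration under stated assumptions rather than the procedure actually used: the function names, the fixed seed, and the choice to list the respondent's own context first are all hypothetical.

```python
import random

def vignette_deck(n_vignettes, rng):
    """Yield vignette ids in random order, reshuffling only after all ids are used."""
    while True:
        deck = list(range(1, n_vignettes + 1))
        rng.shuffle(deck)
        yield from deck

def assign(respondent_contexts, n_vignettes=48, seed=0):
    """Give each respondent one vignette from their own context and one from the other.

    respondent_contexts: list of "CF" or "ASD", one entry per respondent.
    Returns a list of ((own_context, id), (other_context, id)) pairs.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    decks = {
        "CF": vignette_deck(n_vignettes, rng),
        "ASD": vignette_deck(n_vignettes, rng),
    }
    assignments = []
    for own in respondent_contexts:
        other = "ASD" if own == "CF" else "CF"
        assignments.append(((own, next(decks[own])), (other, next(decks[other]))))
    return assignments

pairs = assign(["CF", "ASD", "CF"])
```

Because each deck is consumed in full before it is reshuffled, no vignette within a disease context is reused until every vignette in that context has been assigned once.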

Following the presentation of each vignette, respondents were asked to consider four judgments. Two of these judgments are the focus of this paper: (1) whether ‘the research team can be confident that this research finding is clinically significant,’ and (2) whether ‘the research team should ensure that information about this genetic variation (as specified in the vignette) is communicated to participants in whom it is identifiable, or their guardians.’ Data from responses to other questions are reported elsewhere.44, 45 Both the attitude questions in the non-experimental component of the questionnaire and the professional judgments elicited in response to each vignette were measured using 5-point Likert scales from strongly agree to strongly disagree.

Analysis

Likert scales were dichotomized into agree (agree or strongly agree) versus not (neutral, disagree, strongly disagree) and descriptive statistics were computed for all independent variables. We explored the factors influencing judgments (1) and (2) by first calculating unadjusted odds ratios (ORs) and 90% or 95% confidence intervals (CIs), as appropriate, for selected independent variables (ie, not the attributes embedded in the vignettes). Next, two separate multivariate models were estimated to assess the effect of experimental variables (ie, attributes embedded in the vignettes) and selected independent variables (Tables 3 and 4) on judgments (1) and (2). For the vignette attributes that reflected the scientific robustness of CF (factor C) and ASD (factor D) findings, respectively, specific comparisons were constructed for entry into the model. For both CF and ASD, we constructed a test of scientific robustness (CD1–CD2) and a test of intended versus incidental findings (CD1–CD6). For CF, we constructed a test of phenotypic severity (C1–C3) and for ASD, we constructed a test of methodological approach (D1–D3) (Supplementary Table 1 defines factors C and D). The final main effects models are reported with unadjusted and adjusted ORs and 90% or 95% CIs, as appropriate.
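The dichotomization and an unadjusted OR can be illustrated as follows. This is a minimal sketch, not the study's analysis code: it assumes a simple 2 × 2 table and a Wald (log-scale) interval, whereas the paper does not state how its unadjusted intervals were computed. The function names and the example counts are hypothetical.

```python
import math

AGREE = {"agree", "strongly agree"}

def dichotomize(response):
    """Map a 5-point Likert response to 1 (agree/strongly agree) or 0 (all else)."""
    return 1 if response.strip().lower() in AGREE else 0

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR with a Wald CI from a 2x2 table.

    a: group 1, agree      b: group 1, not agree
    c: group 2, agree      d: group 2, not agree
    z: 1.96 for a 95% CI, 1.645 for a 90% CI.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
print(odds_ratio_ci(20, 10, 10, 20))
print(odds_ratio_ci(20, 10, 10, 20, z=1.645))
```

Passing `z=1.645` narrows the interval to the 90% level used for some of the reported estimates.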

Table 3 Factors that influence researchers’ confidence in clinical significance
Table 4 Factors that influence researchers’ inclination to report results

Responses to vignettes were treated as separate observations; thus, the unit of analysis was the judgment provided (N-judgment) and not the individual participant (N-participant). To account for the unknown correlation between the two vignette responses from each participant, we used generalized estimating equations (GEE) with an unstructured correlation structure to fit the parameters of two generalized linear models (binomial, with logit link), and specified the regression models with (Huber–White) sandwich variance estimators for clustered data.46 We used the forward selection method to develop the models, removing variables where there was evidence of multicollinearity. As different selection procedures can lead to different final models, we re-introduced all variables of interest into the final model to test their significance at P≤0.1 (using the Wald statistic). Goodness of fit statistics are limited in the context of GEE models and are not reported in this paper.47 Due to missing data, seven participants included in the descriptive analyses were removed from univariate and multivariate analyses. All statistical analyses were completed using R (version 2.10.1, 2009) and geepack (version 1.0–17, 2010) software packages.48, 49

Finally, we generated an index to assess the degree to which each potential participant was involved in relevant ASD or CF genetics research, and to assess whether participants and non-participants differed in this regard. We scored each eligible genetics research publication, differentiating between highly relevant (eg, discovery research, research with human subjects or populations) and less relevant publications (eg, incidence/prevalence studies, case studies). Each participant was then assigned a final score depending on the number of publications multiplied by its relevance score (1 or 2). The final index ranged from 1 to 41.
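Read literally, the index is a relevance-weighted publication count. The sketch below assumes that highly relevant publications score 2, less relevant publications score 1, and that a participant's index is the sum of these scores over their eligible publications; this reading, and the function name, are assumptions rather than details confirmed in the text.

```python
def involvement_index(relevance_scores):
    """Sum relevance scores (1 = less relevant, 2 = highly relevant) over publications."""
    for score in relevance_scores:
        if score not in (1, 2):
            raise ValueError(f"relevance score must be 1 or 2, got {score!r}")
    return sum(relevance_scores)

# Hypothetical author with two highly relevant papers and one less relevant paper.
print(involvement_index([2, 2, 1]))
```

Under this reading, an author with a single less relevant publication receives the minimum index of 1.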

Results

Response rates and characteristics of respondents

Of 785 eligible surveys, the final response rate was 44% (Supplementary Table 3), with the majority of questionnaires (81%) completed online. Participants were almost two times more involved in more relevant research than non-participants (OR=1.8, CI (1.2, 2.7)). The sample was evenly split among ASD and CF researchers (49%, 51%, respectively). Other participant characteristics are described in Table 1.

Table 1 Summary of participant characteristics

General beliefs about reporting research results and provisional knowledge

A total of 80% of researchers agreed that, in general, individuals in whom a genetic variation is identified should be informed of this finding when it is judged to be clinically significant. In contrast, only 23% of CF researchers and 15% of ASD researchers agreed that individuals in whom a genetic variation is identified should be informed of this finding when its clinical significance is uncertain. The majority of CF and ASD researchers (66 and 64%, respectively) also agreed that provisional scientific information is potentially harmful for research participants (Table 2).

Table 2 Summary of participants' general beliefs

Multivariate model 1: what influences researchers’ confidence in the clinical significance of genetic research results?

Statistically significant vignette attributes

First, characteristics of the science specified in each vignette influenced researchers’ confidence that a result was clinically significant: (a) less well-replicated findings (ie, those that were replicated by the same group or those that had not been replicated at all compared with those replicated by an independent research group) were 45% less likely to engender confidence (OR=0.55, 95% CI (0.3, 0.9)), (b) less robust findings (ie, variants found in non-coding regions that conferred only 1.5-fold increased risk for CF or ASD) compared with more robust findings (variants found in known coding regions that conferred fivefold increased risk for CF or ASD) were half as likely to engender confidence (OR=0.5, 95% CI (0.3, 0.8)), and (c) incidental findings (ie, genetic risk for multiple sclerosis in the context of CF or ASD research) engendered 65% less confidence than an intended finding (OR=0.35, 95% CI (0.2, 0.7)). In addition, (d) ASD findings engendered 35% less confidence than CF findings (OR=0.65, 95% CI (0.5, 0.9)) (Table 3). Other scientific (eg, phenotypic severity, methodological approach) and non-scientific (eg, nature of researcher-participant relationship, availability of clinical services for index condition) attributes were not statistically significant influences (data not shown).
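The percentages quoted alongside each OR follow from the arithmetic (OR − 1) × 100, which the sketch below makes explicit. Strictly, this describes a change in the odds of agreement; the implied change in probability depends on the baseline rate, which is shown here with a hypothetical baseline of 0.5 that is not taken from the study. The function names are illustrative.

```python
def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage change in odds (negative = lower odds)."""
    return (odds_ratio - 1.0) * 100.0

def probability_under_or(baseline_probability, odds_ratio):
    """Probability implied by applying an odds ratio to a baseline probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# ORs reported above for replication, incidental findings and disease context.
for reported_or in (0.55, 0.35, 0.65):
    print(reported_or, round(percent_change_in_odds(reported_or)))

# Hypothetical baseline probability of 0.5, for illustration only.
print(probability_under_or(0.5, 0.55))
```

When the baseline probability is low, the change in odds and the change in probability are close; at higher baselines they diverge, which is worth bearing in mind when reading "less likely" in the results.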

Statistically significant independent variables

General beliefs and characteristics of the researchers themselves were also significantly associated with judgments of clinical significance. ASD researchers were 40% less likely than CF researchers to be confident in a given finding (OR=0.6, 95% CI (0.4, 0.96)), and those with a clinical interpretive role were two times more likely than those without this role to be confident in the clinical significance of a finding (OR=2.2, 95% CI (1.4, 3.4)). Researchers who endorsed the general belief that clinically significant findings should be reported back to individuals were 2.8 times more likely than those who did not endorse this general belief to feel confident that the hypothetical finding was clinically significant (OR=2.8, 95% CI (1.4, 5.8)) (Table 3).

Multivariate model 2: what influences researchers’ inclination to report genetic research results to participants?

Statistically significant vignette attributes

First, characteristics of the science specified in each vignette influenced researchers’ inclination to report results to research participants: (a) researchers were half as likely to support reporting less-robust than more robust findings (OR=0.5, CI (0.3, 0.8)), (b) 40% less likely to support reporting incidental rather than intended findings (OR=0.6, 95% CI (0.3, 0.96)), and (c) half as likely to support reporting ASD findings than CF findings (OR=0.5, 95% CI (0.4, 0.7)) (Table 4). Characteristics of the research environment also influenced researchers’ inclination to report research results. Specifically, they were 40% less likely to support reporting where research teams lacked the capacity to explain research results and provide medical advice to participants than they were when teams were described as possessing such capacity (OR=0.6, 95% CI (0.4, 0.8)). With respect to research ethics governance, researchers were 40% less likely to agree that a result should be reported when research teams were said to use consent forms that were not specific about how to manage result reporting than when consent forms required that clinically significant findings be reported (OR=0.6, 95% CI (0.4, 0.8)) (Table 4). As above, other vignette attributes were not statistically significant influences (data not shown).

Statistically significant independent variables

General researcher characteristics and beliefs were also significantly associated with the disposition to report the findings identified in each vignette. ASD researchers were 40% less likely than CF researchers to support result reporting (OR=0.6, 95% CI (0.4, 0.9)), those with a clinical interpretive role in research were 1.5 times more likely than those without this role to support result reporting (OR=1.5, 90% CI (1.03, 2.1)), and those with a statistical role were 30% less likely than those without to support reporting the result (OR=0.7, 90% CI (0.5, 0.99)) (Table 4). Researchers who endorsed the general belief that clinically significant findings should be reported to individuals were five times more likely than those who did not endorse this belief to support reporting the vignette finding (OR=4.9, 95% CI (2.7, 8.9)) and those who endorsed the view that provisional scientific information is potentially harmful for research participants were 60% less likely than those who did not share this view to support reporting results (OR=0.4, 95% CI (0.3, 0.7)) (Table 4).

Discussion

Our findings endorse the view that clinical significance is a key criterion on which to base a decision about reporting genetic research results. This view has been advanced by many in the research ethics community, and informs much of the guidance from Institutional and Ethics Review Boards. This alignment of beliefs between ethicists and researchers may prove helpful where research findings come neatly labeled as clinically significant or insignificant – as happens on occasion. But it is typically the role of research to ascertain significance, whether characterized narrowly as clinical utility or more broadly to include personal utility. For any characterization along this spectrum, a standard of clinical validity is required and, as suggested herein, even this benchmark is contested. In reality, then, researchers must adjudicate both clinical significance and reporting obligations simultaneously in contexts that, by definition, involve scientific uncertainty.

Beyond confirming the importance of clinical significance as a decision-making criterion in principle, our findings provide insight into the factors that influence how researchers approach this putative reporting obligation in practice. Clinical significance and reportability indeed depend on expected parameters related to scientific rigor. Specifically, less sound results were less likely to be judged clinically significant or reported, and unsurprisingly, given the lack of clarity about the genetic underpinnings of ASD, less confidence in the significance of research findings was asserted in this disease context than in the context of CF. Although this may also be a function of limited clinical utility of current genetic research results in the context of ASD compared with CF (untested in this study), we show that these judgments also depend on non-scientific factors specific to the researchers and the contexts in which their teams operate.

A key feature of the research context that influences decisions to report is the team's capacity to report results properly. It has been recommended that reporting genetic findings in a research context be done by a health professional with training in human genetics and counseling and accompanied by accurate written information.48, 50, 51 Encouragingly, our respondents were less inclined to agree that results should be reported in contexts where research teams lacked this capacity. Unless necessary clinical supports are in place to explain research results clearly, the results are better left in the laboratory.

A second feature of research context that influences judgments about result reporting is research ethics guidance. We found that teams were less likely to report findings where consent forms were nonspecific than when they indicated that clinically significant results were to be reported. This would seem to be a sensible association, given the importance of adhering to the terms of the research relationship as specified through the consent process. However, this association also suggests that greater reticence to report is the default position. As Institutional and Ethics Review Boards increasingly require that research teams offer the disclosure of clinically significant results to research participants,5, 6 they may be challenging researchers’ potentially appropriate reticence.

Researchers’ judgments were also influenced by their professional roles and beliefs. Informed by qualitative work, we suspected that their interpretation of emerging findings might in fact be tied to the evidential assumptions used to generate them.33 Interpretation, in turn, would influence what is owed. Expert respondents, bringing different disciplinary training and roles to bear, indeed viewed the significance and reportability of research results differently. Respondents with a clinical role assigned greater significance to, and were more inclined to report, a given finding than those with a non-clinical role. In addition, ASD researchers were more conservative in their judgments of clinical significance and reportability than CF researchers, irrespective of the disease context in which the hypothetical finding was considered. Recent qualitative work argues that judgments about the meaning of research results rely on more than simple tests of analytic or clinical validity.7, 33 Miller et al33 found that respondents’ judgments of significance relied on contests about appropriate evidentiary standards and on fundamental assumptions about the etiologic role of genes in the disease in question. Our data endorse these qualitative findings and suggest also that different research communities may form distinctive ‘cultures’ with respect to interpreting and reporting results, highlighting the challenge of achieving uniformity in research ethics guidance and practice.

Finally, we found that researchers’ belief that clinically significant results should be reported to individual participants influenced their assessment of the significance of the result itself. In short, the ethical imperative to disclose appears to be bolstering researchers’ judgments of scientific relevance, perhaps reflecting the traction that this ethical imperative has gained in the scientific community. Yet although it is appropriate for research ethics to govern the conduct of human subjects research in pursuit of scientific knowledge, it seems inappropriate for it to alter the calculus of scientific judgment. This influence is not only surprising; it also risks over-attributing meaning to research results, in turn triggering potentially unwarranted disclosures.

Two key messages emerge from our findings. The first is that researchers endorse clinical significance as a key criterion for deciding upon an obligation to report a research result to the individual in whom it has been identified. The second is that although researchers endorse this criterion in principle, judgments about what, in practice, actually (a) constitutes clinical significance and (b) warrants reporting to participants are complex and context-specific. This has been demonstrated qualitatively33 and now, quantitatively. These findings call into question the apparent simplicity of a clinical significance criterion and imply that the criterion itself, and judgments about when it is met, may be too variable and idiosyncratic to guide a universally actionable ethical imperative.

In conclusion, it is perhaps unsurprising that a complex web of contextual factors is brought to bear when researchers are asked to adjudicate their own provisional data. It may be instructive to recall that what fueled the evolution of the health technology assessment discipline, in part, was the need for impartial evidence evaluation.52 In the realm of clinical research more generally, it is a comprehensive process of health technology assessment, not the researchers themselves, that typically decides the real-time readiness for uptake of new knowledge. Although it is beyond the scope of our findings to advocate for an analogous process for adjudicating the clinical readiness of genetic research results, the potential applicability of this model of evidence evaluation may warrant consideration in the context of this debate.

Limitations

Our findings can be viewed as only exploratory in nature because of the complexity of the design. Specifically, given the number of attributes and embedded levels that we sought to test, our lower-than-expected response rate limited our power to detect all potentially relevant effects and limited our ability to interpret two-way interacting effects. In addition, the scientific and clinical vignette variables that were assessed for their influence on judgments of clinical significance were more relevant to judgments of ‘clinical validity’ than ‘clinical utility.’ This reflects the reality of much genetics research, but limits the generalizability of our findings to other and perhaps more complicated constructions of clinical significance.