To Believe in Science Is to Believe in Data Sharing

John Mandrola, MD


February 02, 2016

The big news thus far in 2016 comes not from a study but from a possible change in how clinical science gets done. This may be a before-and-after moment in medicine.

The International Committee of Medical Journal Editors (ICMJE) has drafted a proposal[1] requiring that authors of clinical trials share individual patient-level data as a condition of publication. Make no mistake, as gatekeepers of medical evidence and arbiters of publication in top-tier journals, this group wields enormous power.

If enacted, these data-sharing rules would disrupt the status quo. Consider that when you read a journal article now, you see the tables and figures, but patient-level data remain inaccessible. Trialists and sponsors of the study control the raw data so they decide when, or if, a data set is reanalyzed. They choose the questions and the timing of subsequent studies.

This new proposal gives trialists a 6-month head start, after which other scientists can gain access to some of the data. (More later on how much data would be shared.)

Do the benefits of sharing data outweigh its risks? How you answer that question depends on the position of your lens.

Data Sharing: The Practitioner's Perspective

I see four major benefits:

First, data sharing will ease the problem of replication (or lack thereof). The foundation of science rests on replication, but there is little incentive for researchers to do it. Why would they? Journal editors publish original research.

But replication is critical. Here I point you to the latest effort of the Restoring Invisible and Abandoned Trials (RIAT) initiative. Using access to raw data, this group of researchers reanalyzed[2] SmithKline Beecham's Study 329[3] and found that, contrary to the original trial conclusion, paroxetine and imipramine were no better than placebo for the treatment of depression in adolescents. Tell me that reanalysis would not have been useful 14 years ago when Study 329 was first published.

Second, increased transparency of data would bolster confidence in clinical science. We need more of that. "Many published research findings are false and exaggerated, and an estimated 85% of research resources are wasted," writes Stanford (CA) professor John Ioannidis in PLOS Medicine.[4]

Third, data sharing would enhance collaboration. In a Tweet, Dr Harlan Krumholz (Yale University, New Haven, CT) reminded us to "think about the Human Genome Project with regard to open science. Heroes like the NIH director [Dr Francis Collins] contributed so others could do better research."

Fourth, data sharing would increase the knowledge generated from clinical trials. Last week, for example, a group of researchers extracted data from 30 cohort studies with more than four million subjects and found[5] convincing evidence that atrial fibrillation conferred more cardiovascular risk on women than on men. Don't we want more of that?

Data Sharing: The Research Scientist's Perspective

To TCTMD,[6] Dr Gregg Stone (Columbia University, New York), a leading cardiology investigator, said that "everybody is for transparency.…These are important data sets that affect patient care," adding that sponsors and investigators have typically spent years on these studies and "know the data sets intimately, and there are usually a lot of sophisticated considerations and nuances that go into analyzing the data. To just put a data set out there without having that background…is fraught with problems and may lead to inaccurate analyses and interpretations."

In the same article, trialist Dr Chris Cannon (Harvard Clinical Research Institute, Boston, MA) noted the merits of data sharing but then focused on the rights of a sponsor "that has paid often hundreds of millions of dollars to do large studies." He argues that to hand over data to anyone who asks would be to "hand over all of that investment."

By email, Dr Prashanthan Sanders (University of Adelaide, Australia) wrote that he strongly agreed with the notion that raw data should be made available for verification and for collaboration with the investigators undertaking the study. But he strongly disagreed that data should be made freely available within 6 months. Sanders proposed 5 years as a more acceptable time limit. His concerns were pragmatic: First, without incentives, would researchers do the work it takes to generate the data? And second, would this rule lead researchers to delay major publications until numerous papers were ready to go?

Does The Proposal Go Far Enough?

Dr Vinay Prasad, a cancer researcher, assistant professor at the Oregon Health & Science University (Portland), and coauthor of Ending Medical Reversals: Improving Outcomes, Saving Lives, strongly supports data sharing but is concerned the proposed rules require too little sharing. In his comment on the ICMJE website, which he shared on Twitter, Prasad called attention to wording in the proposal that would limit the amount of data to be released.

He wrote that "the use of the phrase 'underlying the results presented in the article' is a major error. This phrase will be used to limit data sharing to the explicit information used to generate Kaplan-Meier plots and basic demographic data; the full set of covariates and corollaries need not be shared." This is a problem, Prasad said, "because the latter information is vital to reanalysis.…Would patients consent to a trial where only some data is shared?"


This is a tough one. It would be easy for me, an outsider to the research community, to take the high road: Share your data because it is the right thing, the ethical thing, to do. I will resist that urge, although I will admit that it was easy to get caught up in the righteousness of the initial reactions to this announcement on social media.

If society desires good evidence, society must reward those who generate it. Drs Cannon and Sanders were courageous to speak of the reality of rewarding the hard work of research. To not address the issue of outside scientists benefiting from the hard work of others is foolish.

If this proposal goes through without reform in how science is valued, fewer people will write grants, gather data, sit on painful conference calls, and revise manuscripts. Science will not do itself. The what-we-value issue is now one of the core problems in mainstream medicine. Doctors who do the hard work of seeing patients, taking calls, and making the risky decisions are valued less relative to those doctors who do administrative work. Look for devalued researchers to do what devalued physicians are doing—leaving the profession. Fewer dedicated researchers is not what society wants.

On the other hand, at this point in my career, I struggle to suppress cynical thoughts. I know researchers are mostly good and honest, but medical evidence has a bit of a cycling problem. In the same way that doping casts doubt on all performances, honest or otherwise, I often find myself looking at a positive trial and thinking: "That's a good result, but can I believe it?" How many negative studies looking at the same issue were not published? Are the authors, the keepers of the data sets, telling the whole story? Were the peer reviewers soft because of "friendship acceptance"? As in athletics, medical research offers great treasure but only for great results.

So…if a (more) neutral group of researchers independently analyzed a study's data and came to the same conclusions as the original group, would that not be a good thing—for everyone? If you had a great result, wouldn't validation make it greater?

I understand "neutral" can never modify "researcher," for everyone has bias, but perhaps truth lies in the space between two (or more) biased looks at the same data set. And even if an outside researcher found reasons to temper the enthusiasm of the original group, couldn't going slower, and potentially making fewer mistakes, actually lead to faster progress? In medicine, undoing mistakes is no small thing.

Although this proposal will surely force scientists to behave in a way that goes against their self-interest, at least in the short term, I see it as a move in the right direction.

Open data would make it easier to believe. And we need to believe in science.


