Open-Data Movement: Share and Share Alike?

Robert A. Harrington, MD; Christopher P. Cannon, MD; Manesh R. Patel, MD


October 31, 2016

Robert A. Harrington, MD: Hi. I'm Bob Harrington from Stanford University, and I'm joined by two of my colleagues: Manesh Patel, from Duke University; and Chris Cannon, from Harvard University. Guys, thanks for joining me here today.

Manesh R. Patel, MD: Thanks for having us.

Christopher P. Cannon, MD: Thanks.

Open-Data Movement

Dr Harrington: The topic we're going to [talk about] today is the open-data movement. Manesh and I were coauthors on a recent paper in the New England Journal of Medicine called ACCESS CV,[1] which was written in response to a proposal raised in draft form by the International Committee of Medical Journal Editors (ICMJE).[2] They are really concerned that clinical trialists are holding onto their data and not sharing it with the community.

I've worked with both of you for years; I've worked with Chris for a couple of decades. The three of us believe that knowledge and clinical trial data are meant to be shared, and we love collaborating with people who approach problems with a different perspective and a different set of methodologies.

I know you both agree with this, but clearly the community thinks that there is a problem with the way clinical trialists are sharing data. I'm going to look to you first, Manesh. Frame the problem, because you were one of the leaders along with Mike Gibson in really pushing forward this document called ACCESS CV.

Dr Patel: There are a lot of proposals we can talk about, but you are right. First and foremost, implicit in human research is patients. When they participate in research, they are doing so not only for the potential to help themselves, but often they're saying, "I hope that we can find something that will help others." Implicit in that is that there is a generation of knowledge and there is a sharing of that knowledge. We are all in favor of that.

The questions are, what are the goals and what are the issues at play? The two impetuses we should think about are the Institute of Medicine (IOM)[3] and the medical journal editors, who together have said that we have not done a good enough job. We have not taken robust information from clinical trials and put it out into places where people might be able to ask other questions or look at the information to get new knowledge.

They suggested things that will probably form better ways to do this. They suggested that within 6 months after a primary publication in a major medical journal, all of the information that led to the primary publication [would be put] out into the public sphere. The IOM said that by 18 months the information from the entire trial should be available.

Some groups for some time have been trying to share data. The problem is that we all want to share data. The questions are, what are the goals and how do we do it?

Dr Harrington: Great framing.

Chris, there was an editorial[4] by the New England Journal of Medicine editors that really inflamed this discussion. Jeff Drazen, who is the editor-in-chief, had one single line that seemed to get everybody when he referred to people who wanted access to data that was collected by others as "research parasites." It was a bad choice of words and it in some ways derailed the conversation. What do you think Dr Drazen was trying to get at? As a trialist, in particular someone who leads trials, what do you think the issues are?

What Makes Data Sharing Challenging?

Dr Cannon: I share that our goal is to try to gather a lot of information, look at a new therapy, and then [look at that therapy] in different groups of patients. We have all done that and it has actually been successful. The tricky part is accuracy: making sure that we are interpreting data properly. Very often we have collaborations, and we will have differences in how the variables are analyzed or we use the wrong one. Very often you can get a wrong answer out of the data by just not knowing the complexities. That is one of the worries I have: people opening up to others [who would] unknowingly do an incorrect analysis. After it's published, people start asking questions, they worry, and it's all something that could have been sorted out carefully.

Dr Harrington: People have said, Chris, that the peer-review process will pick this up.

Dr Cannon: That is hard.

Dr Patel: It's harder than we think. Have you gone to Wikipedia?

Dr Cannon: I do sense that the trial group may be a little more closed and we have to invite others in. We do have to work together. You make the point in the paper that it has to be a collaboration with those who have done the work and understand the details with the outside ideas of "I would like to look at X or Y." Hopefully together it can work better and advance new ideas.

Dr Patel: I see it like there is this "want." We want to do this. There are three goals: replicating the result, asking secondary "offshoot" questions, and aggregating data so that you can ask other really interesting questions (which I think is the greatest good). But there are barriers: patient privacy issues and how do you do this—both of which can be overcome from a technical perspective.

Dr Harrington: You mean you can't just put an Excel spreadsheet out on the Web?

Dr Patel: That's right. What does the data variable mean? If you have spent 2 years of your life defining very clearly for people who are putting that information into the trial, and then somebody else misinterprets what that information even means—even if they analyze it the right way—then there is the analysis portion.

Finally, the last one is the multiplicity of questions—or not even questions—where an analysis could be done that would then lead to information. We are not opposed to [this]; we just want to make sure, given what we know, that there is some process in place and that maybe there are some hypothesis-based answers and questions being tested.

Dr Harrington: Chris, the group from Yale being led by Harlan Krumholz has created something called the Yale Open Data Access, or YODA, which is a great name for a project in this space. What Harlan has indicated is that they have a process set up for people to submit their data to them and then they can be the honest broker, if you will. To get to Manesh's point, people [would] apply and get that data or have somebody help them analyze that data. Is that a reasonable approach?

Dr Cannon: It's good, assuming that the "broker" understands the complexity of the data. Very often, questions come back: What is this definition of diabetes? For prior aspirin—over what period of time? There are all of these little details. Or, for example, cohorts of people were enrolled under high-risk criteria, so this study has more diabetics versus others because of the selection criteria. These kinds of analytic things have to be part of the group that is helping the outside person do the analysis properly.

Dr Patel: That is absolutely right. In fact, Harlan and the YODA groups led the field. For many years, he and others have been saying, "We should be sharing information." And he is probably right on this. He has been saying it and doing it—it's impressive. And of course there's, which groups like Glaxo and others have put together.

But for me, Bob, the proof is in the pudding. How much data has been shared? Three years in on those programs, there are about 200 or so data requests and maybe fewer than 50 manuscripts. Let's say it gets to 100 peer-reviewed papers that matter. It's good to have a lot of opportunities. It's good to have invested people.

Data Sharing From the ACCESS CV Perspective

Dr Harrington: Let's talk about this from the ACCESS CV perspective. You and Mike Gibson brought a lot of us who have spent years doing clinical trials together from around the globe. I sense from those discussions that everybody shares the belief that all of us should be sharing data and should contribute to that sharing. It's the mechanism by which you do this and how you do this, which is a bit more challenging. What is the ACCESS CV proposal?

Dr Patel: If you think about what I just said, one of the things is how many of these data you can actually share. If you take people who have worked at academic research organizations or conducted these cardiovascular clinical trials, there is a spectrum of thoughts. Putting everyone in the room has been really powerful.

The ACCESS CV proposal is in line with what the recommendations are for ICMJE and IOM, but there are three big parts to it. We want to share data, and the way we anticipate doing that is that for the primary publication, we agree to put on that we are going to be part of ACCESS CV. At the end of 12 months, we are going to work to get that primary publication out. And if replication needs to be done, we will work through mechanisms to replicate the primary finding of that trial.

The second proposal is that instead of 18 months, it's 24 months of key secondary manuscripts led by the trial team. But there's an important point in there. During the time that the primary trial team is doing key secondary papers, we are open to hearing from others who have good ideas.

Dr Harrington: As we always have been.

Dr Patel: As we always have been. We said we will work with them. I appreciate the terms "honest broker" and "learned intermediary." We are going to call it a "learned review group," where the principal investigators are involved because they have actually participated in the trial. It does not mean that people can't get the data; they are just going to work with the people who have generated the data. After 2 years, it will be open to requests. We do want to share data. The proposal is simply changing a few technical ways to do it.

Importantly, we also want to do it in a way where hypotheses are asked and data are shared in a confidential way so that patient information is kept separately.

Dr Harrington: Chris, from your perspective as a trialist, does that get to the point of balancing the needs of the trialists with the needs of the greater community?

Dr Cannon: It is definitely a wonderful step. You mentioned earlier the idea of aggregating some of the trials, and DCRI (Duke Clinical Research Institute) has been doing that. It's amazing when you put four or five of these already big trials together. You can go after things rarer than sudden cardiac death.

Dr Patel: Questions you couldn't answer in a single trial.

Dr Cannon: That is another wrinkle. Are there things that we can merge together that can generate some new ideas and opportunities?

Dr Patel: Hopefully ideas.

Dr Harrington: I want to thank you for joining me here to continue this conversation. I suspect that there will be a lot more opportunities to write about it and talk about it, and ultimately we all hope to institute some of these programs. I think with putting the stake out there and ACCESS CV, hopefully we will not just get discussion going but also create some new collaborations.

Dr Patel: Absolutely.

Dr Cannon: Yes.

Dr Harrington: Manesh Patel from Duke, and Chris Cannon from Harvard, thanks for joining me here today.

