The Pros and Cons of Clinical Trial Data Sharing

Robert A. Harrington, MD; Michelle L. O'Donoghue, MD, MPH


August 03, 2016


Editor's Note: Robert A. Harrington, MD, professor and chair of Stanford University's Department of Medicine, interviews Michelle L. O'Donoghue, MD, MPH, an assistant professor at Harvard Medical School and TIMI [Thrombolysis in Myocardial Infarction] Study Group investigator, about how academic organizations are reacting to recent proposals to increase data sharing and transparency. The interview took place at the 2016 American College of Cardiology (ACC) Scientific Sessions.

Robert A. Harrington, MD: Hi. I'm Bob Harrington from Stanford University, here at the ACC meeting in Chicago. I'm joined by my friend and colleague Michelle O'Donoghue from Harvard. Thanks for joining us today.

Michelle L. O'Donoghue, MD, MPH: Thanks for having me here today.

Dr Harrington: Michelle is an assistant professor at Harvard Medical School. She's an investigator in the TIMI Study Group. As one of the members of what I would call North America's leading clinical trials units, you have a lot of interest in this subject of data sharing, open data, and transparency. So, let's talk about it.

What I want to get, Michelle, is your perspective coming from a clinical trials organization that leads a lot of these studies. Meeting topics cover rapid publication or how many study authors are representative of the study group. [On the other hand], you think about what the patients agree to do. How are you and other leaders of the TIMI Study Group thinking about this?

Dr O'Donoghue: Well, it is a terrific topic to discuss because, as you've alluded to, it has become a heated debate. Should we share databases from clinical trials, and, if we do so, in what manner? Would we provide complete open access to the database? Would it be modified in some way? Would we keep the randomized treatment-arm assignments within that database as well?

There is the privacy issue. When patients enroll in clinical trials, to date, we have specified in the consent forms that the data would be shared only with a very small group of individuals. Now, we are talking about opening the doors to many more people. I think that there are pros and cons.

Pros and Cons of Database Sharing

Dr Harrington: Let's do some of the pros.

Dr O'Donoghue: There is a lot of information in these databases that we have not been making use of. From that perspective, [open databases are] a great opportunity to get more people to look at the data in a different way, think about questions that haven't been answered, and try to get to the bottom of things. That is a terrific opportunity. The more people who can put their heads together, the more likely we are to be able to advance medicine.

Dr Harrington: That, to me, is one of the keys. It would open the data to a diverse set of opinions, a diverse set of skills, and particularly a different set of methodologic skills. As you know, this debate is playing out in large part by the informatics community, which has said, "Free the data, and let us dive into it." I love the idea of getting methodologists to take a look and [the idea of] advancing the field, but there is the con side.

Dr O'Donoghue: Right. I think that this is where some people dig their heels in on one side of the issue or the other. I will try to walk the middle of the road a bit.

Dr Harrington: I will push you one way or another if I need to.

Dr O'Donoghue: We will see which way it goes, but I think, from the clinical trialist perspective, that there are a few different issues. Perhaps I will phrase it this way: If we were to share databases, what would be the primary intent? Is it to: (a) replicate the top-line analyses from randomized clinical trials of novel therapeutics? If that is the case, then the US Food and Drug Administration (FDA) at this point has open access to the database if [the investigational drug] goes for approval. So, the top-line results are already being validated, and the FDA is looking at the data every which way.

Dr Harrington: Frequently, when your group or our group leads a clinical trial, it's already being replicated by multiple parties: a sponsor (government or private), an academic research organization, and perhaps a second one.

We already have many replications of big trials, as you pointed out. And then, down the road, regulatory agencies are all doing their own analyses.

Dr O'Donoghue: Exactly. The challenge is that if you try to answer too many questions from a clinical trial dataset, spurious findings arise. We all have that hesitation. We look at prespecified subgroups, and even there we realize that we increase the possibility of finding a false-positive result. That hesitation to let people access the database without having proposals vetted is because if somebody has open access, they could ask a thousand questions all at once.

Dr Harrington: And pick out the one they like.

Dr O'Donoghue: Exactly. Some people have an agenda. There are people whose intent in looking at the database is not as earnest as we'd hope it would be.

Dr Harrington: There's definitely a group involved in what I call the "gotcha" moments of wanting to say, "You got this wrong. See, I knew it." Now, sometimes that's good because you did get it wrong, but people bring to the table their own agendas.

Dr O'Donoghue: Right, they could catch splashy headlines; and, unfortunately, those may not reflect the truth, which is what we want to get to at the end of the day. So, that's one part of it.
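
Editor's Note: The concern about asking "a thousand questions all at once" can be illustrated with simple arithmetic. The following is a minimal sketch, not part of the interview. It assumes independent tests, no true effect, and a conventional 0.05 significance threshold; the function names are hypothetical. Under those assumptions, the chance of at least one false-positive result is roughly 40% for 10 questions and approaches certainty for 1000. The Bonferroni threshold is shown as one common correction, not as a method the speakers endorse.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across `num_tests`
    independent tests when no true effect exists (assumed setting)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the overall
    false-positive rate at or below `alpha` (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Compare a single prespecified question with progressively
    # larger numbers of unvetted questions asked of the same dataset.
    for n in (1, 10, 100, 1000):
        rate = familywise_error_rate(n)
        threshold = bonferroni_threshold(n)
        print(f"{n:>5} tests: P(any false positive) = {rate:.3f}, "
              f"Bonferroni per-test threshold = {threshold:.5f}")
```

Real subgroup analyses are rarely independent of one another, so these figures are only a rough guide to why vetting and prespecifying proposals matters.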

Complex Databases and Statistical Considerations

Dr O'Donoghue: The other part is that people always say, "Well, how can it be that the database is so complicated?" But there is truth to that. At the end of any clinical trial, we spend months trying to figure out the programming for the database. It's all carefully laid out. How do you censor a patient whose last visit was recorded on a particular date, but then you had a phone call with additional partial information? There are complexities to the raw programming piece as well.

Dr Harrington: That gets back to your earlier comment about replication. In my experience, you have two independent statistical groups marching down looking at the data, cleaning the data, and matching. You don't actually lock the data until you agree that it matches. Do we match on the number of events? Do we match on the type of events? Do we match on who had those events? It can take months to get to that point.

Dr O'Donoghue: That's exactly right. There is a complexity that shouldn't be underestimated. That goes again to the point of making sure that, if we were to move in this direction, proposals are vetted and that there is a discussion of how researchers would approach the analysis from a statistical perspective. If you apply new censoring rules to the data, you can come out with different results. But if they weren't prespecified, that subtlety could get lost in the wash, and there could be a lot of confusing messages out there.

Dr Harrington: Some of that is very healthy in that it's hypothesis-generating data as opposed to hypothesis-driven findings. If it's hypothesis-generating, that's okay as long as you're framing it as [such] and not as, "This is a truthful observation that's emerged from these data." Those may be very different things.

Fairness in Time to Publish Top-line Results

Dr Harrington: The final point that I want you to delve into, and where some of the sensitivity in these conversations comes from, is timing of access. There is a group of investigators, numbering anywhere from dozens to thousands around the globe, who have all contributed to recruiting, consenting, and following patients and providing data. Part of why investigators do this is to engage in the academic process of diving into the data. Other people have said, "Well, if my ideas are better than yours, why do I have to wait for you to do your analyses?" Do you want to try to balance those?

Dr O'Donoghue: I agree that that is part of the discussion as well. There are a lot of people who pour their blood, sweat, and tears into these trials for a long period of time. It's investigators around the world. It's the fellows who we work with at the hospital. Working on recent trials, these poor fellows have been calling up sites, trying to get patients who might have stopped study drug to get back on. It's a tremendous effort, and I think that it is reasonable to offer individuals (whether it's investigators, fellows, or others) opportunities to publish the top-line results. It is not to say that other individuals down the road wouldn't have a similar opportunity, but if we are going to move towards a position of sharing the databases, I think it's reasonable to allow the primary investigators time to publish those top-line analyses first.

Dr Harrington: That's where a lot of the discussion has come from. The initial perspective piece from the medical journals said [open the database within] "6 months."[1] Boy, that's not a lot of time for a trial. For your group and my former group involved in IMPROVE-IT,[2] that was a 10-year effort. As Dr [Eugene] Braunwald has pointed out over and over, that was a tough 10 years of a lot of people being engaged. How long should the IMPROVE-IT investigators have primary access to the data?

The other thing I've always found interesting is that there is an assumption that the data are held by a small group. In fact, as soon as many of these trials are over, two, three, four, five academic groups have the data and are all working together to try to decipher what's in it.

Dr O'Donoghue: You are exactly right. IMPROVE-IT was a good example. It was a collaborative effort between a couple of different academic groups, including Duke and TIMI. There was a process: If there is interest to publish a certain paper, it is reviewed and discussed. Any reasonable proposal moves forward.

Dr Harrington: It's a good conversation to have, and unfortunately some of the language has gotten a bit heated around the whole topic. Every clinical trialist I have talked to says, "Absolutely, we want input, we want to share." The question is: What's the mechanism by which we do that? If we can move away from the "should we share" piece (because I think we all agree) and concentrate on how we do it, then the field will move forward.

Finding Ways to Share

Dr O'Donoghue: I think you are right. When we led off the discussion, I was saying that there were two ways that I think about it, and one of them is a question of the randomized treatment assignment, so the drug that's being evaluated. Part b is also that it is a rich dataset in and of itself. There should be a way to eventually share the database in a way that is independent of treatment arm to allow people to look into other potential associations. I think that there are opportunities and questions that the primary investigators may not ask that could be of value for others to explore.

Dr Harrington: That's a great point. When I was at Duke, we spent a lot of time creating both an ST-elevation acute coronary syndrome dataset and a non-ST-elevation dataset where the treatment was not of primary interest. We had 100,000 acute coronary syndrome patients in a common dataset, and we wanted to see if we could learn more about acute coronary syndromes. Could we learn about the patients? You are absolutely right that those are different sorts of questions.

Dr O'Donoghue: There have been a lot of conversations about big data and how to approach it. Some of that is still a moving target because you can ask a computer many more questions than before. We still have that concern about false positives, if certain signals pop up, but we won't really know until we start exploring that further and get a better handle on what opportunities might be out there in existing datasets.

Dr Harrington: I totally agree, and I love the idea of people with a different methodologic perspective coming to the table to say, "Let us offer you some of what we are doing in other types of large datasets." I think that could be helpful.

Dr O'Donoghue: I completely agree.

Dr Harrington: Michelle, thanks for joining us. My guest today has been Michelle O'Donoghue from Harvard Medical School, Brigham and Women's Hospital, and the TIMI Study Group. I am Bob Harrington from Stanford University. I hope that this has been an interesting discussion on open data, sharing of data, and clinical trial results.

