'Data Gone Wrong': Unreproducible Cancer Genomics Studies

Roxanne Nelson

June 30, 2011

Cancer research is increasingly being defined by genomics, and advances in technology have allowed researchers to identify candidate genes as prognostic, diagnostic, and therapeutic biomarkers for different subtypes of tumors. Genomic platforms also allow for the prediction of therapeutic response, and with it the promise of personalized therapy.

Proteomics and genomics have generated quite a bit of excitement in the scientific community because they bring a new level of complexity to the cancer field. However, this emerging field of research has also brought concerns about validation and reproducibility.

Perhaps the most publicized and glaring case of "data gone wrong" involved Anil Potti, MD, an oncologist and genomics researcher who was forced to retract 4 papers from peer-reviewed journals because of results that could not be reproduced. As previously reported by Medscape Medical News, Dr. Potti's saga involved not only concerns about the validity of his research, but also allegations of misconduct. He eventually resigned from his positions at the Duke University School of Medicine and the Duke Institute for Genome Sciences and Policy in Durham, North Carolina.

Dr. Potti's research was directed at developing gene-expression signatures that predict responses to various cytotoxic chemotherapeutic drugs. The goal was to identify characteristics of individual patients that could be matched with specific therapies. His published papers reported that his signatures had the capacity to predict therapeutic response, but the experiments could not be reproduced and the signatures could not be validated.

"One issue that is coming out is that these papers did get past peer review. Does that mean peer review is broken?" asked Keith Baggerly, PhD, professor in the Department of Bioinformatics and Computational Biology at the University of Texas M.D. Anderson Cancer Center in Houston. "No, not really, and peer review is still one of the best methods we have," he answered.

"The types of errors that we are talking about are not those that peer review would have caught," he told Medscape Medical News. "The type of data analysis required to find those errors is not really feasible in peer review."

Feasible to Check?

Peer review is not a "stamp" that the data are correct. "They may look plausible," Dr. Baggerly explained, "but before anyone uses this information in a clinical setting, it's going to require some level of independent replication and verification."

Dr. Baggerly and colleagues attempted to reproduce Dr. Potti's findings at the request of investigators from the M.D. Anderson Cancer Center, who were interested in using the research pioneered by Dr. Potti and his group. However, they were unable to replicate the results. Incomplete data and documentation made the task difficult, and required the use of "forensic bioinformatics," the art of using raw data and reported results to recreate what was done to obtain the results, rather than just retest the model.

"Ideally, this is a craft that should not be required," said Dr. Baggerly. "The methods section should cover this. Empirically, that is not the case."

The situation becomes worse when it starts being applied to high-dimensional situations such as omics signatures. "The reason it becomes worse, in my view, is that I believe that our intuition about what makes sense in high dimensions is empirically very poor."

This means that to work with and trust high-dimensional signatures, "we need to have a fairly precise idea of how it is that they were assembled," he added.

To use "genomic signatures" as biomarkers, we need to know they've been assembled correctly. What is being discussed now, he pointed out, is when these explicit checks, which attempt to go back and reproduce the data, should be implemented.

When do we say that this has to be right, and when do we try to confirm it, Dr. Baggerly asked. Currently, these questions are linked to a number of issues: "Are we going to spend a lot of hours and resources on this, or are we going to proceed to clinical trials? In other words, are we expending monetary resources or the human variety?"

If we decide that "this really has to be right before we move ahead, then the level that was sufficient for initial publication in a journal may not be enough," he said.

IOM Meeting

On March 30 and 31, the Institute of Medicine's (IOM) Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials held a meeting, which was spurred by the circumstances and events related to the work conducted by Dr. Potti and his colleague, Joseph Nevins, PhD, from the Duke Institute for Genome Sciences and Policy.

According to background information supplied by Duke University, the IOM committee's work came about after a request to IOM president Harvey Fineberg, MD, on July 21, 2010, from Victor J. Dzau, MD, chancellor for health affairs at Duke University, in consultation with Harold Varmus, MD, director of the National Cancer Institute. The request was for a "full and independent review of the issues surrounding genomic predictors of chemotherapy sensitivity, as well as to provide guidance on the appropriate scientific approaches to this rapidly evolving area of science and medicine."

Duke University notes in their perspective letter that "the foundational conclusions for the chemotherapy sensitivity predictors were compromised by corruption in validation datasets that invalidated those conclusions." The clinical trials were subsequently halted, and Dr. Baggerly and his colleague Kevin Coombes, PhD, were instrumental in uncovering the problems with these chemotherapy sensitivity predictors.

However, Drs. Baggerly and Coombes point out that the "goal of developing universal standards for omics research remains."

Dr. Nevins was one of the panelists at the IOM meeting. In his talk, he described the methodology used in the research conducted by him and Dr. Potti, and responded to the scientific criticisms and the nature of the data corruption that led to the retraction of the papers and the termination of the trials.

Dr. Nevins noted that despite the criticism of their results, they persisted in their work for several reasons. He pointed out that he did not recognize that a critical flaw in the research effort was one of data corruption, an apparent manipulation of validation data.

"The reason it was not recognized earlier was my focus on what I believed to be the fundamental criticism — a question of accommodating distinctions in cell line and tumor data," he said.

He added that they believed they had addressed this "fundamental issue with multiple validation results beyond the initial papers."

Despite the negatives, there are lessons to be learned. "Data corruption of the form we experienced can happen, but is not something one anticipates," he explained. "It is best to look at this and think how we can ensure the integrity of data and the validity of results."

One way of doing this is to make use of systems for tracking analyses in projects, Dr. Nevins pointed out. Another is to ensure that all data, methods, and software are made available in publications, along with the full integration of appropriate expertise in the work.

Failing to ensure that all data, methods, and software were available in the publications was "a mistake," he said.

There is some control during the publication process, explained Dr. Nevins. "I understand the difficulties of journals policing this, but it's doable. And it has to happen. People just can't submit papers if the data are not available, and papers can't be accepted if the data are not available."

Dr. Baggerly was also a panelist at the IOM meeting, and he offered recommendations from himself and Dr. Coombes. What are needed, he said, are data, metadata (clinical information, run order, design information), evidence of provenance, the code, auditability before trials begin, and reproducibility.

"Investigators need to think of reproducibility as a goal from the outset," he said, and journals need to ask and check for code and data deposition. Journals also need to be prepared to host code and clinical data.

In addition, agencies need to provide data repositories, and they need to check for data and code availability at renewal time. They need to budget for reproducibility audits, Dr. Baggerly explained, and institutions need to help with training and infrastructure.
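To make the ideas of "evidence of provenance" and "auditability" concrete, here is a minimal sketch in Python of one way a research group might record checksums for its data and code files and later check that nothing has changed. It is purely illustrative and is not drawn from the recommendations presented at the meeting: the file names, the manifest layout, and the helper functions `build_manifest`, `audit`, and `demo` are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(files):
    """Map each file name to its checksum (hypothetical manifest layout)."""
    return {Path(f).name: sha256_of(Path(f)) for f in files}


def audit(manifest, directory):
    """Return the names of files that are missing or whose contents changed."""
    problems = []
    for name, expected in sorted(manifest.items()):
        candidate = Path(directory) / name
        if not candidate.exists() or sha256_of(candidate) != expected:
            problems.append(name)
    return problems


def demo():
    """Record a manifest, alter the validation labels, and audit again."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for a validation dataset and analysis code.
        data = Path(tmp) / "validation_labels.csv"
        code = Path(tmp) / "fit_signature.py"
        data.write_text("sample,response\nS1,sensitive\nS2,resistant\n")
        code.write_text("# analysis script placeholder\n")

        # Store the manifest alongside the files it describes.
        manifest = build_manifest([data, code])
        (Path(tmp) / "manifest.json").write_text(json.dumps(manifest, indent=2))
        before = audit(manifest, tmp)

        # Simulate an undocumented change: the response labels are swapped.
        data.write_text("sample,response\nS1,resistant\nS2,sensitive\n")
        after = audit(manifest, tmp)
        return before, after


if __name__ == "__main__":
    print(demo())
```

A check like this can only show that a file differs from what was recorded; it cannot say whether the recorded version was correct in the first place, which is why independent replication would still be needed.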

Dr. Potti Resurfaces

As reported by Medscape Medical News, Dr. Potti was suspended by Duke in July 2010, amid concerns about his research and whether or not he had lied on a grant application. When questions about Dr. Potti's credentials became public, the American Cancer Society suspended payment of a $729,000 grant that had been awarded to him to study lung cancer genetics.

According to Retraction Watch, Dr. Potti is now employed by the Coastal Cancer Center, an oncology practice with 4 offices in South Carolina and 1 in North Carolina.

In addition, it appears that Dr. Potti is trying to polish his tarnished image. In April, the Duke Chronicle reported that he had hired a reputation manager to "push down" unfavorable content on Internet search engine results.

The Duke Chronicle notes that this "raises ethical concerns." At least 5 Web sites have been registered that combine Potti's name in different arrangements: AnilPotti.com, AnilPotti.net, DrAnilPotti.com, PottiAnil.com, and PottiAnil.net. Although the content on these Web sites appears to be factually correct, none of them mention anything about the long and tumultuous saga of Dr. Potti's research, paper retractions, or misrepresentation.

Attempting to influence search engine results is not unethical, said Sheldon Krimsky, an expert on medical conflicts of interest and a professor at Tufts University in Boston, Massachusetts, in the Duke Chronicle article. However, ethics do come into play if a physician attempts to change the public record.

According to its official Web site, Coastal Cancer Center conducts clinical trials. There is no news on whether Dr. Potti will be involved in any of these trials.