Companies Are Profiting From Health Data—Patients Should Too

Ravi B. Parikh, MD, MPP


January 27, 2021

My patient, a woman with metastatic bladder cancer who was about to move to a different part of the country, was perplexed: "Can't you just give me my records on a flash drive?"

Unfortunately, it wasn't that easy. Forms needed to be routed through administrators to the medical records department. Even if things went well, radiology images would be sent by courier, only to arrive several weeks later.

She left confused and angry, and her new oncologist may not have known anything about her case by the time she arrived.

Ironically, I came across this woman's medical record later that month while screening potentially eligible patients for a trial. Even after she left our practice, I probably had earlier and broader access to her medical record than she did.

Institutional Hurdles

It's no secret that accessing one's own medical data can be incredibly difficult. Part of the reason is that health data exist in so many forms, ranging from unstructured notes to terabytes of imaging and pathology data. While the logistics of obtaining even a fraction of those data are challenging (after all, you are lucky to get a single CT scan on a 32-GB flash drive), institutions often make it difficult even to access them. Although there are "Open Notes" movements to give patients access to their data, unrestricted patient access is still relatively rare. Many hospitals require patients to fax a form if they want to access their data.

Perhaps that's why most patients don't even try. According to data from PricewaterhouseCoopers, only 14% of patients get their medical records electronically from their physicians' offices. Thirty percent say they don't understand why they would need to do so.

The fact is, many people have relatively easy access to patients' data — except for patients themselves.

A Brief History of Patient Data

Patient data are now bought and sold by a variety of companies.

More than 40,000 Americans per year participate in a clinical trial, and these trials, usually run by large pharmaceutical companies, produce troves of patient data. While this is a fraction of the overall population of patients with cancer, these datasets have immense depth, often consisting of clinical, demographic, and genetic and molecular data from blood samples. Furthermore, specimens from these trials are often used for years after the trial is over.

Cancer registries also collect massive amounts of patient data in order to understand diseases and their treatment outside of clinical trials. To develop registries, trained registrars scour patient records, entering 1s and 0s for hundreds of variables. This arduous process makes it possible to answer hundreds of research questions that could not otherwise be asked, on topics such as non–evidence-based care and racial disparities.

Even though patients do not have routine access to these data, at least with clinical trials and registries the argument can be made that most of the data are used for research. Yet the advent of the electronic health record has released patient data from these registries and clinical trial repositories, and such data are increasingly at a premium for more than research. Pharmaceutical companies use patient data to judge their drugs' effectiveness after approval, perhaps to suggest new markets for their drugs or to generate "in silico" control arms for phase 2 studies of novel agents. With the advent of big data and artificial intelligence, software companies pay significant sums for real-world data to train their algorithms.

For this reason, the number of proprietary companies that curate patient data and sell them to largely for-profit buyers has exploded. The sellers of the data are often the same companies that administer the electronic health records. Clinical trials and registries simply cannot match the speed and scale with which these companies can generate data.

Granted, the data are de-identified. De-identification has been used to justify the dramatic expansion in patient data collection, which is usually unknown to patients. There are downsides, however. Studies suggest that more than 99% of Americans could be re-identified using standard data available in such real-world datasets. In other words, the data are not as anonymized as patients may think.

The net effect of these vast data-gathering efforts has been "a mad race for remarketing data without traceability and provenance to where the information goes," according to Cynthia Fisher, the founder and chairman of

The Marketplace for Medical Information

Large organizations (hospitals, insurers, and pharmaceutical companies) can purchase health data about patients with relative ease, in much the same way that they might purchase consumer location data from companies like Foursquare.

Patients, it was believed, should have equally easy access, which is why the 21st Century Cures Act mandated free access to all clinical notes as part of a patient's medical record. Unfortunately, the COVID pandemic has been used as justification to ignore deadlines proposed under the act, thereby "killing the mandate by delay," according to Fisher.

There are many reasons that patients should have access to their own data, beyond the obvious fact that it would allow them to be better informed about their illness. But perhaps the biggest reason is that patients should be allowed to participate in the research process — the very same process that companies are profiting from without their knowledge.

Imagine if, soon after an X-ray or a laboratory test, the result were transferred electronically (similar to a banking transaction) to a secure repository, where the patient could download the data and transfer them to relevant parties. Patients could not only coordinate their own care but also potentially share in the billions of dollars in profit generated behind their backs from real-world patient health data.

For example, if patients with cancer had their genetic and molecular information at their fingertips, they could decide to share those data directly with pharmaceutical companies or principal investigators conducting research. Participation in cancer or genetic registries probably would increase significantly, and racial and gender disparities in clinical trial and registry data could be corrected as broader sources of data were made available.

Giving Control to Patients

A movement to accelerate patients' ownership of their own data is already underway. There are several burgeoning efforts to crowdsource patients' data for research purposes. Count Me In, for example, allows patients to give permission to submit their tumor, blood, and saliva samples for research; it has projects in multiple cancers. All of Us is a National Institutes of Health–sponsored patient research partnership in which interested patients share DNA samples for research. There are also opportunities for researchers to propose collaborative efforts.

These efforts rightly raise concerns about patient privacy. HIPAA regulations prevent physicians and health systems from sharing patient data inappropriately. But HIPAA does not prevent a patient from sharing their own data for research or other purposes, according to John Sharp, director of thought advisory at the Healthcare Information and Management Systems Society (HIMSS) and an adjunct professor at Kent State University.

This need not be a one-way street. "Patients do not receive data back from clinical trials that they participate in. If a clinical trial is meant to improve a patient's health, then patients should have immediate access to their data that contributed to a drug's approval as part of a trial," according to Fisher from

This "feedback loop" is critical to ensuring trust in the data-sharing process. This has become particularly salient during the COVID vaccine trials, which thousands of Americans have participated in without knowing whether they are being vaccinated against a potentially deadly disease. At the very least, companies could share their results with the trial participants, rather than having them learn about the results only when the data are published.

Ensuring access to a patient's health record is only a partial step in the right direction. Patients need to be in charge of their own data, and that means determining how to use and potentially profit from those data. We owe this to patients. For too long we have denied them a seat at the table when it comes to managing health data.

Ravi B. Parikh, MD, MPP, is a medical oncologist and faculty member at the University of Pennsylvania and the Philadelphia VA Medical Center, an adjunct fellow at the Leonard Davis Institute of Health Economics, and senior clinical advisor at the Coalition to Transform Advanced Care (C-TAC). His research and writing focus on policy and innovation in cancer care, with specific interests in advanced illness and predictive analytics.
