Evaluating the Optimum Number of Biopsies to Assess Histological Inflammation in Ulcerative Colitis

A Retrospective Cohort Study

Robert Battat; Niels Vande Casteele; Rish K. Pai; Zhongya Wang; Guangyong Zou; John W. D. McDonald; Marjolijn Duijvestein; Jenny Jeyarajah; Claire E. Parker; Tanja Van Viegen; Sigrid A. Nelson; Brigid S. Boland; Siddharth Singh; Parambir S. Dulai; Mark A. Valasek; Brian G. Feagan; Vipul Jairath; William J. Sandborn


Aliment Pharmacol Ther. 2020;52(10):1574-1582. 

Materials and Methods

Patient Population and Study Samples

Ulcerative colitis biopsy fragments collected between 15 July 2014 and 24 July 2017 were retrospectively obtained from the University of California, San Diego Inflammatory Bowel Disease biobank using pre-specified inclusion criteria. For the primary analysis, we evaluated patients for whom a minimum of four rectosigmoid biopsies were available. The IBD biobank consists of DNA, blood, stool and tissue samples, in addition to endoscopic video recordings, that are prospectively collected from IBD patients who have given informed consent. Clinical disease activity was assessed using the Partial Mayo Clinic Score, stool frequency subscore and rectal bleeding subscore. Endoscopic videos were recorded and stored using Robarts Central Image Management Solutions (CIMS). During endoscopy, biopsies were obtained from each patient in a standardised and pre-specified manner, with two samples collected during each pass of the biopsy forceps. In patients undergoing colonoscopy or flexible sigmoidoscopy, biopsies were obtained from the most macroscopically inflamed area in the right colon, left colon or the rectosigmoid colon (or the rectum in the case of isolated proctitis). Additional biopsies were collected from the transverse colon when colorectal cancer surveillance was carried out.

Biopsy fragments were processed as formalin-fixed paraffin-embedded tissue specimens using 10% neutral formalin and stained with haematoxylin and eosin. For all patients, biopsies were procured from a single endoscopic procedure.

Central Reading and Endpoint Assessment

For the purposes of this study, a digital scan of each included biopsy was created using an Aperio AT2 (Leica) whole slide scanner. These images were stored in a WebMicroscope (Fimmic) database that was hosted on a secure remote server.

Histological image quality was assessed centrally by a blinded histopathologist (MV) using a global rating scale (ie optimal; adequate; poor but readable; poor, not readable), and any specific concerns (eg biopsy size, scan quality, slide staining, etc) were noted. To assess histological disease activity, the same histopathologist (MV) read each biopsy blindly and independently using the Robarts Histopathology Index (RHI) and Geboes score (GS). The central reader also provided a global assessment of histological disease activity using a 10 cm visual analogue scale (VAS), where 0 indicates completely normal and 10 indicates the worst disease possible.

Endoscopic videos from the University of California, San Diego biobank were made available for expert central reading through the secure Central Image Management Solutions server. Two blinded central readers (RB and JWDM) evaluated endoscopic video quality using a global rating scale (ie optimal, sub-optimal, not readable) and any specific concerns (eg rapid withdrawal, insufficient insufflation, inadequate washing, etc) were noted. Each endoscopic video was blindly and independently assessed by two central readers (RB and JWDM) using the Mayo Clinic Endoscopic Subscore and Ulcerative Colitis Endoscopic Index of Severity. Electronic case report forms were used to record central reading data.

Data Analysis and Statistical Methods

Minimum Number of Biopsies Required to Assess Histological Inflammation. As noted above, data from patients who had at least four rectosigmoid biopsies stored in the University of California, San Diego biobank were analysed in the primary analysis. A sample of four biopsies was randomly selected using a pre-defined statistical algorithm if the patient had more than four rectosigmoid biopsies available.
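The selection step can be illustrated with a short sketch. The pre-defined algorithm itself is not described here, so the seeded simple random sample below, the function name select_biopsies and the biopsy identifiers are all assumptions made for illustration only.

```python
import random
from typing import List, Sequence


def select_biopsies(biopsy_ids: Sequence[str], k: int = 4, seed: int = 2020) -> List[str]:
    """Pick k biopsies for one patient (hypothetical helper, not the study's algorithm).

    A fixed seed makes the draw reproducible; the seed value is arbitrary.
    """
    if len(biopsy_ids) < k:
        raise ValueError("patient does not meet the minimum biopsy requirement")
    if len(biopsy_ids) == k:
        # Exactly k biopsies available: nothing to sample.
        return list(biopsy_ids)
    return random.Random(seed).sample(list(biopsy_ids), k)


# Hypothetical patient with six stored rectosigmoid biopsies.
available = ["b1", "b2", "b3", "b4", "b5", "b6"]
chosen = select_biopsies(available)
```

Sampling without replacement ensures that no biopsy is counted twice in the reference score.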

The "4-biopsy reference score," which is a composite of the worst item-level ratings across four separate biopsies, was calculated using the RHI, GS and VAS. For example, if one biopsy had moderate lamina propria neutrophils (Geboes 2B.2) but this feature was absent in the other three biopsies, the lamina propria neutrophil score used in the Geboes 4-biopsy reference score would be 2B.2. This methodology was chosen because it reflects how histological scoring is performed in clinical trials when multiple biopsies are available, and it provides the most conservative estimate.
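A minimal sketch of the worst item-level composite is shown below. The item names and integer grades are hypothetical stand-ins, assuming each item is coded so that a higher number means worse disease; the actual indices define their own items and grading.

```python
from typing import Dict, List

# One biopsy = mapping from item name to numeric grade (hypothetical coding,
# higher = worse).
Biopsy = Dict[str, int]


def composite_worst_score(biopsies: List[Biopsy]) -> Biopsy:
    """For each item, keep the highest (worst) grade observed across the biopsies."""
    if not biopsies:
        raise ValueError("at least one biopsy is required")
    items = biopsies[0].keys()
    return {item: max(b[item] for b in biopsies) for item in items}


# Four hypothetical biopsies from one patient.
biopsies = [
    {"lamina_propria_neutrophils": 2, "epithelial_neutrophils": 0},
    {"lamina_propria_neutrophils": 0, "epithelial_neutrophils": 0},
    {"lamina_propria_neutrophils": 0, "epithelial_neutrophils": 1},
    {"lamina_propria_neutrophils": 0, "epithelial_neutrophils": 0},
]

reference = composite_worst_score(biopsies)      # composite over all four biopsies
two_biopsy = composite_worst_score(biopsies[:2])  # same rule applied to a subset
```

Because the composite takes a maximum per item, a score based on a subset of the biopsies can never exceed the score based on all four.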

Agreement between the 4-biopsy reference score and score estimates based on one, two or three of the reference biopsies was evaluated. The 1-, 2- and 3-biopsy scores were calculated using the same methodology as the 4-biopsy reference score. Pre-defined subgroup analyses were conducted according to endoscopic disease activity (Mayo Clinic Endoscopic Subscore of 0 vs ≥1).

Agreement was evaluated using a bivariate errors-in-variable regression approach that generated tolerance intervals.[19] The tolerance interval is defined as the prediction interval which contains 95% of the population differences. Agreement is satisfied if the tolerance interval is completely contained in the acceptance interval, which was defined as ±0.25 standard deviation of the maximum reference score of each index. Thus, the acceptance intervals for the RHI, GS and VAS ranged from −7.50 to 7.50 and −8.25 to 8.25; −1.25 to 1.25; and −2.25 to 2.25 and −2.42 to 2.42; respectively.
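The containment rule can be sketched as follows. The sketch does not fit the errors-in-variable regression; it takes a tolerance interval as given and only checks it against the acceptance interval. Reading the acceptance bound as 0.25 times the maximum reference score is an assumption made for illustration (it reproduces the 7.50 bound when the maximum is 30), and the function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def acceptance_interval(max_reference_score: float, fraction: float = 0.25) -> Interval:
    """Symmetric acceptance interval around zero (assumed form: fraction * maximum)."""
    half_width = fraction * max_reference_score
    return (-half_width, half_width)


def agreement_satisfied(tolerance: Interval, acceptance: Interval) -> bool:
    """True only if the tolerance interval lies entirely inside the acceptance interval."""
    return acceptance[0] <= tolerance[0] and tolerance[1] <= acceptance[1]


acceptance = acceptance_interval(30)  # (-7.5, 7.5)

# Hypothetical tolerance intervals, not study results.
inside = agreement_satisfied((-5.0, 6.0), acceptance)
outside = agreement_satisfied((-9.0, 2.0), acceptance)
```

Agreement fails as soon as either end of the tolerance interval falls outside the acceptance bounds, even if most of the interval lies within them.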

In addition to evaluating agreement, we compared the rate of histological remission (RHI score ≤3 with neither lamina propria nor epithelial neutrophils) and mean RHI scores with corresponding standard deviations (SD) in the 4-biopsy reference score group to the 1-, 2- and 3-biopsy groups respectively. Continuous data, categorical data, binary data with small cells, and non-normal continuous data were tested using the t test, chi-square test, Fisher's exact test and Wilcoxon rank test respectively.
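The remission definition above can be expressed as a simple predicate. The argument names and the coding of neutrophil items as integers, with 0 meaning absent, are assumptions made for this sketch.

```python
def in_histological_remission(rhi_score: int,
                              lamina_propria_neutrophils: int,
                              epithelial_neutrophils: int) -> bool:
    """RHI score of 3 or less with no neutrophils in the lamina propria or epithelium."""
    return (
        rhi_score <= 3
        and lamina_propria_neutrophils == 0
        and epithelial_neutrophils == 0
    )


# Hypothetical patients, not study data.
remission = in_histological_remission(2, 0, 0)
low_score_with_neutrophils = in_histological_remission(2, 1, 0)
```

The second example shows why both conditions are needed: a low total score alone does not qualify if neutrophils are present.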

We also evaluated the relationship between biopsy number and endoscopic (as measured by the Mayo Clinic Endoscopic Subscore) and clinical (as measured by the Partial Mayo Clinic Score) disease activity to explore whether the number of biopsies procured was influenced by disease activity. A linear relationship between the variables would indicate that disease activity influenced the number of biopsies procured, whereas the lack of a linear relationship would suggest that it did not.

Effect of Biopsy Location on Estimates of Histological Inflammation. For the disease location analysis, data from patients with extensive ulcerative colitis who had at least three biopsies collected from both the rectosigmoid colon and colonic segments proximal to the splenic flexure were used to evaluate whether biopsy location had an impact on estimates of histological disease activity. For this analysis, a minimum of three biopsies, rather than four, was chosen to ensure an adequate sample size given that only a subset of biobank patients had samples taken from colonic segments proximal to the splenic flexure.

Accordingly, a "total reference score," the composite of the worst item-level ratings across all available biopsies, was calculated for the rectosigmoid colon, and for colonic segments proximal to the splenic flexure, using the RHI, GS and VAS. Agreement between the total reference scores from the rectosigmoid colon and from colonic segments proximal to the splenic flexure was assessed using the bivariate errors-in-variable regression approach described above. The rate of histological remission (RHI score ≤3 with neither lamina propria nor epithelial neutrophils) in the rectosigmoid-only biopsy group compared to the group with biopsies taken from all colonic segments was also evaluated using the methods described above.

Construct Validity of Histological Measures. The construct validity of the differing approaches to histological scoring was compared by calculating the correlation between centrally read histological (ie the RHI, GS and VAS), centrally read endoscopic (ie the Mayo Clinic Endoscopic Subscore and Ulcerative Colitis Endoscopic Index of Severity), and clinical (ie the Partial Mayo Clinic Score, stool frequency subscore and rectal bleeding subscore) measures of disease activity. Validity was quantified based on the point estimates of the correlation coefficients and associated 95% confidence intervals, which were obtained using bootstrap methods. The entire study population was included in these analyses, and the results were stratified according to endoscopic (as measured by the Mayo Clinic Endoscopic Subscore) and clinical (as measured by the Partial Mayo Clinic Score and rectal bleeding subscore) disease activity.
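A bootstrap confidence interval for a correlation coefficient can be sketched as below. The type of correlation coefficient and the bootstrap variant are not specified here, so the Pearson coefficient, the percentile method, the number of resamples and the example data are all assumptions made for illustration.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x: Sequence[float], y: Sequence[float],
                 n_boot: int = 2000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval: resample patients (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates: List[float] = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined for a constant resample
        estimates.append(pearson_r(xs, ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical paired scores for ten patients (e.g. a histological and an endoscopic index).
histological = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
endoscopic = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]

r = pearson_r(histological, endoscopic)
ci_lower, ci_upper = bootstrap_ci(histological, endoscopic)
```

Resampling whole patients, rather than individual scores, keeps each histological value paired with its own endoscopic value, which is what the correlation depends on.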


All patients included in the University of California, San Diego biobank provided written informed consent. The institutional review board at University of California, San Diego approved the study protocol and materials. All subject data were de-identified and anonymised prior to being uploaded for central reading. The study was conducted in compliance with the Declaration of Helsinki.