Reproducibility of the WHO Histological Criteria for the Diagnosis of Philadelphia Chromosome-Negative Myeloproliferative Neoplasms

Umberto Gianelli; Anna Bossi; Ivan Cortinovis; Elena Sabattini; Claudio Tripodo; Emanuela Boveri; Alessia Moro; Riccardo Valli; Maurilio Ponzoni; Ada M Florena; Giulio F Orcioni; Stefano Ascani; Emanuela Bonoldi; Alessandra Iurlo; Luigi Gugliotta; Vito Franco


Mod Pathol. 2014;27(6):814-822. 

Agreement Among the Reviewers on the Morphological Features

Table 2 shows the percentage of patients for whom at least 3 out of 4 reviewers classified each category of the 18 morphological variables considered in the same way. High levels of crude agreement (≥70%) were reached for all of the morphological features, with the exception of the presence of naked nuclei (65%). Interestingly, the percentage of crude agreement among reviewers varied with the specific disease: for bone marrow cellularity, for example, agreement was higher in cases of primary myelofibrosis and polycythaemia vera (93% and 92%, respectively) than in essential thrombocythaemia (83%); for the myeloid-to-erythroid ratio, agreement was higher in essential thrombocythaemia or primary myelofibrosis (about 80%) than in polycythaemia vera (40%). Moreover, agreement among reviewers also varied with the category of each single morphological variable: for bone marrow cellularity in primary myelofibrosis, for example, agreement ranged from 84% when rated 'increased' to 9% when rated 'normal'; for the myeloid-to-erythroid ratio in essential thrombocythaemia, agreement ranged from 71% when rated 'normal' to 9% when rated 'increased'.
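As an illustration of the crude agreement measure described above, the following Python sketch counts the cases in which at least three of four reviewers chose the same category for one morphological variable. The function name `crude_agreement`, the `min_agree` parameter and the example ratings are hypothetical and are not taken from the study data.

```python
from collections import Counter


def crude_agreement(ratings, min_agree=3):
    """Percentage of cases in which at least `min_agree` reviewers
    assigned the same category to a morphological variable."""
    if not ratings:
        raise ValueError("ratings must contain at least one case")
    agreed = 0
    for case in ratings:
        # Number of reviewers who chose the most popular category
        top_count = Counter(case).most_common(1)[0][1]
        if top_count >= min_agree:
            agreed += 1
    return 100.0 * agreed / len(ratings)


# Hypothetical ratings of one variable by four reviewers (one tuple per case)
example = [
    ("increased", "increased", "increased", "normal"),     # 3 of 4 agree
    ("normal", "increased", "normal", "increased"),        # 2-2 split
    ("increased", "increased", "increased", "increased"),  # 4 of 4 agree
    ("normal", "normal", "reduced", "normal"),             # 3 of 4 agree
]
print(crude_agreement(example))  # 75.0
```

Restricting `ratings` to the cases of one disease, or to the cases in which a given category was chosen, would give disease-specific or category-specific percentages of the kind quoted above, although the exact denominators used in Table 2 are an assumption here.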

Relationship and Agreement Between Morphological Features and 'Consensus' Diagnosis

To assess the relationship between the reviewers' evaluations of each morphological variable and the 'consensus' diagnosis, we selected a subset of 11 morphological features that had proved the most statistically useful for the differential diagnosis among the three myeloproliferative neoplasms.

The 11 morphological variables selected for all further analyses were the following: overall bone marrow cellularity, amount and left-shifting of erythropoiesis, amount and left-shifting of granulopoiesis, myeloid-to-erythroid ratio, dense clusters of megakaryocytes, pleomorphic clusters of megakaryocytes, hyperlobulation and bulbous appearance of the megakaryocyte nuclei, and the grading of marrow fibrosis.

Figures 2a–k show the graphs obtained by the multiple correspondence analysis. To make the graphs easier to read, the 11 morphological features are shown separately, and the reader should only consider the distance between the points and their positions on the plane identified by the first two axes. The horizontal axis contrasts the morphological profiles of the patients affected by essential thrombocythaemia with those of patients affected by primary myelofibrosis, while the vertical axis contrasts polycythaemia vera with essential thrombocythaemia and primary myelofibrosis. The more the morphological profiles that characterize the diagnoses differ, the greater the distance between the points that represent the three diagnoses. Along the same lines, the higher the agreement between reviewers for each morphological feature, the smaller the distance between the points that represent them. If a category of a morphological feature and a particular diagnosis are plotted close together, that category is typical of that particular diagnosis.

Results reported in Figures 2a–k can be summarized as follows: (1a) bone marrow cellularity: good agreement among reviewers was found when it was regarded as 'normal' or 'increased'. As expected, the 'normal' category was more frequently associated by the four reviewers with essential thrombocythaemia; (1b) amount of erythropoiesis: good agreement was found when erythropoiesis was 'increased', and this category was more frequently associated with polycythaemia vera cases; (1c) left-shifting erythropoiesis: good agreement was found when this feature was 'absent', and this category does not support a diagnosis of polycythaemia vera; (1d) amount of granulopoiesis: moderate agreement was reached when 'normal' or 'increased', and agreement was lacking when 'reduced'. Normal granulopoiesis was more frequently associated with essential thrombocythaemia, and increased granulopoiesis with primary myelofibrosis; (1e) left-shifting granulopoiesis: good agreement was reached when 'absent', and this category was more frequently observed in essential thrombocythaemia; (1f) myeloid-to-erythroid ratio: good agreement was reached when 'increased', which was more frequent in primary myelofibrosis, whereas only moderate agreement was reached when 'normal' (more frequently associated with essential thrombocythaemia) or 'reduced' (more frequently associated with polycythaemia vera); (1g) dense clusters of megakaryocytes: good agreement was found when 'absent' and moderate-to-good agreement when 'present'. The presence of dense clusters of megakaryocytes was observed more frequently in primary myelofibrosis; (1h) pleomorphic clusters of megakaryocytes: good agreement was found when 'absent', and this category does not support the diagnosis of polycythaemia vera; (1i) hyperlobulated nuclei of megakaryocytes: moderate agreement was reached for both the 'absent' and 'present' categories; (1j) bulbous nuclei of megakaryocytes: good agreement was reached when 'present' (more frequent in primary myelofibrosis); and (1k) grade of marrow fibrosis: good agreement was found for grade 0 and grade 2, and moderate agreement for grade 1.

Agreement Among 'Personal' and 'Consensus' Diagnosis

Focusing on the 11 selected morphological features, we further investigated whether morphological analysis alone was sufficient to reach a correct diagnosis of Philadelphia chromosome-negative myeloproliferative neoplasms even in the absence of clinical data.

We calculated the percentage of crude agreement between the 'personal' diagnosis and the 'consensus' diagnosis. A case was considered classified (agreement reached) when at least three out of four reviewers made the same 'personal' diagnosis and that diagnosis corresponded to the 'consensus' one. We found that morphology alone allowed 72% of the cases to be correctly classified (Table 3).
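The classification rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `classified_correctly` and `percent_classified`, the diagnosis labels and the example cases are all hypothetical, not the study's data or code.

```python
from collections import Counter


def classified_correctly(personal, consensus, min_agree=3):
    """True when at least `min_agree` reviewers made a 'personal'
    diagnosis equal to the 'consensus' diagnosis."""
    return Counter(personal)[consensus] >= min_agree


def percent_classified(cases, min_agree=3):
    """Percentage of cases meeting the rule; `cases` is a list of
    (personal_diagnoses, consensus_diagnosis) pairs."""
    if not cases:
        raise ValueError("cases must contain at least one case")
    hits = sum(classified_correctly(p, c, min_agree) for p, c in cases)
    return 100.0 * hits / len(cases)


# Hypothetical cases: four 'personal' diagnoses plus the 'consensus' one
cases = [
    (("ET", "ET", "ET", "PMF"), "ET"),      # 3 of 4 match the consensus
    (("PV", "PV", "ET", "ET"), "PV"),       # only 2 match
    (("PMF", "PMF", "PMF", "PMF"), "PMF"),  # 4 of 4 match
    (("ET", "ET", "ET", "PV"), "PV"),       # reviewers agree, but not with the consensus
]
print(percent_classified(cases))  # 50.0
```

The last hypothetical case shows why the rule is stricter than agreement among reviewers alone: three reviewers concur, yet the case does not count because their shared diagnosis differs from the 'consensus' one.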

Moreover, we calculated the agreement between the 'personal' diagnosis of each of the four reviewers and the 'consensus' diagnosis. The results indicate a higher percentage of crude agreement, ranging from 65% to 84% (mean value: 76%), with moderate-to-good values of Cohen's kappa statistic (ranging from 0.41 to 0.80) (Table 4).
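For readers unfamiliar with the statistic, the sketch below computes an unweighted Cohen's kappa between one reviewer's 'personal' diagnoses and the 'consensus' diagnoses, correcting the observed agreement for the agreement expected by chance. The text does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption; the function name `cohen_kappa` and the example diagnoses are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases with identical labels
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the marginal frequencies of each label
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses for ten cases (not study data)
personal = ["ET", "ET", "PV", "PMF", "PMF", "ET", "PV", "PMF", "ET", "PV"]
consensus = ["ET", "ET", "PV", "PMF", "ET", "ET", "PV", "PMF", "PMF", "PV"]
print(round(cohen_kappa(personal, consensus), 2))  # approximately 0.70
```

In this hypothetical example the crude agreement is 80% and the chance-expected agreement is 34%, so kappa is (0.80 - 0.34) / (1 - 0.34), about 0.70, which illustrates how kappa can sit below the crude percentage.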