Prospective validation of a new imaging scorecard to assess leptomeningeal metastasis: A joint EORTC BTG and RANO effort
Emilie Le Rhun, Patrick Devos, Sebastian Winklhofer, Hafida Lmalem, Dieta Brandsma, Priya Kumthekar, Antonella Castellano, Annette Compter, Frederic Dhermain, Enrico Franceschi, Peter Forsyth, Julia Furtner, Norbert Galldiks, Jaime Gállego Pérez-Larraya, Jens Gempt, Elke Hattingen, Johann Martin Hempel, Slavka Lukacova, Giuseppe Minniti, Barbara O’Brien, Tjeerd J Postma, Patrick Roth, Roberta Rudà, Niklas Schaefer, Nils O Schmidt, Tom J Snijders, Steffi Thust, Martin van den Bent, Anouk van der Hoorn, Guillaume Vogin, Marion Smits, Joerg C Tonn, Kurt A Jaeckle, Matthias Preusser, Michael Glantz, Patrick Y Wen, Martin Bendszus, Michael Weller
Neuro Oncol
https://doi.org/10.1093/neuonc/noac043
Abstract
Background: Validation of the 2016 RANO MRI scorecard for leptomeningeal metastasis failed for multiple reasons. Accordingly, this joint EORTC Brain Tumor Group and RANO effort sought to prospectively validate a revised MRI scorecard for response assessment in leptomeningeal metastasis. Methods: Coded paired cerebrospinal MRI scans of 20 patients with leptomeningeal metastases from solid cancers, obtained at baseline and at follow-up after treatment, were provided together with instructions for assessment via the EORTC imaging platform. The Kappa coefficient was used to evaluate pairwise interobserver agreement. Results: Thirty-five raters participated, including 9 neuroradiologists, 17 neurologists, 4 radiation oncologists, 3 neurosurgeons, and 2 medical oncologists. Among individual leptomeningeal metastasis-related imaging findings at baseline, the best median concordance was noted for hydrocephalus (Kappa = 0.63), and the worst median concordance for spinal linear enhancing disease (Kappa = 0.46). The median concordance of raters for the overall response assessment was moderate (Kappa = 0.44). Notably, the interobserver agreement for the presence of parenchymal brain metastases at baseline was fair (Kappa = 0.29) and virtually absent for their response to treatment. Of 700 ratings (20 patients × 35 raters), 394 (56%) were fully completed. In 308 of these 394 fully completed ratings (78%), the overall response assessment perfectly matched the summary interpretation of the single ratings as proposed in the scorecard instructions. Conclusion: This study confirms the principal utility of the new scorecard, but also indicates the need for training in MRI assessment with a dedicated reviewer panel in clinical trials. Electronic case report forms with “blocking options” may be required to enforce completeness and quality of scoring.
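The pairwise agreement statistic named in the Methods can be illustrated with a minimal sketch. It assumes Cohen's kappa for two raters (the abstract says only "Kappa coefficient") and summarizes all rater pairs by their median, as the Results do. The function names and the ratings shown are hypothetical illustrations, not study data or the authors' analysis code.

```python
from itertools import combinations
from statistics import median


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels on the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(a)
    # Observed agreement: share of cases on which both raters gave the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    categories = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        # Convention chosen for this sketch: both raters used one identical
        # category throughout, so kappa is otherwise undefined (0/0).
        return 1.0
    return (observed - expected) / (1 - expected)


def median_pairwise_kappa(ratings):
    """Median of Cohen's kappa over every pair of raters."""
    pairs = combinations(sorted(ratings), 2)
    return median(cohen_kappa(ratings[r1], ratings[r2]) for r1, r2 in pairs)


# Hypothetical example: 3 raters scoring 4 patients for one finding
# (1 = present, 0 = absent).
demo = {
    "rater_A": [1, 1, 0, 0],
    "rater_B": [1, 1, 0, 0],
    "rater_C": [1, 0, 0, 0],
}
print(median_pairwise_kappa(demo))  # prints 0.5
```

In the demo, raters A and B agree perfectly (kappa = 1.0), while each agrees with rater C on 3 of 4 cases against a chance agreement of 0.5 (kappa = 0.5), so the median over the three pairs is 0.5. The completeness figures in the Results follow from simple proportions: 394/700 ≈ 56% and 308/394 ≈ 78%.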