Pseudo-prospective paraclinical interaction of radiology residents with a deep learning system for the detection of prostate cancer: experience, performance and identification of the need for intermittent recalibration


Invest Radiol. 2022 April 22. doi: 10.1097/RLI.0000000000000878. Online ahead of print.


OBJECTIVES: The aim of this study was to estimate the prospective utility of a previously retrospectively validated convolutional neural network (CNN) for the detection of prostate cancer (PC) on magnetic resonance imaging (MRI) of the prostate.

MATERIALS AND METHODS: The biparametric part (T2-weighted and diffusion-weighted) of the clinical multiparametric prostate MRI of consecutive men included between November 2019 and September 2020 was analyzed by a CNN, fully automatically and individually, shortly after image acquisition (pseudoprospective design). Radiology residents performed 2 Prostate Imaging Reporting and Data System (PI-RADS) assessments of the multiparametric dataset independently of clinical reporting (paraclinical design), one before and one after reviewing the CNN results, and completed a survey. Clinically significant PC was defined as International Society of Urological Pathology grade 2 or higher PC on combined targeted and extended systematic transperineal MRI/transrectal ultrasound biopsy. Sensitivities and specificities on a patient and prostate sextant basis were compared using the McNemar test and compared with the receiver operating characteristic (ROC) curve of the CNN. Survey results were summarized as absolute counts and percentages.
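To illustrate the paired comparison named above, the following is a minimal sketch of an exact (binomial) McNemar test. The abstract does not state which variant of the test the authors used, so the exact form, the function names, and the toy inputs are assumptions for illustration only, not the study's analysis code or data.

```python
from math import comb


def discordant_counts(correct_a, correct_b):
    """Count discordant pairs between two readers rated on the same cases.

    correct_a, correct_b: equal-length sequences of booleans, True where the
    respective reader classified the case correctly (hypothetical inputs).
    """
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if not x and y)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy example (made-up values, not study data).
reader_a = [True, True, True, False, True, True, True, False]
reader_b = [True, False, False, False, True, False, True, True]
b, c = discordant_counts(reader_a, reader_b)
print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs enter the test, which is why it suits comparing sensitivities or specificities of two readers on the same patients or sextants.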

RESULTS: A total of 201 men were included. The CNN achieved an ROC area under the curve (AUC) of 0.77 on a patient basis. Using the PI-RADS ≥3-emulating probability threshold (c3), the CNN had a patient-based sensitivity of 81.8% and specificity of 54.8%, not statistically different from the current clinical routine PI-RADS ≥4 assessment at 90.9% and 54.8%, respectively (P = 0.30/P = 1.0). In general, residents achieved similar sensitivity and specificity before and after reviewing the CNN results. On a prostate sextant basis, clinical assessment had the highest AUC at 0.82, which was higher than that of the CNN, though not significantly (AUC = 0.76, P = 0.21), and significantly higher than resident performance before and after reviewing the CNN results (AUC = 0.76/0.76, P ≤ 0.03). The resident survey indicated that residents found the CNN helpful and clinically useful.
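As an illustration of how a probability threshold turns CNN output into sensitivity and specificity, the sketch below dichotomizes per-patient probabilities at a cutoff. The numeric value of the c3 threshold is not given in the abstract, so the `threshold` value, the function name, and the toy data are hypothetical.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Sensitivity and specificity when calling a case positive if
    its predicted probability is at or above the threshold.

    probs: predicted probabilities of clinically significant cancer.
    labels: 1 if biopsy confirmed clinically significant cancer, else 0.
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        called_positive = p >= threshold
        if y == 1:
            if called_positive:
                tp += 1
            else:
                fn += 1
        else:
            if called_positive:
                fp += 1
            else:
                tn += 1
    # Return NaN when a class is absent, since the rate is undefined.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy example (made-up values, not study data).
toy_probs = [0.9, 0.7, 0.2, 0.6, 0.1, 0.3]
toy_labels = [1, 1, 1, 0, 0, 0]
print(sensitivity_specificity(toy_probs, toy_labels, threshold=0.5))
```

Lowering the threshold trades specificity for sensitivity, which is the trade-off traced out by the ROC curve.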

CONCLUSIONS: Pseudo-prospective paraclinical integration of fully automated CNN-based detection of suspicious lesions on multiparametric MRI of the prostate was demonstrated and showed good acceptance among residents, while no significant improvement in resident performance was found. Overall CNN performance was preserved despite an observed change in CNN calibration, identifying the need for continuous quality control and recalibration.
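One simple way to monitor the kind of calibration change mentioned above is to compare the mean predicted probability with the observed event rate (often called calibration-in-the-large). The abstract does not describe how calibration was assessed, so this check, the `tolerance` value, and the function names are assumptions offered as a sketch rather than the authors' method.

```python
def calibration_in_the_large(probs, labels):
    """Mean predicted probability minus observed event rate.

    A value near 0 suggests predictions are calibrated on average;
    a positive value suggests overestimation, a negative one underestimation.
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")
    mean_predicted = sum(probs) / len(probs)
    observed_rate = sum(labels) / len(labels)
    return mean_predicted - observed_rate


def needs_recalibration(probs, labels, tolerance=0.1):
    """Flag a batch whose average miscalibration exceeds a chosen tolerance.

    The tolerance is a hypothetical quality-control limit, not a value
    taken from the study.
    """
    return abs(calibration_in_the_large(probs, labels)) > tolerance


# Toy example (made-up values, not study data).
batch_probs = [0.2, 0.4, 0.6, 0.8]
batch_labels = [0, 0, 1, 1]
print(calibration_in_the_large(batch_probs, batch_labels))
print(needs_recalibration(batch_probs, batch_labels))
```

Because rescaling probabilities monotonically leaves the ROC curve unchanged, a model can drift in calibration while its AUC stays the same, which is consistent with the observation that overall performance was preserved.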

PMID: 35467572 | DOI: 10.1097/RLI.0000000000000878

Patrick L. Williams