Coen de Vente • Estimating Uncertainty of Deep Neural Networks for Age-related Macular Degeneration Grading using Optical Coherence Tomography

Abstract

Purpose

Deep convolutional neural networks (CNNs) are increasingly used for eye disease screening and diagnosis. However, CNNs, and the best-performing variants in particular, are generally overconfident in their predictions. Well-calibrated uncertainty estimates are necessary for these systems to be useful in clinical practice and to increase clinicians’ trust in the estimated diagnosis. We present a method for providing confidence scores for CNN predictions in age-related macular degeneration (AMD) grading from optical coherence tomography (OCT).

Methods

A total of 1,264 OCT volumes from 633 patients in the European Genetic Database (EUGENDA) were graded as one of five stages of AMD (No AMD, Early AMD, Intermediate AMD, Advanced AMD: GA, and Advanced AMD: CNV). Ten 3D DenseNet-121 models, each taking a full OCT volume as input, were used to predict the corresponding AMD stage. All networks were trained on the same dataset, but each was initialized differently. The class with the maximum average softmax output across these models was used as the final prediction, and the normalized average softmax output for that class served as the confidence measure.
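The ensembling step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `ensemble_predict`, the array layout, and the toy numbers are assumptions, and "normalized" is interpreted here as rescaling the averaged softmax vector so that it sums to 1.

```python
import numpy as np

def ensemble_predict(softmax_outputs):
    """Combine per-model softmax outputs into a prediction and a confidence.

    softmax_outputs: array of shape (n_models, n_cases, n_classes).
    Returns (predictions, confidences), each of shape (n_cases,).
    """
    probs = np.asarray(softmax_outputs, dtype=float)
    # Average the softmax vectors over the models.
    mean_probs = probs.mean(axis=0)
    # Assumed normalization: rescale so each averaged vector sums to 1.
    mean_probs = mean_probs / mean_probs.sum(axis=1, keepdims=True)
    # Final prediction: class with the maximum average softmax output.
    predictions = mean_probs.argmax(axis=1)
    # Confidence: normalized average softmax output for that class.
    confidences = mean_probs.max(axis=1)
    return predictions, confidences

# Toy example (made-up numbers): 2 models, 2 cases, 5 AMD stages.
toy_outputs = np.array([
    [[0.9, 0.05, 0.05, 0.0, 0.0], [0.1, 0.5, 0.4, 0.0, 0.0]],
    [[0.7, 0.20, 0.10, 0.0, 0.0], [0.1, 0.3, 0.6, 0.0, 0.0]],
])
predictions, confidences = ensemble_predict(toy_outputs)
print(predictions, confidences)
```

In the toy example the two models disagree on the second case, so the averaged confidence for the winning class is noticeably lower than for the first case, which is the behaviour the ensemble is meant to expose.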

Results

The algorithm achieved an area under the receiver operating characteristic curve of 0.9785 and a quadratic-weighted kappa score of 0.8935. The mean uncertainty, calculated as 1 minus the mean confidence score, was 1.9 times as high for incorrect predictions as for correct predictions. When using only the probability output of a single network, this ratio was 1.4. Another measure of uncertainty estimation performance is the Expected Calibration Error (ECE), for which lower values are better. Compared with the probability output of a single network, the method improved the ECE from 0.0971 to 0.0324. Figure 1 shows examples of both confident and unconfident predictions.
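The two uncertainty metrics reported above can be sketched as follows. This is an illustrative version under stated assumptions: the abstract does not specify the ECE binning, so the 10 equal-width bins, the function names, and the toy values are hypothetical choices rather than the settings used in the study.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: case-weighted mean gap between accuracy and confidence per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each case to one of n_bins equal-width confidence bins.
    bin_ids = np.digitize(confidences, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of cases in the bin
    return ece

def uncertainty_ratio(confidences, correct):
    """Mean uncertainty of incorrect predictions over that of correct ones."""
    uncertainty = 1.0 - np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    return uncertainty[~correct].mean() / uncertainty[correct].mean()

# Toy example (made-up numbers): four cases, one of them misclassified.
toy_conf = [0.95, 0.95, 0.55, 0.55]
toy_correct = [True, True, True, False]
ece = expected_calibration_error(toy_conf, toy_correct)
ratio = uncertainty_ratio(toy_conf, toy_correct)
print(ece, ratio)
```

A ratio above 1 means that wrong predictions are, on average, flagged as more uncertain than right ones; a lower ECE means that the stated confidences track the observed accuracy more closely.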

Conclusions

We present a method for improving uncertainty estimation for AMD grading in OCT by combining the outputs of multiple individually trained CNNs. The increased reliability of the system's confidence scores can contribute to building trust in CNNs for retinal disease screening. Furthermore, this technique is a first step towards selective prediction in retinal disease screening, where only cases with high-uncertainty predictions need to be referred for expert evaluation.
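Selective prediction, as outlined above, could look like the sketch below: cases whose confidence falls under a threshold are referred to an expert, and the rest are handled automatically. The abstract does not define a referral rule, so the function name `selective_prediction`, the threshold of 0.8, and the toy values are purely hypothetical.

```python
import numpy as np

def selective_prediction(confidences, correct, threshold):
    """Refer low-confidence cases; report coverage and accuracy on the rest.

    Returns (referred_mask, coverage, kept_accuracy).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    # High uncertainty (low confidence) cases go to expert evaluation.
    referred = confidences < threshold
    kept = ~referred
    # Coverage: fraction of cases the system decides on its own.
    coverage = kept.mean()
    # Accuracy on the retained cases (NaN if every case is referred).
    kept_accuracy = correct[kept].mean() if kept.any() else float("nan")
    return referred, coverage, kept_accuracy

# Toy example (made-up numbers) with a hypothetical threshold of 0.8.
toy_conf = [0.95, 0.90, 0.55, 0.60]
toy_correct = [True, True, False, True]
referred, coverage, kept_accuracy = selective_prediction(toy_conf, toy_correct, 0.8)
print(referred, coverage, kept_accuracy)
```

In practice the threshold would trade expert workload against the error rate on automatically graded cases, which is why well-calibrated confidence scores matter for this use.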

ARVO Presentation

Publication

Please use this when referring to the post or publication:

C. de Vente, M. van Grinsven, S. De Zanet, A. Mosinska, R. Sznitman, C. Klaver and C.I. Sánchez. "Estimating Uncertainty of Deep Neural Networks for Age-related Macular Degeneration Grading using Optical Coherence Tomography", in: Association for Research in Vision and Ophthalmology, 2020.

URL: https://iovs.arvojournals.org/article.aspx?articleid=2769262

@conference{Vent20,
  author = {de Vente, Coen and van Grinsven, Mark and De Zanet, Sandro and Mosinska, Agata and Sznitman, Raphael and Klaver, Caroline and S\'{a}nchez, Clara I.},
  booktitle = {Association for Research in Vision and Ophthalmology},
  title = {Estimating Uncertainty of Deep Neural Networks for Age-related Macular Degeneration Grading using Optical Coherence Tomography},
  url = {https://iovs.arvojournals.org/article.aspx?articleid=2769262},
  abstract = {Purpose: Deep convolutional neural networks (CNNs) are increasingly being used for eye disease screening and diagnosis. Especially the best performing variants, however, are generally overconfident in their predictions. For usefulness in clinical practice and increasing clinicians’ trust on the estimated diagnosis, well-calibrated uncertainty estimates are necessary. We present a method for providing confidence scores of CNNs for age-related macular degeneration (AMD) grading in optical coherence tomography (OCT). Methods: 1,264 OCT volumes from 633 patients from the European Genetic Database (EUGENDA) were graded as one of five stages of AMD (No AMD, Early AMD, Intermediate AMD, Advanced AMD: GA, and Advanced AMD: CNV). Ten different 3D DenseNet-121 models that take a full OCT volume as input were used to predict the corresponding AMD stage. These networks were all trained on the same dataset. However, each of these networks were initialized differently. The class with the maximum average softmax output of these models was used as the final prediction. The confidence measure was the normalized average softmax output for that class. Results: The algorithm achieved an area under the Receiver Operating Characteristic of 0.9785 and a quadratic-weighted kappa score of 0.8935. The mean uncertainty, calculated as 1 - the mean confidence score, for incorrect predictions was 1.9 times as high as the mean uncertainty for correct predictions. When only using the probability output of a single network, this ratio was 1.4. Another measure for uncertainty estimation performance is the Expected Calibration Error (ECE), where a lower value is better. When comparing the method to the probability output of a single network, the ECE improved from 0.0971 to 0.0324. Figure 1 shows examples of both confident and unconfident predictions. Conclusions: We present a method for improving uncertainty estimation for AMD grading in OCT, by combining the output of multiple individually trained CNNs. This increased reliability of system confidences can contribute to building trust in CNNs for retinal disease screening. Furthermore, this technique is a first step towards selective prediction in retinal disease screening, where only cases with high uncertainty predictions need to be referred for expert evaluation.},
  year = {2020},
  month = {6},
  scholar_id = {13938730969248371423}
}

Estimating Uncertainty of Deep Neural Networks for Age-related Macular Degeneration Grading using Optical Coherence Tomography

20 June 2020

Presentation at ARVO (PPTX)
