From Retinal Pixels to Patients

This is a short summary of our paper “From Retinal Pixels to Patients: Evolution of Deep Learning Research in Diabetic Retinopathy Screening” by Muskaan Chopra, Lorenz Sparrenberg, Armin Berger, Sarthak Khanna, Jan H. Terheyden, and Rafet Sifa, published in the proceedings of the 2025 IEEE International Conference on Big Data. See here for the published version and here for the open-access arxiv version.

Deep learning has changed diabetic retinopathy screening, but the field has also learned a hard lesson: strong benchmark numbers do not automatically mean clinical readiness.

This survey looks at the evolution of deep learning research for diabetic retinopathy screening from 2016 to 2025, covering more than 50 studies and over 20 datasets. The paper traces the field from early CNN-based proof-of-concept systems to newer methods involving self-supervised learning, domain generalization, federated learning, uncertainty estimation, attention mechanisms, and neuro-symbolic models.

A major theme is the gap between technical performance and clinical deployment. Early models often reported very high internal accuracy, but later work showed that performance could drop sharply under external validation because of differences in cameras, patient populations, grading protocols, dataset splits, and label quality.

The survey argues that the field needs more than better architectures. It needs reproducible code, public datasets, per-patient evaluation, external validation, calibration, uncertainty reporting, and clinically meaningful interpretability. In other words, the problem is no longer only “can a model classify fundus images?” but “can it be trusted across hospitals, devices, and patient groups?”

Figure 1: Deep Learning for Diabetic Retinopathy

TL;DR: Diabetic retinopathy AI has moved from impressive CNN benchmarks toward a harder and more important question: how to build screening systems that are reproducible, generalizable, interpretable, and clinically trustworthy.