| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 71 |
| Year of Publication: 2026 |
| Authors: Padmashree G., Murali G. Rao |
| DOI: 10.5120/ijca2026925626 |
Padmashree G., Murali G. Rao. Local–Global Feature Fusion using CNN and Vision Transformer with Ensemble Post-Classification for Diabetic Retinopathy Diagnosis. International Journal of Computer Applications. 187, 71 (Jan 2026), 15-24. DOI=10.5120/ijca2026925626
Diabetic retinopathy (DR) is a leading cause of vision impairment globally, and timely, accurate diagnosis is needed to prevent irreversible damage. This paper proposes a hybrid deep learning framework that combines local and global feature representations for robust DR classification from retinal fundus images. Local features are extracted by a convolutional neural network (CNN) branch that captures fine-grained pathological patterns such as microaneurysms and hemorrhages. In parallel, global contextual features are learned by a Vision Transformer (ViT), which models long-range dependencies across the retinal image. The features from both branches are fused and passed through a series of dense layers for initial classification. To further enhance generalization and interpretability, features from the Global Average Pooling layer are used to train a Random Forest classifier. The proposed methodology is evaluated on a benchmark DR dataset with five severity classes. Extensive experiments and ablation studies demonstrate the effectiveness of the architecture in capturing both fine-grained and holistic features, leading to improved classification performance. The results suggest that fusing local and global features, combined with ensemble post-classification, can provide a robust and scalable solution for automated DR diagnosis.
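The local-global fusion idea described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the random 3x3 kernels, the 4x4 patch size, and the single-head attention with identity query/key/value projections are all assumptions chosen to keep the example small. The local branch applies a convolution with ReLU followed by global average pooling, the global branch applies self-attention over image patches followed by mean pooling, and the two vectors are concatenated.

```python
import math
import random


def conv_gap(image, kernel):
    """Valid convolution with ReLU, then global average pooling to one scalar."""
    h, w, k = len(image), len(image[0]), len(kernel)
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            s = sum(kernel[a][b] * image[i + a][j + b]
                    for a in range(k) for b in range(k))
            total += max(s, 0.0)  # ReLU
            count += 1
    return total / count


def local_features(image, kernels):
    """Local (CNN-like) branch: one pooled response per kernel."""
    return [conv_gap(image, kernel) for kernel in kernels]


def patch_tokens(image, p):
    """Flatten non-overlapping p x p patches into token vectors."""
    h, w = len(image), len(image[0])
    return [[image[i + a][j + b] for a in range(p) for b in range(p)]
            for i in range(0, h - p + 1, p)
            for j in range(0, w - p + 1, p)]


def self_attention(tokens):
    """Single-head scaled dot-product attention with identity Q/K/V
    projections (an assumption; a real ViT learns these projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(wt / z * v[c] for wt, v in zip(weights, tokens))
                    for c in range(d)])
    return out


def global_features(image, p):
    """Global (ViT-like) branch: attend across patches, then mean-pool."""
    attended = self_attention(patch_tokens(image, p))
    n = len(attended)
    return [sum(t[c] for t in attended) / n for c in range(len(attended[0]))]


def fused_features(image, kernels, p=4):
    """Fuse the two branches by concatenating their feature vectors."""
    return local_features(image, kernels) + global_features(image, p)


# Toy 8x8 "image" and four random 3x3 kernels (hypothetical stand-ins for
# a fundus image and learned CNN filters).
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]
fused = fused_features(image, kernels, p=4)
```

In a full pipeline, fused vectors of this kind would feed the dense classification layers, and pooled features could additionally be passed to a separate classifier such as scikit-learn's `RandomForestClassifier` for the post-classification step.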