Department of Electronics and Communication Engineering, S V University, Tirupati, Andhra Pradesh, India
Accurate identification of gastrointestinal (GI) diseases such as adenocarcinomas from Endoscopic Ultrasound (EUS) imaging is essential for timely diagnosis and, ultimately, better clinical outcomes. However, EUS images exhibit significant inherent noise, low color contrast, and complex textural elements, which make reliable automated analysis challenging. Here we present AUTOEUS, a lightweight deep learning framework that improves upon prior models and performs both anatomical region classification and adenocarcinoma detection from EUS images. The proposed pipeline includes a two-stage preprocessing procedure, median filtering for noise reduction followed by Y-channel histogram equalization for contrast enhancement, which improves the clarity of EUS images. In addition, a teacher-student knowledge distillation architecture is employed in which a ResNet-50 teacher network guides a compact convolutional student, reducing computational cost while maintaining predictive accuracy. Experimental evaluations in MATLAB, using augmented image datastores and five-fold binary classification metrics, yielded sound diagnostic performance (cecum: 90.70%; ileum: 95.81%; pylorus: 80.00%; rectum: 90.23%; stomach: 90.23%), with F1-scores reaching up to 97.06%. These performance metrics and visualization results confirm the model's ability to identify true positive disease cases with confidence across a number of adenocarcinoma types.
Medical imaging is essential for disease diagnosis, surgical assistance, and clinical decision-making. Among imaging modalities, endoscopic ultrasound (EUS) is a powerful diagnostic tool for detecting gastrointestinal pathology and submucosal lesions, as it provides high-resolution internal images of the gastrointestinal organs. However, EUS image evaluation remains a largely manual, laborious effort that relies on the subjective experience and visual impression of the radiologist. Image quality, speckle noise, and subtle tissue texture variations degrade intra- and inter-observer reliability and increase subjectivity. Research efforts have therefore moved toward automated and standardised EUS image classification frameworks. Earlier methods for medical image classification chiefly used handcrafted features such as texture descriptors, edge statistics, or morphological properties [1]–[3]. Although these approaches offered early opportunities to investigate image-based diagnostics, their performance was constrained by poor generalization and vulnerability to variations in illumination and noise. The introduction of deep learning transformed medical image analysis by enabling end-to-end feature extraction and classification. Convolutional Neural Networks (CNNs) and region-based detectors such as Faster R-CNN [4], [9] have achieved strong results in demanding imaging scenarios such as cancer diagnosis, segmentation, and disease localization. More recent work on multi-feature fusion, attention mechanisms, and ensemble learning has further improved diagnostic accuracy [2], [6], [10]. Despite these advances, deep learning models remain computationally expensive, requiring large labelled datasets and high-end GPUs for training and inference.
This remains a significant limitation for real-time or embedded clinical environments, where models must be both efficient and interpretable. EUS datasets typically exhibit high similarity across classes and high variability within classes; these characteristics, combined with the typically limited training data, make generalization difficult. An effective solution must therefore be lightweight yet accurate, maintaining diagnostic performance while remaining computationally efficient. To address these challenges, we propose AUTOEUS, a lightweight deep learning framework for EUS image classification that combines image enhancement, knowledge distillation, and compact CNN modeling. The proposed system includes a preprocessing pipeline that applies median filtering and Y-channel histogram equalization to suppress noise and increase contrast prior to training. A CNN-based teacher network (ResNet-50) learns high-level semantic representations, and its knowledge is distilled into a student model that is smaller by design, enabling lower complexity and faster inference. The teacher-student methodology thus transfers the teacher's semantic knowledge to a compact student model while preserving strong discriminative power and limiting resource consumption. The remainder of the paper is organized as follows: Section II surveys related work. Section III presents the methodology and system design of AUTOEUS. Section IV presents the experimental setup and performance results. Section V concludes the paper and suggests directions for future research.
LITERATURE SURVEY
Medical image analysis has evolved significantly over the last 20 years, with many studies developing methods for image enhancement, segmentation, and classification using deep learning–based algorithms. The earliest studies focused on preprocessing approaches to enhance the visual quality or diagnostic value of images. For instance, Yan and Guohua [1] proposed a direct image enhancement technique and showed improved visibility of subtle structures in medical images, which directly improves classification accuracy. Similarly, Bo et al. [2] suggested a multi-feature fusion strategy in scale space that permits a holistic representation of image patterns and enhances classification accuracy. Segmentation is an important step in medical image interpretation. Jinmei and Zuoyong [3] introduced an improved mathematical morphology algorithm for medical image segmentation that preserves boundaries better and is less susceptible to noise. Deep learning methods have also advanced image understanding tasks. Li et al. [4] applied the Faster R-CNN framework to cancer image detection and achieved improved object localization and classification on both histopathological and radiological data. More recently, researchers have explored semi-supervised and attention-based approaches to handle limited labeled data and model contextual information. Bakalo et al. [5] presented a deep dual-branch network for weakly and semi-supervised medical image detection with high applicability to partially annotated datasets. Similarly, An and Liu [6] developed a multilayer boundary perception self-attention model for medical image segmentation, reporting better boundary-aware feature extraction. Noise suppression is an additional concern in ultrasound imaging.
Pradeep and Nirmaladevi [7] surveyed speckle noise suppression methods across spatial-domain, transform-domain, and CNN-based approaches, highlighting the importance of preprocessing for image clarity. To optimize training on small datasets, Masquelin et al. [8] used wavelet decomposition as pretraining and demonstrated that frequency-based preprocessing can accelerate deep learning on small medical datasets. The work of Ren et al. [9] on Faster R-CNN presented a general framework for real-time object detection built on region proposal networks, which attracted the interest of medical imaging researchers. Huilan and Hui [10] studied an iterative training and ensemble learning framework to enhance classification accuracy, with implications for ensemble feature learning and knowledge distillation. Hao et al. [11] provided a holistic examination of image enhancement algorithms, focusing on applications that increase diagnostic certainty. Additionally, Teng et al. [12] introduced an image similarity-based recognition framework using frame sequence analysis, exemplifying a broader interest in temporal and contextual cues for image-based recognition in medical imaging. Collectively, these studies chart the development from early image processing techniques to advanced deep learning architectures. Insights from this prior work inform the proposed AUTOEUS framework, which combines enhanced preprocessing, knowledge distillation, and a lightweight CNN to classify endoscopic ultrasound (EUS) images effectively.
METHODOLOGY
The AUTOEUS framework discussed in this paper aims to accurately and efficiently classify Endoscopic Ultrasound (EUS) images using advanced preprocessing, data augmentation, and a knowledge distillation-based lightweight deep learning architecture. The methodology includes several phases: dataset preparation, image preprocessing, data augmentation, teacher-student network training, and performance evaluation. The overall workflow is presented in the block diagram and flowchart (Figs. 1 and 2).
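The two preprocessing stages, median filtering followed by histogram equalization of the luminance (Y) channel, can be sketched as follows. The paper implements its pipeline in MATLAB; this numpy-only Python sketch is illustrative, and the 3×3 median window and BT.601 YCbCr conversion are assumptions not specified in the text.

```python
import numpy as np

def median_filter_gray(img, k=3):
    """k x k median filter via sliding-window stacking (numpy only)."""
    p = k // 2
    padded = np.pad(img, p, mode='edge')
    windows = [padded[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(k) for j in range(k)]
    return np.median(np.stack(windows, axis=0), axis=0)

def hist_equalize(channel):
    """Histogram equalization of a uint8 channel via its normalized CDF."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[channel]

def preprocess_eus(rgb):
    """Median-filter each channel, then equalize only the Y channel (BT.601)."""
    rgb = rgb.astype(np.float64)
    den = np.stack([median_filter_gray(rgb[..., c]) for c in range(3)], axis=-1)
    # RGB -> YCbCr (BT.601); chroma is left untouched to preserve color balance
    y  = 0.299 * den[..., 0] + 0.587 * den[..., 1] + 0.114 * den[..., 2]
    cb = 128 - 0.168736 * den[..., 0] - 0.331264 * den[..., 1] + 0.5 * den[..., 2]
    cr = 128 + 0.5 * den[..., 0] - 0.418688 * den[..., 1] - 0.081312 * den[..., 2]
    y_eq = hist_equalize(np.clip(y, 0, 255).astype(np.uint8)).astype(np.float64)
    # YCbCr -> RGB with the equalized luminance
    r = y_eq + 1.402 * (cr - 128)
    g = y_eq - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y_eq + 1.772 * (cb - 128)
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```

Equalizing only the Y channel stretches luminance contrast without shifting hue, which is why the chroma channels are passed through unchanged.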
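The teacher-student training objective can be sketched with a standard Hinton-style distillation loss, blending a temperature-softened KL term against the teacher's outputs with a cross-entropy term against the ground-truth labels. The temperature T, weight alpha, and this exact loss form are assumptions for illustration; the paper does not specify them here.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable row-wise softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher_T || student_T) + (1 - alpha) * CE(labels, student).

    The T^2 factor keeps the soft-target gradient magnitude comparable
    across temperatures; T and alpha here are illustrative defaults.
    """
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=1).mean()
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

When the student's logits match the teacher's, the KL term vanishes and only the hard-label cross-entropy remains, so the compact student is pulled toward the teacher's softened class similarities while still fitting the labels.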
Rama Shanthi*, I. Kullayamma, AUTOEUS: Smart Detection of Gastrointestinal Abnormalities Using Lightweight Deep Learning, Int. J. Sci. R. Tech., 2025, 2 (12), 147-155. https://doi.org/10.5281/zenodo.17876958