Water bodies are increasingly polluted due to the accumulation of floating waste such as plastics, organic debris, and other solid materials. These pollutants negatively impact aquatic ecosystems and human health. Traditional monitoring approaches rely on manual observation, which is time-consuming and not suitable for continuous surveillance.
Recent developments in deep learning have enabled automated object detection systems capable of identifying multiple objects in real-time. Among these, the YOLO (You Only Look Once) models are widely used for real-time object detection [1].
This work proposes a YOLOv8-based system to detect garbage in water bodies. The system is designed to classify multiple categories of waste and provide real-time detection results, making it suitable for environmental monitoring applications.
LITERATURE REVIEW
The application of deep learning in environmental monitoring has gained significant attention in recent years, particularly for waste detection and classification tasks. Convolutional Neural Networks (CNNs) have proven highly effective in extracting spatial features from images, enabling accurate object detection in complex environments.
The YOLO (You Only Look Once) family of models introduced a single-stage detection mechanism that significantly improved real-time object detection performance. Subsequent versions, including YOLOv5 and YOLOv8, further enhanced detection accuracy and computational efficiency.
In the context of aquatic environments, the AquaVision dataset and benchmark provide a comprehensive framework for analyzing visual data in water-based scenarios. The AquaVision study highlights challenges such as light reflection, occlusion, and dynamic backgrounds, which significantly impact detection accuracy.
Additionally, datasets such as the Trash Dataset have been widely used for training models to detect waste objects across multiple categories. These datasets enable robust training and evaluation of object detection models in real-world scenarios.
Despite these advancements, detecting garbage in water bodies remains challenging due to environmental variability. This study leverages YOLOv8 along with a curated dataset to address these limitations and improve detection performance.
PROPOSED METHODOLOGY
The proposed system employs a deep learning-based object detection approach using the YOLOv8 model to identify garbage in water bodies. The methodology consists of dataset preparation, preprocessing, model training, and detection.
A labeled dataset of waste images is used, where objects are annotated in YOLO format with bounding box coordinates and class labels. The dataset is divided into training and validation sets to ensure effective learning and evaluation.
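In YOLO format, each line of a label file stores `class x_center y_center width height`, with all coordinates normalized to the image dimensions. A minimal sketch of converting one such line back to pixel-space corners (the class index and box values below are illustrative, not taken from the dataset):

```python
def parse_yolo_label(line, img_w, img_h):
    """Convert one YOLO-format label line (class cx cy w h, all values
    normalized to [0, 1]) into a class id and pixel-space corner box."""
    cls, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    return int(cls), (x1, y1, x2, y2)

# Example: one annotated object centred in a 640 x 480 image
cls_id, box = parse_yolo_label("4 0.5 0.5 0.25 0.2", 640, 480)
```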
Input images are preprocessed through resizing and normalization, and data augmentation techniques are applied to improve robustness against variations in lighting and background conditions.
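The preprocessing step can be sketched as follows. This is a simplified stand-in (nearest-neighbour resize, scaling to [0, 1], horizontal flip) for the resizing, normalization, and augmentation that the training framework performs internally:

```python
import numpy as np

def preprocess(img, size=512):
    """Nearest-neighbour resize to size x size and scale pixels to [0, 1]."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = img[rows][:, cols]
    return resized.astype(np.float32) / 255.0

def hflip(img):
    """Horizontal-flip augmentation: mirror the image left-to-right."""
    return img[:, ::-1]

# A random stand-in for one 480 x 640 RGB frame of a water body
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
x = preprocess(frame)   # shape (512, 512, 3), values in [0, 1]
```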
The YOLOv8 model is initialized with pre-trained weights and trained on the prepared dataset. During training, the model learns feature representations and object localization through forward propagation, loss computation, and backpropagation.
The trained model is evaluated using standard performance metrics such as precision, recall, and mean Average Precision (mAP). Finally, the model is used to perform inference on new images, producing detection outputs in the form of bounding boxes and class labels.
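Inference with a trained Ultralytics model can be sketched as below; the checkpoint and image paths are illustrative:

```python
from ultralytics import YOLO

# Load the best checkpoint saved during training (path is illustrative).
model = YOLO("runs/detect/train/weights/best.pt")

# Run detection on a new image with a confidence threshold of 0.25.
results = model.predict("sample_water_scene.jpg", conf=0.25)

# Each result carries boxes (xyxy), class ids, and confidence scores.
for r in results:
    for xyxy, cls_id, score in zip(r.boxes.xyxy, r.boxes.cls, r.boxes.conf):
        print(f"{model.names[int(cls_id)]}: {score:.2f} at {xyxy.tolist()}")
```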
This methodology enables efficient and real-time detection of garbage in water bodies.
SYSTEM OVERVIEW
The proposed system uses a deep learning pipeline to detect garbage in water bodies from images. The workflow includes dataset preparation, model training, validation, and prediction. The overall system architecture of the proposed model is shown in Fig. 1. The system takes input images or video frames of water bodies, which are preprocessed before being passed to the YOLOv8 detection model. The model identifies garbage objects and generates output in the form of bounding boxes along with class labels.
Fig. 1. System Architecture
DATASET DESCRIPTION
The workflow of the proposed system is illustrated in Fig. 2. It begins with data collection and annotation, followed by dataset splitting into training and validation sets. The YOLOv8 model is trained and validated, and the final model is used to detect garbage in new input images.
The dataset used in this study is derived from publicly available waste classification datasets, such as the TrashNet dataset, which contains labeled images of common waste materials. The dataset includes categories such as cardboard, glass, metal, paper, plastic, and other waste items. For the purpose of this work, the dataset is adapted and extended to represent floating waste scenarios in water bodies. The images are preprocessed and annotated in YOLO format, where each object instance is labeled with corresponding class identifiers and bounding box coordinates. The dataset is further divided into training and validation subsets to facilitate effective model learning and performance evaluation.
Fig. 2. Workflow of the model
Algorithm Used: YOLOv8
The YOLOv8 model is a single-stage object detection algorithm that performs object localization and classification simultaneously. The working of the YOLOv8 model is shown in Fig. 3. The model consists of backbone, neck, and head components responsible for feature extraction, feature aggregation, and object detection, respectively. It predicts bounding boxes and class labels for detected garbage objects.
Key Working Principle:
- The input image is divided into grids
- Each grid predicts bounding boxes and class probabilities
- Non-Maximum Suppression (NMS) is applied to remove duplicate detections
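The suppression step can be illustrated with a minimal greedy NMS over IoU; this is a plain-Python sketch of the principle rather than the batched implementation YOLOv8 actually uses:

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# Two overlapping detections of the same object plus one distinct object
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)   # the lower-scoring duplicate is suppressed
```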
Advantages:
- High detection speed
- Real-time performance
- Good accuracy for multiple object classes
Fig. 3. YOLOv8 Model – Working Diagram
TRAINING PROCESS
The dataset used in this study consists of approximately 1000 images collected from publicly available sources such as Kaggle. The dataset includes nine classes: plastic, paper, grass, glass, bottle, branch, box, plastic garbage, and ball. The dataset is divided into training (70%), validation (20%), and testing (10%) sets to ensure proper evaluation of the model.
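The 70/20/10 split can be sketched as follows; the filenames and random seed are illustrative:

```python
import random

def split_dataset(filenames, seed=42):
    """Shuffle image filenames and split them 70/20/10 into
    training, validation, and testing subsets."""
    files = sorted(filenames)
    random.Random(seed).shuffle(files)   # deterministic shuffle
    n = len(files)
    n_train, n_val = int(0.7 * n), int(0.2 * n)
    return (files[:n_train],
            files[n_train:n_train + n_val],
            files[n_train + n_val:])

# Hypothetical list of 1000 image files
train_files, val_files, test_files = split_dataset(
    [f"img_{i:04d}.jpg" for i in range(1000)]
)
```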
The model is initialized using a pre-trained YOLOv8n (nano) model and trained on the custom dataset.
Training parameters used:
- Epochs: 20
- Image size: 512 × 512
- Batch size: 8
- Data augmentation: horizontal flipping
The training process updates the model weights to improve detection accuracy across all classes.
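With the Ultralytics API, the training run described above can be sketched as follows; the dataset configuration filename is hypothetical, and `fliplr` sets the horizontal-flip augmentation probability:

```python
from ultralytics import YOLO

# Load the pre-trained nano checkpoint and fine-tune on the custom dataset.
model = YOLO("yolov8n.pt")

model.train(
    data="garbage.yaml",   # dataset config file (hypothetical name)
    epochs=20,
    imgsz=512,
    batch=8,
    fliplr=0.5,            # horizontal-flip augmentation probability
)

# Validation reports precision, recall, and mAP@0.5 on the validation set.
metrics = model.val()
```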
EVALUATION METRICS
The performance of the model is evaluated using:
- Precision: Correct positive predictions
- Recall: Ability to detect all relevant objects
- mAP@0.5: Mean Average Precision at IoU threshold 0.5
Additionally, confusion matrix analysis is performed to evaluate class-wise performance.
IMPLEMENTATION
The system is implemented using Python and deep learning libraries.
Tools and frameworks used:
- Ultralytics YOLOv8
- OpenCV for image processing
- NumPy and Pandas for analysis
- Matplotlib and Seaborn for visualization
The dataset is downloaded, preprocessed, and structured using a YAML configuration file. The model is trained and validated using the Ultralytics framework.
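The dataset configuration file might look like the following; the paths and class-ID ordering are illustrative, with class names taken from the nine categories listed earlier:

```yaml
# garbage.yaml -- hypothetical dataset configuration
path: datasets/water_garbage
train: images/train
val: images/val
test: images/test
names:
  0: plastic
  1: paper
  2: grass
  3: glass
  4: bottle
  5: branch
  6: box
  7: plastic garbage
  8: ball
```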
The training pipeline of the proposed YOLOv8 model is illustrated in Fig. 4. The process begins with the annotated dataset, which undergoes preprocessing steps such as resizing, normalization, and data augmentation. The model is then initialized using pre-trained weights and trained through forward propagation, loss computation, and backpropagation. During validation, performance metrics such as mean Average Precision (mAP) and loss are monitored. The best-performing model is saved and used for testing and inference, where detection results are generated and visualized.
Fig. 4. Training pipeline of the YOLOv8 model
Dataset Samples
The dataset used in this study is divided into testing, training, and validation subsets. The testing dataset sample is shown in Fig. 5. These images are used after training to assess the final performance of the model. The model's ability to accurately detect garbage in these images demonstrates its effectiveness in real-world scenarios.
Fig. 5. Test Image
The training dataset sample is shown in Fig. 6. These images are used to train the YOLOv8 model to learn the features and patterns of different types of garbage present in water bodies. Each image is annotated with bounding boxes and corresponding class labels in YOLO format.
Fig. 6. Train Image
The validation dataset samples are illustrated in Fig. 7. This subset is used to evaluate the model during training and to monitor its performance on unseen data, helping to prevent overfitting and ensure generalization.
Fig. 7. Validate Image
RESULTS AND ANALYSIS
The trained model successfully detects garbage objects in water bodies across multiple categories. The output includes bounding boxes and class labels for detected objects.
Performance metrics indicate that the model achieves satisfactory precision and recall values, demonstrating its effectiveness for object detection in aquatic environments.
The confusion matrix and performance graphs show that the model performs well for common classes such as plastic bags and bottles. However, minor misclassifications occur due to overlapping features and environmental conditions such as reflections.
The system is capable of processing images in real-time, making it suitable for practical deployment.
The confusion matrix of the proposed model is shown in Fig. 8. It represents the classification performance of the YOLOv8 model across different garbage categories. The diagonal values indicate correctly classified instances, while off-diagonal values represent misclassifications. A higher concentration of values along the diagonal demonstrates that the model achieves good accuracy in detecting and classifying garbage objects.
Fig. 8. Confusion Matrix
Performance Metrics
To evaluate the performance of the proposed garbage detection model, standard evaluation metrics such as Accuracy, Precision, and Recall are used. These metrics are computed based on the confusion matrix.
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)
Where:
- TP (True Positive): Correctly detected garbage
- TN (True Negative): Correctly detected non-garbage
- FP (False Positive): Incorrect detection of garbage
- FN (False Negative): Missed garbage detection
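These definitions translate directly into code; the counts below are illustrative only, not the paper's actual confusion-matrix values:

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)   # how many detections were correct
    recall = tp / (tp + fn)      # how many true objects were found
    return accuracy, precision, recall

# Illustrative counts for a single class
acc, prec, rec = detection_metrics(tp=80, tn=50, fp=20, fn=10)
```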
The overall performance of the model across different classes is illustrated in Fig. 9. The line graph shows variations in evaluation metrics such as precision, recall, and F1-score for each class. The results indicate consistent performance of the model, with higher values observed for well-represented classes in the dataset.
Fig. 9. Model Performance (Line graph)
Fig. 10 presents a comparative analysis of precision, recall, and specificity for each class. The bar graph highlights the model’s ability to correctly identify garbage objects while minimizing false positives and false negatives. The results demonstrate balanced performance across multiple classes, indicating the effectiveness of the proposed approach.
Fig. 10. Precision, Recall & Specificity per class graph
The detection results produced by the YOLOv8 model are shown in Fig. 11. The model successfully identifies garbage objects present in water bodies and highlights them using bounding boxes along with corresponding class labels and confidence scores. Objects such as plastic bags, leaves, and other waste materials are accurately detected in complex backgrounds. The results demonstrate the model’s capability to perform real-time garbage detection with satisfactory accuracy.
Fig. 11. Output Images
Comparative Analysis with Existing Models:
To evaluate the effectiveness of the proposed YOLOv8-based garbage detection system, a comparative analysis was conducted with another widely used object detection model, YOLOv5. Both models were trained and evaluated under similar conditions using the same dataset, training parameters, and evaluation metrics.
The comparison is based on standard performance metrics such as Precision, Recall, and mean Average Precision (mAP@0.5). The results are presented in Table 1.
Table 1. Performance Comparison of Models
| Model  | Precision (%) | Recall (%) | mAP@0.5 (%) |
|--------|---------------|------------|-------------|
| YOLOv5 | 72.10         | 60.25      | 68.40       |
| YOLOv8 | 79.18         | 64.49      | 74.85       |
The results indicate that the YOLOv8 model outperforms YOLOv5 across all evaluation metrics. Specifically, YOLOv8 achieves higher precision, indicating better accuracy in detecting garbage objects with fewer false positives. The improvement in recall suggests that YOLOv8 is more effective in identifying a larger number of relevant objects in the images.
Additionally, the higher mAP value demonstrates improved overall detection performance, particularly in accurately localizing objects within complex aquatic environments. The enhanced performance of YOLOv8 can be attributed to its improved architecture, better feature extraction capabilities, and optimized training strategies.
However, both models exhibit challenges in detecting objects under difficult conditions such as water reflections, occlusion, and overlapping waste materials. Despite these limitations, YOLOv8 shows more robustness in handling such scenarios compared to YOLOv5.
The experimental comparison validates that the proposed approach provides improved detection performance over existing baseline models.
CONCLUSION
This paper presents a YOLOv8-based approach for detecting garbage in water bodies. The proposed system automates the detection process and provides accurate results across multiple waste categories. The implementation demonstrates the effectiveness of deep learning in addressing environmental challenges.
The proposed model demonstrates improved performance compared to baseline models, validating its effectiveness for real-world applications.
FUTURE WORK
Future enhancements may include:
- Increasing dataset size for better generalization
- Using satellite imagery with AI for large-scale, real-time garbage detection in water bodies
- Using higher-capacity YOLO models for improved accuracy
- Integration with drone-based monitoring systems
- Real-time deployment in smart city applications
REFERENCES
[1] J. Redmon et al., “You Only Look Once: Unified, Real-Time Object Detection,” Proc. IEEE CVPR, 2016.
[2] G. Jocher et al., “Ultralytics YOLOv8,” 2023. [Online]. Available: https://docs.ultralytics.com
[3] “AquaVision: Automating the Detection of Waste in Water Bodies,” Case Studies in Chemical and Environmental Engineering, 2020.
[4] “AquaVision: AI-Powered Marine Species Identification,” Information (MDPI), 2024.
[5] P. Proença and P. Simões, “TACO: Trash Annotations in Context for Litter Detection,” 2020.
[6] G. Thung and M. Yang, “TrashNet Dataset,” 2016.
[7] “ATLANTIS: A Benchmark for Waterbody Image Segmentation,” Environmental Modelling & Software, 2022.
[8] A. Krizhevsky et al., “ImageNet Classification with Deep CNNs,” NIPS, 2012.
[9] D. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” 2015.
[10] “AquaticCLIP: Underwater Scene Analysis Model,” IEEE Transactions on Neural Networks, 2026.
Saniya Kampli*
Tejavati R. Goudar
Prerana Sadare
Shachi G. Patil
Varsha Jadhav
10.5281/zenodo.20050671