The advancement of smart environments and intelligent healthcare systems has significantly increased the demand for reliable and continuous human safety monitoring solutions. Conventional monitoring approaches predominantly rely on manual supervision or isolated sensor-based systems, which often suffer from limitations such as inefficiency, susceptibility to human error, and delayed response to emergency situations. These challenges become more pronounced in critical scenarios involving elderly individuals, patients undergoing rehabilitation, and personnel operating in high-risk environments, where timely detection and immediate intervention are essential to prevent severe consequences.
To address these limitations, the proposed system introduces an artificial intelligence–driven safety monitoring framework that integrates Human Action Recognition and Facial Expression Recognition. By combining physical activity analysis with facial behavior interpretation, the system provides a comprehensive assessment of an individual’s physical and emotional state. This integrated, multi-model approach enhances the accuracy and reliability of emergency detection, enabling the system to identify critical conditions such as falls, drowsiness, and abnormal behavior more effectively than traditional single-source monitoring systems.
The Human Action Recognition component is designed to identify and classify common human physical activities, including walking, running, sitting, standing, lying down, and climbing stairs. Using motion or video data acquired through cameras or sensing devices, the system continuously analyzes movement patterns to distinguish normal activities from potentially hazardous events. Fall detection is treated as a high-priority condition due to its serious implications, particularly for elderly individuals and patients with limited mobility. Upon detecting a fall, the system initiates an automated emergency response by generating and transmitting an alert email to designated contacts. The alert contains critical information such as the nature of the incident, the precise date and time of occurrence, and the user’s current location in the form of GPS coordinates or a physical address, thereby facilitating rapid assistance and minimizing response time.
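For illustration, the alert step could be realized with Python's standard smtplib, as in the minimal sketch below; the SMTP host, credentials, and addresses are placeholders rather than the system's actual configuration.

```python
# Minimal sketch of the automated e-mail alert; the SMTP host, credentials,
# and addresses below are placeholders, not the system's configuration.
import smtplib
from datetime import datetime
from email.message import EmailMessage

def send_fall_alert(recipient: str, location: str) -> None:
    """Compose and send an emergency e-mail for a detected fall."""
    msg = EmailMessage()
    msg["Subject"] = "AI-Guardian ALERT: Fall detected"
    msg["From"] = "alerts@example.com"               # placeholder sender
    msg["To"] = recipient
    msg.set_content(
        f"Incident : Fall detected\n"
        f"Time     : {datetime.now():%Y-%m-%d %H:%M:%S}\n"
        f"Location : {location}\n"                   # GPS coordinates or address
    )
    with smtplib.SMTP("smtp.example.com", 587) as server:  # placeholder server
        server.starttls()
        server.login("alerts@example.com", "app-password")
        server.send_message(msg)
```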
PROBLEM STATEMENT
Safety monitoring in environments such as elderly care, healthcare, and surveillance systems continues to rely on manual supervision or single-source solutions, which are often inefficient, error-prone, and unable to respond promptly to emergencies. Existing systems typically fail to analyze both physical activities and emotional conditions simultaneously, leading to delayed detection and increased risk. There is a need for an intelligent, automated, and multi-source monitoring system capable of real-time emergency detection and notification. The AI-Guardian project addresses this gap by integrating Human Action Recognition and Facial Expression Recognition into a unified AI-based safety framework.
OBJECTIVES
• To develop an intelligent AI-based system for real-time human activity recognition.
• To accurately detect critical events such as falls using human action recognition techniques.
• To detect drowsiness through facial cues such as eye closure and head movement patterns (an illustrative sketch follows this list).
• To integrate human action recognition and facial expression recognition into a unified safety monitoring framework.
• To automatically generate and send emergency email alerts with incident and location details.
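The drowsiness objective can be illustrated with the widely used eye aspect ratio (EAR). The sketch below assumes six (x, y) eye landmarks supplied by a facial-landmark detector; the threshold and frame limit are assumed values rather than tuned system parameters.

```python
# Illustrative EAR-based drowsiness check; landmark source, threshold,
# and frame limit are assumptions, not the project's tuned parameters.
from math import dist

EAR_THRESHOLD = 0.21      # below this the eye is treated as closed (assumed)
CLOSED_FRAMES_LIMIT = 48  # ~2 s of continuous closure at 24 fps (assumed)

def eye_aspect_ratio(eye):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|) for six eye landmarks."""
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def update_drowsiness(eye, closed_frames):
    """Count consecutive closed-eye frames; flag drowsiness when exceeded."""
    closed_frames = closed_frames + 1 if eye_aspect_ratio(eye) < EAR_THRESHOLD else 0
    return closed_frames, closed_frames >= CLOSED_FRAMES_LIMIT
```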
LITERATURE REVIEW
| Authors | Year | Title / Focus | Merits | Remarks |
|---|---|---|---|---|
| Haresamudram et al. | 2020 | Contrastive Predictive Coding for HAR | Improves accuracy even with limited labeled data using self-supervised learning | Suitable for scenarios with scarce labeled datasets |
| Abdel-Salam et al. | 2021 | HAR using Wearable Sensors: Review & Benchmark | Provides comprehensive evaluation benchmark and improves performance across datasets | Useful for comparing different HAR techniques |
| Beddiar et al. | 2022 | Deep Learning-based HAR Review | Highlights effectiveness of deep learning models in HAR | Indicates shift towards deep learning approaches |
| Zhou et al. | 2023 | HAR in Smart Living Environments | Optimized models for low-resource and real-time smart systems | Focuses on efficiency and lightweight implementations |
| Li et al. | 2024 | Multimodal HAR using Vision & Sensors | Achieves higher accuracy through multimodal data fusion | More complex but highly effective compared to single-modal systems |
METHODOLOGY
The proposed AI-Guardian system uses a multi-stage approach combining Human Action Recognition (HAR) and Facial Expression Recognition (FER) for real-time safety monitoring. Video input is captured through a camera and preprocessed using resizing, normalization, and noise reduction techniques. The HAR module applies pose estimation to detect activities such as walking, sitting, and fall events based on body movements. Simultaneously, the FER module uses deep learning models to recognize facial expressions and detect drowsiness through eye closure and head movement. The outputs from both modules are analyzed in a risk assessment system that classifies events as normal, warning, or critical. The backend, developed using Flask, handles data processing and real-time communication. When a critical event is detected, the system automatically sends an alert with location and incident details. This integrated approach improves accuracy, reduces response time, and ensures efficient safety monitoring.
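A minimal sketch of the preprocessing and pose-based HAR stage is given below. It assumes MediaPipe Pose for landmark extraction (this paper does not fix a specific pose-estimation library), and the width-versus-height lying-posture heuristic with its threshold is illustrative only.

```python
# Sketch of the capture -> preprocess -> pose-estimation -> fall-check loop,
# assuming MediaPipe Pose; the lying-posture heuristic is illustrative.
import cv2
import mediapipe as mp

def is_lying(landmarks, ratio=1.3):
    """Treat a body bounding box wider than it is tall as a lying posture."""
    xs = [lm.x for lm in landmarks]
    ys = [lm.y for lm in landmarks]
    return (max(xs) - min(xs)) > ratio * (max(ys) - min(ys))

cap = cv2.VideoCapture(0)
with mp.solutions.pose.Pose() as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.resize(frame, (640, 480))          # preprocessing: resize
        frame = cv2.GaussianBlur(frame, (3, 3), 0)     # preprocessing: denoise
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks and is_lying(result.pose_landmarks.landmark):
            print("candidate fall posture detected")
cap.release()
```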
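Similarly, the risk-assessment and Flask layers might be organized as sketched below; the fusion rules, event labels, and the /alert route are assumptions for illustration rather than the system's actual interface.

```python
# Hedged sketch of the risk-assessment fusion and a Flask endpoint; the
# rules, event names, and route are illustrative assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

def assess_risk(har_event: str, fer_event: str) -> str:
    """Fuse HAR and FER outputs into normal / warning / critical levels."""
    if har_event == "fall":
        return "critical"
    if fer_event == "drowsy" or har_event == "lying":
        return "warning"
    return "normal"

@app.route("/alert", methods=["POST"])
def alert():
    data = request.get_json(force=True)
    level = assess_risk(data.get("har", ""), data.get("fer", ""))
    # A critical level would trigger the e-mail notification shown earlier.
    return jsonify({"risk": level})
```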
DESIGN
1. Context diagram
2. Use case diagram
3. Sequence diagram
RESULTS AND DISCUSSION
The AI-Guardian: Multi-Source Safety Monitoring System was successfully implemented and tested under real-time conditions to evaluate its effectiveness in detecting emergency situations and ensuring reliable safety monitoring. The system demonstrated accurate recognition of human activities such as walking, running, sitting, standing, lying down, and climbing stairs using Human Action Recognition techniques. Fall detection was observed to be highly reliable, with the system accurately identifying sudden posture changes and abnormal motion patterns while minimizing false alarms through temporal consistency checks.
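One way to realize the temporal consistency check mentioned above is to confirm a fall only when most per-frame detections within a short window agree; the window length and agreement fraction below are illustrative values, not the system's tuned settings.

```python
# Illustrative sliding-window debounce for fall confirmation; window size
# and agreement fraction are assumed values.
from collections import deque

class FallDebouncer:
    def __init__(self, window=30, fraction=0.8):
        self.history = deque(maxlen=window)   # last `window` per-frame flags
        self.fraction = fraction              # share of frames that must agree

    def update(self, frame_says_fall: bool) -> bool:
        self.history.append(frame_says_fall)
        return (len(self.history) == self.history.maxlen
                and sum(self.history) >= self.fraction * self.history.maxlen)
```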
The Facial Expression Recognition module effectively classified emotional states including happy, sad, angry, fear, surprise, disgust, and neutral expressions. In addition, the drowsiness detection mechanism successfully identified fatigue-related conditions based on prolonged eye closure and head movement patterns. The combined analysis of physical activity and facial behavior significantly enhanced situational awareness and improved the overall reliability of emergency detection compared to single-source monitoring systems.
During system testing, emergency scenarios such as simulated falls and drowsiness events triggered timely alert generation. The backend services correctly processed these events, and emergency email notifications were delivered successfully with accurate incident details, date and time, and user location information. Real-time updates on the frontend dashboard ensured immediate visibility of critical events, enabling rapid response by caregivers or authorities.
The system maintained stable performance during continuous operation, demonstrating robustness and suitability for long-term deployment. Minor latency was observed under high computational load, particularly during video processing; this can be reduced through further model optimization and hardware acceleration. Overall, the results confirm that the proposed AI-Guardian system effectively enhances safety monitoring by integrating Human Action Recognition and Facial Expression Recognition, reducing manual supervision, improving emergency response time, and providing a reliable solution for applications in healthcare, elderly care, rehabilitation, and smart surveillance environments.
Fig 10.1: UI and Interface
Fig 10.2: AI Guardian+ User Authentication (Login Page) Interface
Fig 10.3: AI Guardian+ User Registration Interface
Fig 10.4: System Alerts and Notification Management Interface
Fig 10.5: Live Activity Recognition and Pose Estimation View
Fig 10.6: Activity and Fall Detection Monitoring Dashboard
Fig 10.7: Real-Time Facial Distress Detection Dashboard
Fig 10.8: Facial Expression Probability and Distress State Analysis
Fig 10.9: AI Guardian+ System Landing Page
CONCLUSION
The AI-Guardian: Multi-Source Safety Monitoring System successfully demonstrates the effective use of artificial intelligence to enhance human safety through real-time monitoring and automated emergency detection. By integrating Human Action Recognition and Facial Expression Recognition, the system provides a comprehensive understanding of both physical activities and emotional states, enabling accurate detection of critical events such as falls and drowsiness. The system operates continuously with minimal human intervention, ensuring timely alert generation and rapid response through automated email notifications containing essential incident and location details. Experimental results confirm the reliability, accuracy, and stability of the system under real-time conditions, making it suitable for applications in elderly care, healthcare monitoring, rehabilitation systems, and smart surveillance environments. Overall, the proposed solution reduces dependence on manual supervision, improves emergency response efficiency, and enhances overall safety and quality of care.
REFERENCES
- Haresamudram, H., Saha, S., Mukherjee, A., & Kumar, A. (2020). Contrastive Predictive Coding for Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(3), 1–22.
- Abdel-Salam, A., Gaber, T., Hassanien, A. E., & Tolba, M. F. (2021). Human Activity Recognition using Wearable Sensors: Review, Challenges, and Evaluation Benchmark. Sensors, 21(2), 1–29.
- Beddiar, D. R., Nini, B., Sabokrou, M., & Hadid, A. (2022). A Review of Deep Learning-Based Human Activity Recognition on Benchmark Video Datasets. IEEE Access, 10, 24533–24557.
- Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., & Zhang, J. (2023). Review on Human Action Recognition in Smart Living: Sensing Technology, Multimodality, Real-Time Processing, Interoperability, and Resource-Constrained Processing. IEEE Internet of Things Journal, 10(3), 2015–2032.
- Li, Y., Wang, J., Chen, H., & Zhang, X. (2024). Multimodal Human Activity Recognition Using Vision and Wearable Sensors. Sensors, 24(5), 1–18.
- Poria, S., Cambria, E., Bajpai, R., & Hussain, A. (2017). A Review of Affective Computing: From Unimodal Analysis to Multimodal Fusion. Information Fusion, 37, 98–125.
- Redmon, J., & Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv preprint arXiv:1804.02767.
- Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. International Conference on Learning Representations (ICLR).
- OpenCV Documentation. (2023). Open Source Computer Vision Library.
M Manoj Kumar*
Chidananda
Makam Surendra
M Harsha
C Nikhil Sami
10.5281/zenodo.20032657