Department of computer science and engineering, Paavai Engineering College, Paavai Institutions, Paavai Nagar, NH-7, Pachal, Namakkal-637018, Tamilnadu, India.
Cyberbullying has become a major concern for social networking sites and affects the psychological well-being of users, especially teenagers. However, conventional approaches to detecting cyberbullying analyze individual posts without considering the interaction context within a session. This study develops a cyberbullying detection system based on a Heterogeneous Graph Neural Network (HGNN). Social networking sessions are represented as heterogeneous graphs that capture the relationships between posts, users, and interactions. The proposed approach comprises a feature encoder, a heterogeneous graph encoder, and a classifier. In the experimental analysis, the model achieves an F1-score of 97.26% on the Vine dataset.
Social media sites produce a significant amount of unstructured text-based content that provides a lot of information regarding user activity and trends in society. One of the areas under Natural Language Processing (NLP) that plays a crucial part in this regard is sentiment analysis.
Cyberbullying detection is becoming increasingly important as online social interaction grows rapidly. Most existing methods fail to capture how users' activities depend on context within a single session. This paper analyzes social media sessions using graph-based modeling techniques.
LITERATURE SURVEY
Several studies have explored cyberbullying detection using machine learning and deep learning techniques:
Despite these advancements, challenges such as data imbalance, computational cost, and contextual understanding remain unresolved.
PROBLEM STATEMENT
The rapid growth of social media platforms has significantly increased the volume of user-generated content, creating new challenges in identifying harmful behaviors such as cyberbullying. Cyberbullying is not limited to a single post or comment; rather, it typically occurs through a sequence of interactions involving multiple users, including replies, comments, and reactions over time. Despite the availability of large-scale data, accurately detecting cyberbullying remains a complex task due to the dynamic and contextual nature of online communication.
Most existing cyberbullying detection approaches treat the problem as a binary text classification task, where individual posts are independently classified as bullying or non-bullying. While such methods are computationally efficient, they fail to capture the broader conversational context and interaction patterns that are essential for understanding the intent and severity of cyberbullying. For instance, a single comment may appear harmless in isolation but may become harmful when analyzed within a sequence of repeated or targeted interactions.
Another major limitation of existing approaches is their inability to model interdependencies among different components of a social media session, such as users, posts, timestamps, and relationships between participants. Traditional machine learning and deep learning models, including CNNs and LSTMs, primarily focus on sequential or textual features and often ignore the structural and relational information inherent in social media data. As a result, these models struggle to capture higher-order semantic relationships and interaction patterns.
Furthermore, cyberbullying detection faces several practical challenges, including:
• Contextual ambiguity: The same word or phrase may convey different meanings depending on the context, sarcasm, or cultural background.
• Imbalanced datasets: Cyberbullying instances are relatively rare compared to normal interactions, leading to biased model performance.
• Dynamic interaction patterns: Social media conversations evolve over time, making it difficult to capture temporal dependencies.
• Multimodal complexity: Cyberbullying may involve not only text but also images, videos, and emojis, which are often ignored in traditional models.
• Scalability issues: Handling large-scale, real-time social media data requires efficient and scalable solutions.
PROPOSED SYSTEM
To address the limitations of existing cyberbullying detection approaches, this work proposes a session-based cyberbullying detection framework using a Heterogeneous Graph Neural Network (HGNN). The core idea is to model social media sessions as heterogeneous graphs, enabling the system to capture both content-level features and structural relationships among different entities such as users, posts, and interactions.
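As a rough illustration of what a session-level heterogeneous graph might look like in code, the sketch below stores a type label for every node and a relation label for every edge. The class name `SessionGraph`, the node types, and the relation names `writes` and `replies_to` are assumptions made for this example; the paper does not specify its exact schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SessionGraph:
    """A tiny heterogeneous graph: every node and edge carries a type label."""
    node_type: dict = field(default_factory=dict)  # node id -> type, e.g. "user"
    edges: list = field(default_factory=list)      # (source, relation, target)

    def add_node(self, node_id, ntype):
        self.node_type[node_id] = ntype

    def add_edge(self, src, relation, dst):
        # Both endpoints must already exist so that every edge links typed nodes.
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("add both endpoints before adding an edge")
        self.edges.append((src, relation, dst))

    def neighbors_by_relation(self, node_id):
        """Group the outgoing neighbours of a node by relation type."""
        grouped = defaultdict(list)
        for src, relation, dst in self.edges:
            if src == node_id:
                grouped[relation].append(dst)
        return dict(grouped)


# Hypothetical session: user u1 writes a post, user u2 writes a comment on it.
g = SessionGraph()
g.add_node("u1", "user")
g.add_node("u2", "user")
g.add_node("p1", "post")
g.add_node("c1", "comment")
g.add_edge("u1", "writes", "p1")
g.add_edge("u2", "writes", "c1")
g.add_edge("c1", "replies_to", "p1")

node_types = sorted(set(g.node_type.values()))
edge_types = sorted({rel for _, rel, _ in g.edges})
```

Keeping the relation label on each edge is what lets a later model treat "user writes post" differently from "comment replies to post", which a homogeneous graph would not distinguish.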
A. Overall Framework
The proposed framework consists of three major components: a feature encoder, a heterogeneous graph encoder, and a classifier.
The complete workflow is illustrated as a pipeline where raw social media data is transformed into graph representations and processed through a deep learning model for classification.
B. Data Representation and Session Modeling
A social media session is defined as a collection of:
Instead of treating each post independently, the session is modeled as a heterogeneous graph $G = (V, E, T)$, where:
EXPERIMENTAL RESULTS AND DISCUSSION
The proposed Heterogeneous Graph Neural Network (HGNN) model was evaluated using real-world social media datasets to assess its effectiveness in cyberbullying detection. The performance of the model was measured using standard evaluation metrics such as accuracy, precision, recall, and F1-score.
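For reference, the four metrics named above can be computed from the confusion-matrix counts as in the minimal sketch below. It assumes binary session labels with 1 denoting a bullying session; the function name `binary_metrics` and the toy labels are illustrative only and are not taken from the paper's experiments.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive (bullying = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for eight hypothetical sessions (not experimental data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Because bullying sessions are typically the minority class, accuracy alone can look high even when many bullying sessions are missed, which is why precision, recall, and F1 are reported alongside it.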
SYSTEM ARCHITECTURE
The proposed system architecture integrates data preprocessing, feature extraction, graph modeling, and classification into a unified framework for session-based cyberbullying detection.
Initially, raw social media data, including posts, comments, and user interactions, is collected and preprocessed by removing noise, tokenizing text, and applying stemming techniques. The processed data is then transformed into numerical representations using feature extraction methods such as TF-IDF and word embeddings. Subsequently, each social media session is modeled as a heterogeneous graph, where nodes represent users, posts, and comments, and edges represent their interactions.
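The preprocessing and TF-IDF steps described above could look roughly like the following sketch. It is written with the standard library only and makes several assumptions: noise removal is limited to stripping URLs and @mentions, stemming is omitted for brevity, term frequency is the count divided by post length, and the inverse document frequency uses one common smoothed variant. The function names and sample posts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def tfidf(documents):
    """Return (vocabulary, vectors) using tf = count / length and a smoothed idf."""
    tokenized = [tokenize(doc) for doc in documents]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many posts each token appears at least once.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
           for tok in vocab}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty posts
        vectors.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical posts from one session.
posts = [
    "Nice video! http://example.com",
    "@sam you are a loser",
    "you are so nice",
]
vocab, vectors = tfidf(posts)
```

Each resulting vector can serve as the initial feature of the corresponding post or comment node; tokens that occur in fewer posts receive a larger weight than tokens shared across many posts.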
The constructed graph is processed using a Heterogeneous Graph Neural Network (HGNN), which captures both semantic and structural relationships through message passing mechanisms. The learned node embeddings are aggregated to form a session-level representation. Finally, the session representation is fed into a classifier to predict whether the session contains cyberbullying or not. This architecture effectively combines textual and relational information, leading to improved detection performance.
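To make the message passing, aggregation, and classification steps concrete, here is a deliberately simplified sketch. It is not the paper's HGNN layer: it assumes a single round of relation-wise mean aggregation with one scalar weight per relation (real models typically learn a weight matrix per relation), a mean readout over all nodes, and a logistic classifier. All weights are hand-picked and untrained, and every name in the block is hypothetical.

```python
import math


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def message_pass(features, edges, relation_weight):
    """One round: each node adds the weighted mean of its in-neighbours per relation."""
    updated = {}
    for node, h in features.items():
        out = list(h)  # self contribution
        for relation, weight in relation_weight.items():
            incoming = [features[src] for src, rel, dst in edges
                        if dst == node and rel == relation]
            if incoming:
                out = [o + weight * m for o, m in zip(out, mean(incoming))]
        updated[node] = [max(0.0, x) for x in out]  # ReLU
    return updated


def classify_session(features, edges, relation_weight, clf_weights, clf_bias):
    """Message passing, then mean readout over nodes, then a logistic score in (0, 1)."""
    hidden = message_pass(features, edges, relation_weight)
    session_vec = mean(list(hidden.values()))
    logit = sum(w * x for w, x in zip(clf_weights, session_vec)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-dimensional node features and untrained weights.
features = {"u1": [1.0, 0.0], "p1": [0.0, 1.0], "c1": [0.5, 0.5]}
edges = [("u1", "writes", "p1"), ("c1", "replies_to", "p1")]
relation_weight = {"writes": 0.5, "replies_to": 1.0}

hidden = message_pass(features, edges, relation_weight)
score = classify_session(features, edges, relation_weight, [1.0, -1.0], 0.0)
```

In this toy session only the post `p1` has incoming edges, so only its representation changes; the score is the estimated probability that the session contains cyberbullying and would be thresholded (for example at 0.5) to obtain a label.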
Experimental results demonstrate that the proposed approach achieves superior performance compared to traditional machine learning and deep learning models. In particular, the model achieved an F1-score of 97.26% on the Vine dataset, indicating its strong capability in identifying cyberbullying instances. Additionally, the model showed competitive performance on the Instagram dataset, confirming its generalization ability across different platforms.
The results highlight the effectiveness of modeling social media sessions as heterogeneous graphs. By capturing both textual features and interaction-based relationships, the proposed model is able to identify complex cyberbullying patterns that are often missed by conventional approaches. Furthermore, the incorporation of graph-based learning improves contextual understanding and reduces false classifications.
Overall, the experimental findings validate that the proposed framework significantly enhances detection accuracy and robustness, making it suitable for real-world cyberbullying detection applications.
TESTING
The system was tested using:
All tests passed successfully with no major defects observed.
CONCLUSION
This paper presents a novel approach for cyberbullying detection using heterogeneous graph neural networks. By modeling social media sessions as graphs, the system effectively captures contextual and relational information. The proposed model significantly outperforms traditional approaches in terms of accuracy and robustness.
FUTURE WORK
Future work includes integration with real-time systems. Future enhancements include:
Using demographic and temporal features
REFERENCES
M. Manimaran*, S. SivaRanjani, Heterogeneous Graph Neural Network Framework For Session-Based Cyberbullying Detection, Int. J. Sci. R. Tech., 2026, 3 (5), 496-499. https://doi.org/10.5281/zenodo.20157670