[2602.22449] A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection
Summary
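This paper presents a framework that combines BanglaBERT with a two-layer stacked LSTM for multi-label cyberbullying detection in Bangla. It targets two limitations of existing work: single-label classifiers that assume one type of abuse per comment, and models that rely on either a transformer or an LSTM alone.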
Why It Matters
Cyberbullying poses significant social and mental health challenges, particularly in low-resource languages like Bangla. This research contributes to the field by developing a robust detection model that accommodates multiple forms of abuse, enhancing the understanding and management of online harassment.
Key Takeaways
- The proposed model pairs BanglaBERT with a two-layer stacked LSTM, combining the transformer's contextual understanding with the LSTM's modeling of sequential dependencies.
- It addresses the challenge of multi-label classification in cyberbullying, which is often overlooked.
- The framework is evaluated using various metrics, ensuring comprehensive performance assessment.
- Class imbalance is tackled through different sampling strategies, enhancing model reliability.
- The study emphasizes the importance of developing resources for low-resource languages in AI.
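The first two takeaways can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the way the paper fuses BanglaBERT with the LSTM is not described in this summary, so the sketch assumes the simplest arrangement, a two-layer LSTM that reads per-token encoder states and feeds a linear layer with one logit per label. The class name `StackedLSTMHead`, the dimensions (768 and 128), the label count (5), and the random tensor standing in for BanglaBERT output are all assumptions made for illustration.

```python
import torch
from torch import nn


class StackedLSTMHead(nn.Module):
    """Hypothetical multi-label head: a 2-layer LSTM over encoder token states."""

    def __init__(self, encoder_dim=768, lstm_dim=128, num_labels=5):
        super().__init__()
        # num_layers=2 stacks two LSTM layers; the second reads the first's outputs.
        self.lstm = nn.LSTM(
            input_size=encoder_dim,
            hidden_size=lstm_dim,
            num_layers=2,
            batch_first=True,
        )
        # One logit per label, so labels are scored independently of each other.
        self.classifier = nn.Linear(lstm_dim, num_labels)

    def forward(self, token_states):
        # token_states: (batch, seq_len, encoder_dim), e.g. an encoder's last hidden states.
        _, (h_n, _) = self.lstm(token_states)
        # h_n: (num_layers, batch, lstm_dim); use the top layer's final hidden state.
        return self.classifier(h_n[-1])


torch.manual_seed(0)
head = StackedLSTMHead()

# Random stand-in for encoder output: 4 comments, 16 tokens each (assumed sizes).
fake_states = torch.randn(4, 16, 768)
logits = head(fake_states)

# Multi-label training pairs each logit with a 0/1 target via binary cross-entropy.
targets = torch.randint(0, 2, (4, 5)).float()
loss = nn.BCEWithLogitsLoss()(logits, targets)

# Independent per-label probabilities; several may be high for one comment.
probs = torch.sigmoid(logits)
```

In a real pipeline the random tensor would be replaced by the encoder's token-level output. Using `BCEWithLogitsLoss` rather than softmax cross-entropy is what lets more than one label be active for a single comment.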
Computer Science > Computation and Language
arXiv:2602.22449 (cs)
[Submitted on 25 Feb 2026]
Title: A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection
Authors: Mirza Raquib, Asif Pervez Polok, Kedar Nath Biswas, Rahat Uddin Azad, Saydul Akbar Murad, Nick Rahimi
Abstract: Cyberbullying has become a serious and growing concern in today's virtual world. When left unnoticed, it can have adverse consequences for social and mental health. Researchers have explored various types of cyberbullying, but most approaches use single-label classification, assuming that each comment contains only one type of abuse. In reality, a single comment may include overlapping forms such as threats, hate speech, and harassment. Therefore, multilabel detection is both realistic and essential. However, multilabel cyberbullying detection has received limited attention, especially in low-resource languages like Bangla, where robust pre-trained models are scarce. Developing a generalized model with moderate accuracy remains challenging. Transformers offer strong contextual understanding but may miss sequential dependencies, while LSTM models capture temporal flow but lack semantic depth. To address these limitations, we propo...
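The difference between single-label and multilabel prediction that the abstract draws can be shown with a small, dependency-free Python sketch. The three label names are taken from the abstract's example of overlapping abuse types; the dataset's actual label set, the logit values, and the 0.5 threshold are assumptions for illustration only.

```python
import math

# Illustrative labels from the abstract's example; the real label set may differ.
LABELS = ["threat", "hate_speech", "harassment"]


def sigmoid(x):
    """Map a raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(logits):
    """Single-label rule: keep only the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return [LABELS[best]]


def multi_label(logits, threshold=0.5):
    """Multi-label rule: keep every class whose own probability clears the threshold."""
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Made-up scores for one comment that is both threatening and harassing.
example_logits = [2.0, -1.5, 0.7]

print(single_label(example_logits))  # ['threat']
print(multi_label(example_logits))   # ['threat', 'harassment']
```

The single-label rule drops "harassment" even though its score is well above the threshold, which is the kind of loss a multilabel formulation is meant to avoid.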