Comparative analysis of machine learning models for depression detection in social media text

Gunawardhna, M.U.K.; Shalika, K.B.; Bandara, R.M.T.C.; Wickramaarachchi, W.A.W.U.

dc.contributor.author	Gunawardhna, M.U.K.
dc.contributor.author	Shalika, K.B.
dc.contributor.author	Bandara, R.M.T.C.
dc.contributor.author	Wickramaarachchi, W.A.W.U.
dc.date.accessioned	2025-11-06T03:53:14Z
dc.date.available	2025-11-06T03:53:14Z
dc.date.issued	2025-11-07
dc.description.abstract	Depression is a prevalent mental disorder that requires early detection. However, traditional clinical methods often face challenges in identifying its onset. Social media platforms, where users frequently share personal thoughts and emotions, provide a valuable source for detecting depressive symptoms. Although research in this area is expanding, relatively few studies have examined multiple machine learning (ML) models across different feature-extraction methods. Prior work has focused mainly on Twitter-based data and support vector machine (SVM) classifiers, with limited attention to other classifiers. Addressing this gap, this study evaluated three ML models (SVM, logistic regression (LR), and long short-term memory (LSTM) networks with dense layers) on Facebook posts using two feature-extraction methods: term frequency-inverse document frequency (TF-IDF) and Global Vectors for Word Representation (GloVe) embeddings. A publicly available dataset was employed, and user demographics, such as age, age category and gender, were considered. Ground-truth depression labels were derived from an existing labelled dataset, where individuals diagnosed with depression were assigned a label of 1 and those without depression were assigned a label of 0. The dataset was split 80% for training and 20% for testing, with class balance maintained between depressed and non-depressed samples. Results indicated that TF-IDF with SVM achieved the highest accuracy of 95.5%, outperforming both LR and LSTM models, which each achieved an accuracy of 84.0%. This study demonstrates the effectiveness of combining SVM with TF-IDF in detecting depression in Facebook text and highlights the potential of extending research beyond Twitter-based studies. The findings contribute to the literature by systematically analysing multiple models and feature-extraction techniques using previously unexplored Facebook data, thereby supporting more robust and generalisable mental health monitoring through social media.
dc.identifier.citation	Proceedings of the Postgraduate Institute of Science Research Congress (RESCON)-2025, University of Peradeniya, p. 81
dc.identifier.issn	3051-4622
dc.identifier.uri	https://ir.lib.pdn.ac.lk/handle/20.500.14444/6009
dc.language.iso	en
dc.publisher	Postgraduate Institute of Science (PGIS), University of Peradeniya, Sri Lanka
dc.relation.ispartofseries	Volume 12
dc.subject	Depression detection
dc.subject	Machine learning
dc.subject	Social media analysis
dc.subject	Support vector machine
dc.title	Comparative analysis of machine learning models for depression detection in social media text
dc.type	Article
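
Illustrative sketch (not part of the original record): the abstract describes a class-balanced 80/20 train/test split followed by TF-IDF feature extraction. The Python below shows one way those two steps might look using only the standard library. The function names, the toy posts, and the smoothed IDF formula ln((1 + N) / (1 + df)) + 1 are assumptions made for illustration; the abstract does not state which TF-IDF variant, tokenizer, or tooling the authors used, and the classifiers (SVM, LR, LSTM) are not reproduced here.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_idf(tokenized_docs):
    """Smoothed IDF per term, learned from the training documents only (assumed variant)."""
    n_docs = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(doc, idf):
    """L2-normalised sparse TF-IDF vector (a dict) for one tokenized document."""
    tf = Counter(t for t in doc if t in idf)  # terms unseen in training are dropped
    weights = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


# Invented toy posts (NOT from the study's dataset); label 1 = depressed, 0 = not.
posts = [
    ("i feel empty and tired today", 1),
    ("nothing feels worth doing anymore", 1),
    ("i feel so alone and tired", 1),
    ("i cannot sleep and feel hopeless", 1),
    ("tired of feeling empty every day", 1),
    ("great day at the beach with friends", 0),
    ("loved the new movie last night", 0),
    ("cooking dinner with my family today", 0),
    ("excited for the football game", 0),
    ("had a great run this morning", 0),
]
texts = [text.split() for text, _ in posts]
labels = [label for _, label in posts]

# 80/20 split that keeps the depressed / non-depressed ratio the same in both parts.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# Fit IDF on the training posts only, then vectorise both parts with it.
idf = fit_idf([texts[i] for i in train_idx])
train_vectors = [tfidf_vector(texts[i], idf) for i in train_idx]
test_vectors = [tfidf_vector(texts[i], idf) for i in test_idx]
print(len(train_idx), "training posts,", len(test_idx), "test posts")
```

Fitting the IDF weights on the training portion alone keeps test-set vocabulary statistics out of the features, which is the usual reason for splitting before vectorising. The resulting vectors would then be passed to a classifier such as a linear SVM or logistic regression.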

Files

Original bundle

Name: 18 RESCON 2025 CMS-33.pdf
Size: 283.89 KB
Format: Adobe Portable Document Format

Collections

RESCON 2025