The increasing significance of online communication in contemporary society has underlined
the need to understand and identify suicidal ideation within these online spaces. Online
communities, especially those centered on mental health, frequently feature communications
deeply interwoven with expressions of suicidal ideation. While detecting these expressions is
important for research, it is also fundamental for proactive moderation and prevention
strategies within these platforms. Traditional machine learning methodologies have shown
promise in recognizing suicidal tendencies in textual data. However, the emergence of large
language models (LLMs) like GPT-4, built on sophisticated deep learning architectures, offers
potential for a deeper and more nuanced detection of subtle cues linked with suicidal ideation
that are often mingled with other themes and difficult to isolate. The core focus of this research
is to examine the capability of LLMs to detect suicidal ideation in online content. The
objectives include 1) embedding the texts and clustering them based on content similarity, and
2) fine-tuning the models to distinguish and categorize documents based on the presence of
genuine suicidal ideation versus general mental health discussions. The results validate the
efficacy of LLMs in both tasks: posts are successfully clustered by content similarity to
generate class labels, and the fine-tuned models achieve high precision and recall in
differentiating suicidal ideation from general mental health narratives.
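
The following is a minimal sketch of the two-step pipeline described above. The abstract does not name the embedding model, clustering method, or classifier, so this example assumes a sentence-transformer encoder, k-means clustering, and a logistic-regression classifier as hypothetical stand-ins for the fine-tuned LLMs; the posts and labels are likewise invented for illustration.

```python
# Hypothetical sketch of the two objectives: (1) embed and cluster posts,
# (2) classify suicidal ideation and report precision/recall.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Invented example posts with hand-assigned labels
# (1 = suicidal ideation, 0 = general mental health discussion).
posts = [
    "I can't see a way out anymore and I don't want to keep going.",
    "Lately I've been making plans to end things for good.",
    "Nothing matters; I've written letters to say goodbye.",
    "I keep thinking everyone would be better off without me here.",
    "Does anyone have tips for managing panic attacks at work?",
    "Therapy has been helping me cope with my depression a little.",
    "My anxiety spikes before exams; how do you all calm down?",
    "Started journaling this month and it seems to lift my mood.",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Objective 1: embed the posts and cluster them by content similarity
# to surface candidate class labels for annotation.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(posts)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
print("cluster assignments:", clusters)

# Objective 2: train a classifier on the verified labels and report
# precision and recall for the suicidal-ideation class.
X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, labels, test_size=0.25, stratify=labels, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
preds = clf.predict(X_te)
print("precision:", precision_score(y_te, preds))
print("recall:", recall_score(y_te, preds))
```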