<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"><rdf:Description rdf:about="https://repozitorij.uni-lj.si/IzpisGradiva.php?id=160384"><dc:title>Sarcasm detection with transfer learning from multiple sources</dc:title><dc:creator>Đoković, Lazar (Author)</dc:creator><dc:creator>Robnik Šikonja, Marko (Mentor)
</dc:creator><dc:subject>natural language processing</dc:subject><dc:subject>large language models</dc:subject><dc:subject>sarcasm detection</dc:subject><dc:subject>neural machine translation</dc:subject><dc:subject>BERT model</dc:subject><dc:subject>GPT model</dc:subject><dc:subject>Llama model</dc:subject><dc:description>Sarcasm detection is the natural language processing task of classifying whether an utterance is sarcastic. It is closely related to sentiment analysis, since sarcasm often inverts surface sentiment. Despite great interest and extensive research by the sentiment analysis community, it remains a challenging problem, because sarcastic sentences depend heavily on context and are often accompanied by non-verbal cues. Recent work in sarcasm detection mostly focuses on the Transformer architecture of neural networks and its application to high-resource languages like English. To build a sarcasm detection dataset for Slovene, we leverage two modern techniques in machine translation and language modeling. The first approach uses a medium-size Transformer model trained specifically for neural machine translation, while the second uses a very large generative model. We explore the viability of such translated datasets and how the size of a pretrained Transformer affects its ability to detect sarcasm. We use the translated data to train ensembles of Transformer-based models and evaluate their performance using established methodologies. Our results show that larger models generally outperform smaller ones, and that ensembling can slightly improve sarcasm detection performance. Our best ensemble approach achieves an $\text{F}_1$-score of 0.765.</dc:description><dc:date>2024</dc:date><dc:date>2024-08-27 14:10:01</dc:date><dc:type>Bachelor's thesis</dc:type><dc:identifier>160384</dc:identifier><dc:language>sl</dc:language></rdf:Description></rdf:RDF>
