Repository of the University of Ljubljana
Poenostavljanje besedil v slovenščini z velikimi jezikovnimi modeli
Bone, Blaž (Author); Robnik Šikonja, Marko (Mentor)
PDF - Presentation file (403.86 KB)
MD5: 47DD92C39863622A81B991D603A9A146
Abstract
In this bachelor's thesis, we investigated text simplification in Slovene using large language models. The goal was to develop models that can effectively simplify Slovene texts. We took existing English training datasets, machine-translated them into Slovene, and trained models such as SloT5, mT5, and mBART on the resulting data. We carried out quantitative and qualitative analyses of the results, using metrics such as BLEU, SARI, BERTScore, and LaBSE Similarity. The results showed that the models successfully simplified texts, preserved key information, and sensibly simplified structure and language. Despite these successful simplifications, the models often repeated the original sentences without major changes.
Language: Slovenian
Keywords: natural language processing, text simplification, machine learning, large language models
Work type: Bachelor thesis/paper
Typology: 2.11 - Undergraduate Thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2024
PID: 20.500.12556/RUL-160702
COBISS.SI-ID: 209156355
Publication date in RUL: 03.09.2024
Views: 399
Downloads: 54
Cite this work
BONE, Blaž, 2024, Poenostavljanje besedil v slovenščini z velikimi jezikovnimi modeli [online]. Bachelor’s thesis. [Accessed 5 April 2025]. Retrieved from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=eng&id=160702
Secondary language
Language: English
Title: Text simplification for Slovene using large language models
Abstract:
In this thesis, we explored text simplification in Slovene using large language models. The goal of the thesis was to develop models that can effectively simplify Slovene texts. We used existing English training datasets, which we machine-translated into Slovene, and then trained models such as SloT5, mT5, and mBART on these data. We conducted quantitative and qualitative analysis of the results, using metrics such as BLEU, SARI, BERTScore, and LaBSE Similarity. The results showed that the models can successfully simplify texts, retain key information, and meaningfully simplify the structure and language. Despite the successful simplifications, the models often repeat the original sentences without significant changes.
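The abstract's last observation, that models often repeat the source sentence, can be quantified with a simple copy-rate check. The following is a minimal sketch and not the thesis's own evaluation code: the function names `normalize` and `copy_rate`, the 0.95 threshold, and the example sentences are all assumptions made for illustration. It uses only Python's standard `difflib` module rather than the metrics named above (BLEU, SARI, BERTScore, LaBSE Similarity).

```python
from difflib import SequenceMatcher


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(sentence.lower().split())


def copy_rate(sources, outputs, threshold=0.95):
    """Return the fraction of outputs that are near-copies of their source.

    An output counts as a near-copy when its character-level similarity to
    the source (difflib ratio, 1.0 = identical after normalization) is at
    least `threshold`. The threshold value is an illustrative assumption.
    """
    if len(sources) != len(outputs):
        raise ValueError("sources and outputs must have the same length")
    if not sources:
        return 0.0
    near_copies = 0
    for src, out in zip(sources, outputs):
        ratio = SequenceMatcher(None, normalize(src), normalize(out)).ratio()
        if ratio >= threshold:
            near_copies += 1
    return near_copies / len(sources)


# Hypothetical example sentences, not taken from the thesis.
sources = [
    "Mesto je bilo ustanovljeno v dvanajstem stoletju.",
    "Kljub slabemu vremenu se je prireditev nadaljevala po načrtu.",
]
outputs = [
    "Mesto je bilo ustanovljeno v dvanajstem stoletju.",
    "Prireditev je potekala, čeprav je bilo vreme slabo.",
]
print(copy_rate(sources, outputs))
```

A high copy rate alongside high similarity-based scores would suggest that those scores partly reflect unchanged output rather than actual simplification.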
Keywords: natural language processing, text simplification, machine learning, large language models