Repository of the University of Ljubljana
Details
Statistično strojno prevajanje iz angleščine v slovenščino s sistemom Moses
KUNTARIČ, SAŠO (Author)
Robnik Šikonja, Marko (Mentor)
Krek, Simon (Comentor)
PDF - Presentation file (1,50 MB)
MD5: 45D570452B16920D350A67BB8B366964
PID: 20.500.12556/rul/d619a15f-fcd8-45b3-b8ae-f04bb074b026
Abstract
The aim of this thesis is to adapt the Moses system for statistical machine translation from English to Slovenian. Machine translation is a field of computational linguistics that studies the use of software to translate text from one language to another. Factorised statistical machine translation is an extension of statistical machine translation in which word-level linguistic tags are added to the text and the words are turned into vectors. The goal is to improve the quality of the resulting translations. For the open-source translation system Moses, we built several factorised language and translation models from a language corpus of information technology texts. We used them to translate two texts from the field of information technology. The first is marketing-oriented and has a more complex structure, while the second is more technical in nature. We compared the resulting translations in two ways: with one another, with two independent human translations, and with a translation produced by the Google Translate service. For the first comparison we used the BLEU algorithm; for the second, human reviewers examined the translations and gave a subjective score, which remains very important in translation. Finally, we examined inter-rater reliability and analysed the evaluation results. We found that our models are better suited to technical texts, while the switch to factorised models has a greater effect on the translation of more complex texts.
Language: Slovenian
Keywords: statistical machine translation, factorised machine translation, Moses system, language corpus, language model, translation model, BLEU, evaluation, Google Translate
Work type: Undergraduate thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2016
PID: 20.500.12556/RUL-91211
Publication date in RUL: 24.03.2017
Views: 3494
Downloads: 468
Cite this work
KUNTARIČ, SAŠO, 2016, Statistično strojno prevajanje iz angleščine v slovenščino s sistemom Moses [online]. Bachelor’s thesis. [Accessed 26 March 2025]. Retrieved from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=eng&id=91211
Secondary language
Language: English
Title: Statistical machine translation from English to Slovene using Moses system
Abstract:
The aim of the thesis is to customise the Moses system for statistical machine translation from English to Slovenian. Machine translation is a field of computational linguistics that explores the use of software to translate text from one language to another. Factorised statistical machine translation is an extension of statistical machine translation in which language tags are added at the word level and the words are turned into vectors in an attempt to improve translation quality. For the open-source machine translation system Moses, we created multiple factorised language and translation models from a language corpus containing IT-related texts. We then translated two IT-related documents. The first was marketing-oriented with a complex structure, while the second was technical and straightforward. We compared the generated translations with one another, with two independent human translations, and with a translation created by the Google Translate service, using two methods. In the first comparison we used the BLEU algorithm. In the second, human reviewers assessed the translations and gave a subjective score, which remains very important in the translation field. Finally, we calculated inter-rater reliability and analysed the results. We found that our models were more suitable for technical texts, whereas switching to factorised models had a greater effect on complex texts.
Keywords: statistical machine translation, factorised machine translation, Moses system, language corpus, language model, translation model, BLEU, human evaluation, Google Translate