Details

Lexical diversity in statistical and neural machine translation
Brglez, Mojca (Author); Vintar, Špela (Author)

PDF - Presentation file, download (343.69 KB)
MD5: F4D87D51688865324A5170F8B8C3EFF9
URL - Source URL: https://www.mdpi.com/2078-2489/13/2/93

Abstract
Neural machine translation systems have revolutionized translation processes in terms of quantity and speed in recent years, and they have even been claimed to achieve human parity. However, the quality of their output has also raised serious doubts and concerns, such as loss in lexical variation, evidence of “machine translationese”, and its effect on post-editing, which results in “post-editese”. In this study, we analyze the outputs of three English to Slovenian machine translation systems in terms of lexical diversity in three different genres. Using both quantitative and qualitative methods, we analyze one statistical and two neural systems, and we compare them to a human reference translation. Our quantitative analyses based on lexical diversity metrics show diverging results; however, translation systems, particularly neural ones, mostly exhibit larger lexical diversity than their human counterparts. Nevertheless, a qualitative method shows that these quantitative results are not always a reliable tool to assess true lexical diversity and that a lot of lexical “creativity”, especially by neural translation systems, is often unreliable, inconsistent, and misguided.
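
As a rough illustration of the simplest metric named in the keywords, the type-token ratio (distinct word forms divided by running words), the following Python sketch may help. The regex tokenizer, the lowercasing step, and the function names are assumptions made for illustration; they are not taken from the article and do not represent the authors' implementation.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract word tokens.

    Assumption: a simple Unicode-aware regex stands in for a real tokenizer.
    """
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text: str) -> float:
    """Return the number of distinct tokens (types) divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


# Hypothetical example sentence, not drawn from the study's data.
print(type_token_ratio("the cat saw the dog"))  # 4 types / 5 tokens
```

The plain type-token ratio tends to fall as texts get longer, so scores are best compared only across texts of similar length; length-robust measures such as MTLD exist for that reason.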

Language: English
Keywords: machine translation, neural translation systems, lexical diversity, type-token ratio, measure of textual lexical diversity
Type of material: Journal article
Typology: 1.01 - Original scientific article
Organization: FF - Faculty of Arts
Publication status: Published
Publication version: Published version
Year of publication: 2022
Number of pages: 14 pp.
Numbering: Vol. 13, iss. 2, art. 93
PID: 20.500.12556/RUL-137294
UDC: 81'25'322.4
ISSN on article: 2078-2489
DOI: 10.3390/info13020093
COBISS.SI-ID: 100548099
Date of publication in RUL: 9 June 2022
Views: 1178
Downloads: 132
Citation:
BRGLEZ, Mojca and VINTAR, Špela, 2022, Lexical diversity in statistical and neural machine translation. Information [online]. 2022. Vol. 13, no. 2, art. 93. [Accessed 30 March 2025]. DOI 10.3390/info13020093. Available from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=slv&id=137294

This document is part of a journal

Title: Information
Abbreviated title: Information
Publisher: MDPI
ISSN: 2078-2489
COBISS.SI-ID: 18497046

Licenses

License: CC BY 4.0, Creative Commons Attribution 4.0 International
Link: http://creativecommons.org/licenses/by/4.0/deed.sl
Description: This is the standard Creative Commons license that gives users the most freedom for further use of the work, provided that they credit the author.
Licensing start date: 15 February 2022

Secondary language

Language: Slovenian
Keywords: strojno prevajanje, nevronski prevajalniki, leksikalna diverziteta, razmerje različnic in pojavnic, merjenje besedilne leksikalne diverzitete

Projects

Funder: ARRS - Slovenian Research Agency (Agencija za raziskovalno dejavnost Republike Slovenije)
Project number: P6-0215
Title: Slovenski jezik - bazične, kontrastivne in aplikativne raziskave (Slovene language: basic, contrastive, and applied research)

Similar works

Similar works in RUL:
  1. Neural machine translation of literary texts from English to Slovene
  2. Comparative analysis of machine translations of personal and geographical names in the Game of thrones series
  3. Analysis of machine translation of terminology in English and Slovenian texts
  4. Analysis of Cognitive Effort in Translation and Post-Editing of Machine Translation
  5. Jezik in umetna inteligenca [Language and artificial intelligence]
Similar works in other Slovenian collections:
  1. Setting up a machine translation system based on neural networks
  2. On automatic machine translation evaluation
  3. Creating a Rule-based shallow transfer machine translation system for the Slovenian-Macedonian language pair
  4. Production of machine translation system based on shallow transfer rules for the Slovenian-Croatian language pair
  5. Analysis of online machine translation services
