Building and using comparable corpora for domain-specific bilingual lexicon extraction
Fišer, Darja (Author), Ljubešić, Nikola (Author), Vintar, Špela (Author), Pollak, Senja (Author)

URL - Presentation file, available at http://aclweb.org/anthology-new/W/W11/W11-12.pdf

Abstract
This paper presents a series of experiments aimed at inducing and evaluating domain-specific bilingual lexica from comparable corpora. First, a small English-Slovene comparable corpus from health magazines was manually constructed and then used to compile a large comparable corpus on health-related topics from web corpora. Next, a bilingual lexicon for the domain was extracted from the corpus by comparing context vectors in the two languages. Evaluation of the results shows that a 2-way translation of context vectors significantly improves precision of the extracted translation equivalents. We also show that it is sufficient to increase the corpus for one language in order to obtain a higher recall, and that the increase in the number of new words is linear in the size of the corpus. Finally, we demonstrate that by lowering the frequency threshold for context vectors, the drop in precision is much slower than the increase in recall.

Language: English
Type of material: Uncategorized work (r6)
Typology: 1.08 - Published Scientific Conference Contribution
Organization: FF - Faculty of Arts
Year: 2011
Pages: pp. 19-26
UDC: 81'322
COBISS.SI-ID: 46847586