Odvisnostno površinskoskladenjsko označevanje slovenščine: specifikacije in označeni korpusi [Surface-syntactic dependency annotation of Slovene: specifications and annotated corpora]
Ledinek, Nina (Author), Erjavec, Tomaž (Author)

Source URL: https://centerslo.si/simpozij-obdobja/zborniki/obdobja-28/

Abstract
The paper presents the first results of the JOS and SSJ projects in the area of syntax: a tagset for surface-syntactic dependency annotation and two syntactically annotated corpora. The corpora were sampled from the FidaPLUS reference corpus, and their lemmas, morphosyntactic tags, and surface-syntactic tags have been manually annotated or manually checked. The resources will be available as a dataset for research purposes under a Creative Commons licence and are intended primarily for the development of language technologies for Slovene.

Language: Slovenian
Keywords: syntactic annotation, corpora of the Slovene language, Creative Commons
Work type: Article
Typology: 1.08 - Published Scientific Conference Contribution
Organization: FF - Faculty of Arts
Year: 2009
Pages: 219-224
PID: 20.500.12556/RUL-150902
UDC: 821.163.6;367:811.163.6'322
COBISS.SI-ID: 30665261
Publication date in RUL: 25.09.2023

Record is a part of a monograph

Title: Infrastruktura slovenščine in slovenistike
Editors: Marko Stabej
Place of publishing: Ljubljana
Publisher: Znanstvena založba Filozofske fakultete
Year: 2009
ISBN: 978-961-237-333-7
COBISS.SI-ID: 248431360
Collection title: Obdobja
Collection numbering: 28

Secondary language

Language: English
Abstract: The paper introduces the first results of the JOS and SSJ projects in the area of syntax, comprising the framework for surface dependency annotation of Slovene texts and two annotated corpora. The corpora have been sampled from the Slovene reference corpus FidaPLUS and contain hand-validated lemmas and morphosyntactic and surface-syntactic annotations. These resources will be made available as downloadable datasets under a Creative Commons licence, targeted primarily at language technology research for Slovene.

Keywords: syntactic annotation, Slovene corpora, Creative Commons
