
Računalniške zbirke besedil
Erjavec, Tomaž (Author)

URL:http://www.dlib.si/details/URN:NBN:SI:doc-COYZ9S0I (presentation file)

Abstract
Ordered computer text collections (corpora) are becoming an indispensable source of linguistic data. No publicly available corpora exist yet for Slovene. The article gives a historical overview of the development of computer corpora, their typology, and their fields of application. It discusses two questions in more detail: the standardization of corpus encoding, and the tools for developing and exploiting corpora. The second part of the article is devoted to the MULTEXT-East project (Multilingual Text Tools and Corpora for Central and Eastern European Languages), which also includes Slovene. Most attention is given to presenting the corpus and the morphological and syntactic descriptions developed within the project, and to the currently available results. The concluding part discusses some possibilities for developing corpus linguistics in Slovenia.

Language:Slovenian
Keywords:computer science, linguistics, computer text collections, computer corpora
Work type:Not categorized (r6)
Typology:1.01 - Original Scientific Article
Organization:FF - Faculty of Arts
Year:1997
Publisher:Slavistično društvo Slovenije
Number of pages:pp. 81-96
Numbering:Vol. 42, No. 2/3
UDC:81:681.3
ISSN on article:0021-6933
COBISS.SI-ID:2940770
Views:412
Downloads:99

Record is a part of a journal

Title:Jezik in slovstvo
Shortened title:Jez. slovst.
Publisher:Slavistično društvo Slovenije
ISSN:0021-6933
COBISS.SI-ID:746756

Secondary language

Language:English
Abstract:
Ordered and computerized text collections (corpora) are becoming an indispensable source of linguistic data. Freely available corpora of the Slovene language do not exist. The article gives a historical overview of the development of computer corpora, their typology and fields of application. Two aspects of corpora are discussed next: the standardization of their encoding and the tools for their development and exploitation. The second part of the article gives an overview of the MULTEXT-East project (Multilingual Text Tools and Corpora for Central and Eastern European Languages), which also includes the Slovene language. The focus of the presentation is on the corpus and morphosyntactic descriptions developed in the project and on its currently available results. Finally, some possibilities for developing the field of corpus linguistics in Slovenia are discussed.

