Details

Šolar, the developmental corpus of Slovene
Arhar Holdt, Špela (Author), Kosem, Iztok (Author)

Source URL: https://link.springer.com/article/10.1007/s10579-024-09758-4

Abstract
The paper presents the Šolar developmental corpus of Slovene, comprising the written language production of students in Slovene elementary and secondary schools, along with teacher feedback. The corpus consists of 5485 texts (1,635,407 words) and includes linguistically categorized teacher corrections, making the corpus unique in reflecting authentic classroom correction practices. The paper addresses the corpus compilation, content and format, annotation, availability, and its applicative value. While learner corpora are abundant, developmental corpora are less common. The paper bridges the gap by introducing the evolution from Šolar 1.0 to 3.0, emphasizing improvements in text collection, error and correction annotation, and categorization methodology. It also underlines the challenges and unresolved issues of compiling developmental corpora, most notably the lack of openly available tools and standards for different steps of the compilation process. Overall, the Šolar corpus offers valuable insights into language learning and teaching, contributing to teacher training, empirical studies in applied linguistics, and natural language processing tasks.

Language: English
Keywords: Šolar, developmental corpus, Slovene language, student writing, teacher feedback
Work type: Article
Typology: 1.01 - Original Scientific Article
Organization: FF - Faculty of Arts; FRI - Faculty of Computer and Information Science
Publication status: Published
Publication version: Version of Record
Year: 2025
Number of pages: pp. 1151-1177
Numbering: Vol. 59, iss. 2
PID: 20.500.12556/RUL-169224
UDC: 004.85:81'322
ISSN on article: 1574-020X
DOI: 10.1007/s10579-024-09758-4
COBISS.SI-ID: 204228867
Publication date in RUL: 19.05.2025

Record is a part of a journal

Title: Language resources and evaluation
Publisher: Springer Nature
ISSN: 1574-020X
COBISS.SI-ID: 224002304

Licences

License: CC BY 4.0, Creative Commons Attribution 4.0 International
Link: http://creativecommons.org/licenses/by/4.0/
Description: This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.

Secondary language

Language: Slovenian
Keywords: Šolar, razvojni korpus, slovenščina, šolsko pisanje, učiteljski popravki

Projects

Funder: ARIS - Slovenian Research and Innovation Agency
Project number: P6-0411
Name: Jezikovni viri in tehnologije za slovenski jezik (Language Resources and Technologies for Slovene)

Funder: ARIS - Slovenian Research and Innovation Agency
Project number: J7-3159
Name: Empirična podlaga za digitalno podprt razvoj pisne jezikovne zmožnosti (Empirical foundations for digitally-supported development of writing skills)
