The paper presents a collection of language resources from the period 1845–1918, comprising the IMP resources for historical Slovene written in the Gaj alphabet. The resources consist of three corpora (annotated by hand, semi-automatically, and automatically) and a lexicon built from the hand-annotated words in the corpora. Corpora from this period are of interest both as a source for enriching the Dictionary of Standard Slovene and for studying how orthography differed from today’s standard: at the beginning of the period, spelling still differed considerably from present-day usage, but by its end the differences had practically disappeared.