This paper presents a corpus of Slovene tweets and an analysis of non-standard Slovene as used on the Twitter social network. The corpus, which comprises tweets from the first four years of Twitter's existence, contains 360,000 tweets, or 5 million tokens. The Slovene used in the analysed tweets differs substantially from that of ccKRES, the balanced corpus of standard Slovene. The distinguishing features of "Twitter Slovene" are a more colloquial, phonetic orthography, frequent use of spoken language elements and an abundance of foreign words.