In building reference corpora, we strive to be as representative and balanced as possible by including as many different textual genres as we can, in proportions that reflect language reality. Text selection is one of the thorniest issues, particularly when constructing a spoken corpus, as we cannot at any one time objectively survey the whole of spoken production (or reception) and determine the quantitative relations between different kinds of spoken text. Taxonomies of texts can be approached in different ways (with regard to their structural characteristics, content, purpose, and/or the situation in which the text arose), depending primarily on the importance of the corpus, on how demanding its construction is, and on other circumstances, such as financial resources, the size of the team involved, and technical capabilities.