Editor for this issue: Justin Fuller <justin@linguistlist.org>
LINGUIST List is hosted by Indiana University College of Arts and Sciences.
After six successful years, the Research Unit "Emerging Grammars in Language-Contact Situations" (https://hu.berlin/rueg) came to an end in March 2024.
The RUEG corpus that we created will remain available under full open access. The corpus contains naturalistic yet systematically comparable data from the language productions of 774 heritage and monolingually raised speakers in total:
- in English, German, Greek, Russian, and Turkish
- in formal and informal settings, spoken and written
- by multilingual and monolingual speakers, adolescents and adults
- in Germany, Greece, Russia, Turkey, and the US
Data in the corpus were collected via elicited narrations of a short video clip of a minor car accident. In addition to the basic transcription (over 550K words), the data are annotated for syntactic spans, lemmata, language, and part of speech. Subsets of the data are also annotated for specific phonological, lexical, morphosyntactic, and discourse-pragmatic phenomena.
We encourage everyone to keep using the RUEG corpus as a resource for research on language contact; language variation and change; majority and heritage language use; register differentiation; youth language; computer-mediated communication (CMC); lexicon, morphosyntax, and discourse-pragmatics; and much more.
Check out the RUEG corpus at https://hu.berlin/rueg-corpus
Linguistic Field(s): General Linguistics
Text/Corpus Linguistics
Subject Language(s): English (eng)
German (deu)
Greek, Modern (ell)
Russian (rus)
Turkish (tur)
Page Updated: 06-Apr-2024