Corpora of the Research Centre on Multilingualism (SFB 538)

The following table gives an overview of all corpora which were constructed at the Research Centre on Multilingualism (SFB 538) between 1999 and 2011 and which are now hosted by the Hamburg Centre for Language Corpora (HZSK). Click on the More information tag to obtain detailed information about the size of a resource, about the procedure for getting access to it, etc.
Most corpora are EXMARaLDA corpora. To learn more about EXMARaLDA corpora we recommend that you

Corpus name
Project / Data Owner
Type

Short description

Keywords

Language(s)
Hamburg Adult Bilingual Language (HABLA) More information
E11 / Tanja Kupisch
spoken/audio/exmaralda
Audio recordings (semi-spontaneous interviews) with German/Italian and German/French bilingual speakers aged approx. 15-55 years at the time of recording. The simultaneous bilinguals with German and French/Italian as L1s have been recorded twice, i.e. once for each language. The successive bilinguals with German as L1 and French/Italian as L2, or French/Italian as L1 and German as L2, all have AOAs between 11 and 38 years and have been recorded using their L2.
L2 data
adult bilingualism
cross-sectional data
language attrition
simultaneous bilingualism
successive bilingualism
German, Standard (deu), French (fra), Italian (ita)
Hamburg Corpus of Polish in Germany (HamCoPoliG) More information
H8 / Bernhard Brehmer
spoken/audio/exmaralda
Audio recordings of German/Polish bilingual and Polish monolingual adults (16-46 years). Recordings of semi-spontaneous data (3 topics) and retelling of a picture story.
L1 data
adult L2 acquisition
adult bilingualism
child L2 acquisition
cross-sectional data
language attrition
simultaneous bilingualism
successive bilingualism
Polish (pol)
Hamburg Corpus of Argentinean Spanish (HaCASpa) More information
H9 / Christoph Gabriel
spoken/audio/exmaralda
Audio and video recordings of experimental/read and spontaneous speech from adult speakers of Porteño Spanish in Argentina. Speakers are 18-69 years old and from two geographic areas. For the intonational experiments, there are audio recordings only, whereas some of the free interviews and map tasks feature video recordings. The material used as stimuli in the experiments is available with references encoded in the transcriptions.
contact variety
cross-sectional data
language contact
regional variety
Spanish (spa)
Dolmetschen im Krankenhaus (DiK) More information
K2 / Kristin Bührig, Bernd Meyer
spoken/audio/exmaralda
Audio recordings of various kinds of doctor-patient communication in hospitals. There are both monolingual conversations in German, Portuguese and Turkish, recorded in the respective country, and interpreted conversations recorded in Germany (i.e. in German-Turkish, German-Portuguese, and German-Portuguese/Spanish), about 15-20 recordings of each kind. The persons interpreting are bilingual hospital employees or relatives of the patients, who are all adults living in Germany but with varying knowledge of German.
communication in institutions
community interpreting
consecutive interpreting
doctor-patient communication
interpreted communication
German, Standard (deu), Portuguese (por), Turkish (tur)
Consecutive and Simultaneous Interpreting (CoSi) More information
K6 / Bernd Meyer
spoken/audio+video/exmaralda
Audio and video recordings of three lectures in Portuguese, one simultaneously and two consecutively professionally interpreted into German. For the simultaneously interpreted lecture there are different recordings and transcriptions for the participants.
consecutive interpreting
expert-laymen communication
professional interpreting
simultaneous interpreting
German, Standard (deu), Portuguese (por)
Community Interpreting Database Pilot Corpus (ComInDat) More information
ComInDat / Philipp Angermeyer, Kristin Bührig, Bernd Meyer
spoken/audio+video/exmaralda
Audio and video recordings of various types of community interpreted discourse (doctor-patient communication, simulated doctor-patient communication, courtroom communication) in German (simulated and authentic doctor-patient communication) and US (courtroom communication) institutions with varying community languages. Video recordings only exist for the simulated communication. For the authentic interpreted doctor-patient communication, no audio files will be made available.
communication in institutions
community interpreting
courtroom communication
doctor-patient communication
interpreted communication
German, Standard (deu), English (eng), Haitian (hat), Polish (pol), Portuguese (por), Russian (rus), Spanish (spa), Turkish (tur)
EXMARaLDA Demo Corpus More information
Z2 / Hamburger Zentrum für Sprachkorpora
spoken/audio+video/exmaralda
A selection of short audio and video recordings in various languages to be used for instruction or demonstration of the EXMARaLDA system.
German, Standard (deu), English (eng), French (fra), Italian (ita), Norwegian (nor), Polish (pol), Spanish (spa), Swedish (swe), Turkish (tur), Vietnamese (vie)
Hamburg Map Task Corpus (HAMATAC) More information
Z2 / Hamburger Zentrum für Sprachkorpora
spoken/audio/exmaralda
Audio recordings of map tasks with adult L2 users of German. The speakers' L1 and their L2 proficiencies vary. The maps used for the tasks are available.
L2 data
adult L2 acquisition
adult bilingualism
learner corpus
map task
simultaneous bilingualism
successive bilingualism
task-oriented communication
German, Standard (deu)
Deutscher und Französischer doppelter Erstspracherwerb (DUFDE) More information
E2 / Jürgen Meisel
spoken/video/exmaralda
Video recordings of seven German/French simultaneous bilingual children, starting at approx. 1 year and 6 months. One or two recordings each month until approx. 5 years and 6 months (and some later recordings for some subjects). In each recording session (interviewer/child interaction) the child is addressed in both languages in one French and one German part.
L1 data
L2 data
child bilingualism
child language acquisition
longitudinal data
simultaneous bilingualism
German, Standard (deu), French (fra)
Bilingualer Portugiesisch-Deutscher Erstspracherwerb (BIPODE) More information
E2 / Jürgen Meisel
spoken/video/exmaralda
Video recordings of three German/Portuguese simultaneous bilingual children, starting at approx. 1 year and 6 months. One or two recordings each month until approx. 5 years and 6 months. In each recording session (interviewer/child interaction) the child is addressed in both languages in one Portuguese and one German part.
L1 data
L2 data
child bilingualism
child language acquisition
longitudinal data
simultaneous bilingualism
German, Standard (deu), Portuguese (por)
Parameterfixierung im Deutschen und Spanischen (PAIDUS) More information
E3 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings of five German and five Spanish speaking monolingual children. For the German children there are about 30 recordings (interviewer/child interaction) per child, on average starting at 9 months and ending at 3 years; for the Spanish children there are on average 15 recordings per child ending at 2 years.
child language acquisition
longitudinal data
monolingual data
German, Standard (deu), Spanish (spa)
PhonBLA Longitudinalstudie Hamburg More information
E3 / Conxita Lleó
spoken/audio+video/exmaralda
Audio and video recordings of four German/Spanish bilingual children starting at approx. 1 year and 6 months and ending at age 6-7 years, with about 100 recordings (interviewer/child interaction) of each child, half of them in each language.
L1 data
L2 data
child bilingualism
child language acquisition
longitudinal data
simultaneous bilingualism
German, Standard (deu), Spanish (spa)
Phonologie-Erwerb Deutsch-Spanisch als Erste Sprachen (PEDSES) More information
E3 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings of three German/Spanish simultaneous bilingual children starting at approx. 1 year and ending at 2 or 3 years. There are 20-50 recording sessions (interviewer/child interaction) per child, half of them conducted in German and half in Spanish.
L1 data
L2 data
child bilingualism
child language acquisition
longitudinal data
simultaneous bilingualism
German, Standard (deu), Spanish (spa)
Phon-CL2 More information
E3 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings of 15 German subjects in Spain (5 to 36 years old) with Spanish as L2 and AOA > 2 years. Recording sessions in Spanish based on picture naming and story telling etc. Rich metadata on language use and attitudes in the family submitted by the parents.
L2 data
adult bilingualism
child L2 acquisition
child bilingualism
child language acquisition
successive bilingualism
German, Standard (deu), Spanish (spa)
Catalan in a bilingual context (PhonCAT) More information
H6 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings of prompted, read and spontaneous speech data from L1 Catalan speakers from Barcelona. The data is stratified according to three different city districts and three age groups. Speakers' ages vary from approx. 5 to 45 years.
L1 data
adult bilingualism
bilingual society
child L2 acquisition
child bilingualism
cross-sectional data
language contact
simultaneous bilingualism
successive bilingualism
Catalan-Valencian-Balear (cat)
Skandinavische Semikommunikation (SkandSemiko) More information
K5 / Kurt Braunmüller
spoken/audio/exmaralda
Audio recordings of Scandinavian speakers interacting using their respective languages in various contexts: bilingual radio broadcasts (mainly from the Öresund area), discussions from the meeting of a Nordic organisation, classroom discourse from a Scandinavian school in Germany, two semi-spontaneous conversations between Danish and Norwegian teenagers, and a Danish course for Swedish students in southern Sweden. Most speakers have Danish, Norwegian or Swedish as L1 and varying receptive knowledge of the other languages, whereas e.g. Icelandic speakers use a Scandinavian language as a foreign language.
adult language acquisition
child bilingualism
receptive multilingualism
semi-communication
youth language
Danish (dan), Norwegian (nor), Swedish (swe)
ALCEBLA More information
E3 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings in Spanish with 23 German/Spanish simultaneous bilingual children living in Germany and attending the Spanish complementary school at the first level. There are 1-6 recordings with each child; 11 children were also recorded before they attended the Spanish complementary school. All recordings feature elicited speech: a picture naming task, a story telling task, a morphosyntactic test, a lexical test, and the HAVAS 5. Rich metadata on language use and attitudes in the family submitted by the parents.
child bilingualism
child language acquisition
simultaneous bilingualism
German, Standard (deu), Spanish (spa)
Simuliertes Dolmetschen im Krankenhaus (SimDik) More information
T5 / Kristin Bührig, Bernd Meyer
spoken/audio+video/exmaralda
Audio and video recordings of simulated interpreted doctor-patient communication in Polish, Romanian or Russian and German. The interpreters are bilingual employees at hospitals participating in an interpreting training program developed in the project and based on DiK data.
communication in institutions
community interpreting
consecutive interpreting
doctor-patient communication
expert-laymen communication
interpreted communication
German, Standard (deu), Polish (pol), Romanian (ron), Russian (rus)
CHILD-L2 More information
E2 / Jürgen Meisel
spoken/video/exmaralda
Video recordings of French and German children who start acquiring German or French as an L2 with varying AOAs, mainly of approx. 3 years. Some families use additional languages, mainly English. The child is addressed in the L2 in the recording sessions, which are of the type interviewer/child interaction. On average, for each child there is data from two occasions each year during two years.
L2 data
child L2 acquisition
child bilingualism
child language acquisition
longitudinal data
successive bilingualism
German, Standard (deu), French (fra)
Zweitspracherwerb Italienischer und Spanischer Arbeiter (ZISA) More information
E2 / Jürgen Meisel
spoken/audio/exmaralda
Audio recordings of five adult learners of German as an L2 with L1s Spanish, Italian and Portuguese. Recording sessions (interview/conversation) in German once or twice a month over approx. two years starting 3-14 weeks after their arrival in Germany.
L2 data
adult L2 acquisition
adult bilingualism
learner corpus
longitudinal data
successive bilingualism
German, Standard (deu)
Baskischer und Spanischer doppelter Erstspracherwerb (BUSDE) More information
E2 / Jürgen Meisel
spoken/video/other
Video recordings of Basque/Spanish bilingual children starting at approx. 1 year and 6 months. One or two recordings (interviewer/child interaction) each month until approx. 5 years.
child bilingualism
child language acquisition
longitudinal data
Basque (eus), Spanish (spa)
PhonBLA Querschnittsstudie Madrid More information
E3 / Conxita Lleó
spoken/audio+video/exmaralda
Video recordings of 71 Spanish/German simultaneous bilingual children living in Madrid (Spain). Recordings (interviewer/child interaction) in German and in Spanish for most children.
L1 data
L2 data
child bilingualism
child language acquisition
cross-sectional data
simultaneous bilingualism
German, Standard (deu), Spanish (spa)
PhonMAS More information
E3 / Conxita Lleó
spoken/audio/exmaralda
Audio recordings (interviewer/child interaction) of Spanish monolingual children aged between 2 and 6 years. The recordings were made in five different years; 11 children were recorded in two different years. Comparable data for Madrid-PhonBLA.
child language acquisition
cross-sectional data
monolingual data
Spanish (spa)
TÜ_DE-cL2-Korpus More information
E4 / Monika Rothweiler
spoken/video/exmaralda
Video recordings in German of eight bilingual children with L1 Turkish and L2 German with AOA of 3-4 years. Several recordings of spontaneous speech (play) over 7-28 months at ages approx. 3 to 6.5 years, and of elicited language with focus on article usage. Comparable data for the TÜ_DE-L1-Korpus.
L2 data
child L2 acquisition
child bilingualism
child language acquisition
longitudinal data
successive bilingualism
German, Standard (deu)
TÜ_DE-L1-Korpus More information
E4 / Monika Rothweiler
spoken/audio/exmaralda
Audio recordings (spontaneous and elicited language) in Turkish with twelve bilingual children with L1 Turkish and L2 German with AOA of 3-4 years. Comparable data for the TÜ_DE-cL2-Korpus.
L1 data
child L2 acquisition
child bilingualism
child language acquisition
successive bilingualism
Turkish (tur)
Rehbein-ENDFAS/Rehbein-SKOBI-Korpus More information
E5 / Jochen Rehbein
spoken/audio/exmaralda
Audio recordings of evocative field experiments (picture story, retelling, spontaneous discourse etc.) with Turkish/German bilingual children and monolingual Turkish / monolingual German children as control data.
L1 data
L2 data
child bilingualism
child language acquisition
contact variety
monolingual data
German, Standard (deu), Turkish (tur)
ENDFAS/SKOBI Gold Standard More information
E5 / Jochen Rehbein
spoken/audio/exmaralda
Audio recordings in Turkish and/or German of one Turkish monolingual child (8 years), one German monolingual child (7 years) and one German/Turkish bilingual child (5 years). Small demo excerpt from the Rehbein-ENDFAS/Rehbein-SKOBI-Korpus.
L1 data
L2 data
child bilingualism
child language acquisition
contact variety
monolingual data
German, Standard (deu), Turkish (tur)
Faroese Danish Corpus Hamburg (FADAC Hamburg) More information
K8 / Kurt Braunmüller
spoken/audio/exmaralda
Audio recordings of semi-structured interviews with bilingual speakers (aged 16-89 years) from various geographical areas on the Faroe Islands. For 37 of the 56 subjects there are recordings in both their L1 Faroese and their L2 Danish.
L1 data
L2 data
adult bilingualism
bilingual society
contact variety
cross-sectional data
language contact
semi-structured interviews
successive bilingualism
Danish (dan), Faroese (fao)
Hamburg Corpus of Old Swedish with Syntactic Annotations (HaCOSSA) More information
H3 / Kurt Braunmüller
written/tei
Religious and secular prose, law texts, non-fiction literature (geographical, theological, historic, natural science), diploma.
historical texts
translated texts
Old Swedish (swe)
Covert translation: popular science More information
K4 / Juliane House
written/tei
Translation corpora of original texts with translations and comparable texts from the genre popular scientific prose.
comparable corpus
parallel corpus
popular science texts
translated texts
German, Standard (deu), English (eng)
Covert Translation: business communication (old) More information
K4 / Juliane House
written/tei
Translation corpora of original texts with translations and comparable texts from the genre external business communication.
business communication
comparable corpus
parallel corpus
popular science texts
translated texts
German, Standard (deu), English (eng)
Covert Translation: business communication (new) More information
K4 / Juliane House
written/tei
Translation corpora of original texts with translations and comparable texts from the genre external business communication.
business communication
comparable corpus
parallel corpus
popular science texts
translated texts
German, Standard (deu), English (eng)