The Boarnsterhim Corpus (BHC) is a corpus for researchers and students who want to study the phonetics and phonology of spoken West Frisian, as well as its variation and change. The BHC will be embedded in a larger database of Frisian corpora, which will make it possible to study the effects of lexical frequency. The first data will be published in 2018, and the corpus will be extended in the coming years.

Students who are interested in an internship on one of the following (or related) topics are welcome to contact Dr. Marjoleine Sloos: bilingualism, phonetics, sociolinguistics, phonology, reading skills, language attitudes, corpus linguistics.

Interns and volunteers who would like to help construct the corpus (recordings, orthographic transcriptions in Dutch and/or Frisian, phonetic transcriptions, POS tagging) are also invited to contact Dr. Marjoleine Sloos.


Is Frisian converging towards Dutch? That question has often been asked, and some evidence seems to support the idea. To study whether the sound system of Frisian was really changing towards Dutch, the Boarnsterhim Corpus (henceforth BHC) was recorded in 1982-1984. The studies that followed suggest that the Frisian sound system was stable; in some respects, the distinction between Frisian and Dutch even became stronger. To investigate whether this trend continues, the BHC2 is being recorded in 2017-2019. Recordings and analyses of four generations of speech provide the opportunity to investigate the stability, variation, and change of the Frisian sound system over 100 years.

In both periods, speakers from three generations of the same families were recorded: grandmother, mother, and daughter; or grandfather, father, and grandson. The two younger generations of the first period overlap with the two oldest generations of the second period. A unique property of this corpus is that, as far as possible, half of the speech in the overlapping generations of the BHC1 and the BHC2 comes from the same individuals.

All speakers were recorded twice: once in Frisian, with a native speaker of Frisian as interviewer to ensure informal Frisian speech, and once in Dutch, with a monolingual Dutch interviewer to avoid the use of Frisian. Each recording consists of 20 read sentences, a read story (2-3 minutes), and an interview of about 40 minutes about the speaker’s use of Frisian, language attitudes, and daily activities. For the BHC1, the data were recorded on cassette tapes, which were digitized in 2016. The BHC2 is a replication of the BHC1, with the same number of speakers and the same age groups.

With the help of research assistants, interns, and volunteers, the data are annotated in the speech processing software Praat. The annotation segments the recordings into phrases, words, and sounds, time-aligned with millisecond accuracy. There are separate tiers (levels) for:

  • orthography
  • words
  • phonemes
  • phonetic realization
  • deletion of speech sounds
  • specific phonological processes
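As an illustration of how time-aligned tiers of this kind can be queried, the following is a minimal sketch in Python. It does not read Praat files and does not reflect the actual BHC file layout: the tier names, the labels, and the helper `intervals_within` are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One labelled stretch of speech on a tier (times in seconds)."""
    start: float
    end: float
    label: str


def intervals_within(tier, start, end):
    """Return the intervals of `tier` that lie fully inside [start, end]."""
    return [iv for iv in tier if iv.start >= start and iv.end <= end]


# Hypothetical tier names and invented placeholder labels, for illustration only.
annotation = {
    "words": [
        Interval(0.000, 0.412, "word1"),
        Interval(0.412, 0.800, "word2"),
    ],
    "phonemes": [
        Interval(0.000, 0.150, "p1"),
        Interval(0.150, 0.412, "p2"),
        Interval(0.412, 0.800, "p3"),
    ],
}

# Find the phoneme intervals aligned with the first word
# and compute their durations in milliseconds.
first_word = annotation["words"][0]
segments = intervals_within(annotation["phonemes"], first_word.start, first_word.end)
labels = [iv.label for iv in segments]
durations_ms = [round((iv.end - iv.start) * 1000) for iv in segments]
```

Because every tier shares the same time axis, a lookup like this is one way to link a word to its phonemes, its phonetic realization, or any marked deletions within the same time span.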


This corpus is highly suitable for research in the following fields:

  • bilingualism and code-switching
  • long-term language change, especially in bilingualism and minority languages
  • the phonetics and phonology of Frisian
  • real-time vs. apparent-time studies of language change
  • studies of the development of reading competence in Frisian
  • frequency effects in language
  • language and ageing
  • language attitudes over time



Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO, “The Netherlands Organisation for Scientific Research”): VENI grant for Dr. Marjoleine Sloos.


Nederlandse Organisatie voor Zuiver Wetenschappelijk Onderzoek (currently Nederlandse Organisatie voor Wetenschappelijk Onderzoek, NWO, “The Netherlands Organisation for Scientific Research”)

Stichting Taalwetenschap

Fryske Akademy, for the endowed chair of Frisian

Fryslân Bank



Project leader

Dr. Marjoleine Sloos


Ir. Eduard Drenth


Dr. Wilbert Heeringa

Orthographic transcriptions Frisian

Eke Born, Truus Bremer, Kobe Flapper, Renske Hooijenga, Hilde de Jong B.A., Dik Nauta, Wytse Willem Pel, Janneke Spoelstra M.A., Tineke Tamminga, Helga Zandberg

Orthographic transcriptions Dutch

Grietje Keizer-Heeringa, Theresia Schreiber, Edmée Valk-Boon B.A., Rick Weggen

Phonetic transcriptions

Andrea Garcia Ariza M.A., Tessa Hummel B.A., Hilde de Jong, Mirte Koppenberg, Wilma Stienstra



Tony Feitsma


Els van der Geest M.A., Dr. Frits van der Kuip, Irénke Meekma M.A.


Sloos, Marjoleine, Eduard Drenth & Wilbert Heeringa. Forthcoming. The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study. In Proceedings of the 11th edition of the Language Resources and Evaluation Conference, 7-12 May 2018, Miyazaki (Japan).

Feitsma, Antonia. 1989. Changes in the pronunciation of Frisian under the influence of Netherlandic. In Deprez, K. (ed.), Language and Intergroup Relations in Flanders and in the Netherlands, 181-193. Dordrecht: Foris.

Meekma, Irénke. 1989. Frouljuspraat en it lytse ferskil. Oer útspraakferoaring yn ‘e sandhi by froulju en manlju. It Beaken 51, 115-29.

Feitsma, Tony, Els van der Geest, Frits J. van der Kuip & Irénke Meekma. 1987. Variations and development in Frisian sandhi phenomena. International Journal of the Sociology of Language 64, 81-94.

van der Kuip, Frits J. 1986. Syllabisearring yn it Frysk en it Hollânsk fan Fryskpraters. Tydskrift foar Fryske Taalkunde 2, 69-92.