Corpus-driven analysis of obsolescence of multi-word expressions in Late Modern English

  1. Ondřej Tichý

This paper explores a new methodology for extracting multi-word constructions that were once common but have since become obsolete (Tichý, 2018) from large corpora, especially the Google n-grams dataset of the Google Books project.

It proceeds from a novel method for bottom-up extraction of multi-word constructions (Wahl & Gries, forthcoming) to the formulation of a semi-automatic procedure for identifying constructions that may have become lost or obsolete in the course of the last three centuries, from the Late Modern era to Present-day English (1700–2000). The procedure makes use of both relative frequencies, to establish currency, and collocational measures such as mutual information.
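The two kinds of measure named above can be illustrated with a minimal sketch. It is not the procedure developed in the paper: pointwise mutual information stands in for the collocational measure, and the function names, the per-million normalisation and the thresholds (min_peak, max_final_ratio) are all hypothetical choices made for illustration.

```python
import math


def relative_frequency(count, total):
    """Occurrences per million tokens in one period (assumed normalisation)."""
    return count * 1_000_000 / total


def pmi(pair_count, w1_count, w2_count, total):
    """Pointwise mutual information (base 2) of a two-word sequence.

    Equivalent to log2(P(w1 w2) / (P(w1) * P(w2))), with each probability
    estimated as count / total.
    """
    return math.log2(pair_count * total / (w1_count * w2_count))


def is_obsolescence_candidate(freq_by_period, min_peak=1.0, max_final_ratio=0.05):
    """Flag an expression that was once current but has almost vanished.

    freq_by_period: per-million frequencies in chronological order.
    min_peak and max_final_ratio are hypothetical thresholds.
    """
    peak = max(freq_by_period)
    final = freq_by_period[-1]
    return peak >= min_peak and final <= peak * max_final_ratio
```

An expression would count as a candidate when its frequency once reached the currency threshold and its latest frequency is only a small fraction of that peak; a high association score would then suggest that the word sequence was a genuine unit and not a chance co-occurrence. Candidates flagged in this way would still need manual inspection, in keeping with the semi-automatic character of the procedure.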

In the analytical part, the paper focuses on selected constructions of the type there needs no (as in “there needs no priest”), whose obsolescence is indicative of wider structural and typological changes, in this case a decline of impersonal constructions.

Conditions, circumstances and consequences of the loss of such constructions are considered with a focus on the competing forms expressing similar functions that may be recognized as supplanting the old forms.