Quantifying the social dimensions of word meaning: behavioural data from thousands of Czech speakers

Date
Speakers
  1. James Brand
  2. Mikuláš Preininger
  3. Adam Kříž
  4. Markéta Ceháková
Abstract

The ability to represent the meanings of thousands of words is a uniquely human trait. A key challenge for researchers has been to measure the meaning of those words quantitatively, so that we can better understand how meaning is represented by the people who learn, use and process their language or languages. This line of research has relied on a multidisciplinary approach, drawing on work from corpus linguistics, computational linguistics and psycholinguistics. In this talk, we will present work from the SocioLex project, which aims to quantify the meaning of Czech words along five semantic dimensions that capture socially encoded information, i.e. how a word relates to age, gender, location, politics and valence. To do this, we asked a large and diverse sample of Czech speakers (aged 18-25, 35-45 and 60+) to rate 2,700 words (adjectives, nouns and verbs) in terms of how their meanings relate to each of the dimensions, e.g. whether the word is related to young/old, femininity/masculinity, rural/urban, liberal/conservative or negative/positive meaning. We also collected data on the semantic category membership of each word, e.g. which superordinate category the word belongs to. Additionally, we are currently collecting data from Czech speakers for the same word list, but for the words' English translation equivalents. We hope that this will provide the first large-scale quantification of how specific aspects of meaning are represented by Czech speakers.