Project

CHYLSA

Children's and Youth Literature Sentiment Analysis
2023–2026


About CHYLSA

Advanced sentiment analysis for understanding affective-aesthetic responses to literary texts: A computational and experimental psychology approach to children’s literature
Phase 1: 2020–2023; Phase 2: 2023–2026

We are researchers based in Mainz (Johannes Gutenberg-Universität) and Berlin (Freie Universität), working in Book Studies, Digital Humanities, Psychology and Literary Studies, who share an interest in investigating emotional reactions to literary texts. CHYLSA is part of the priority program (SPP) Computational Literary Studies (CLS), funded by the German Research Foundation (DFG).


CHYLSA project description FU Berlin

CHYLSA project description JGU Mainz

CHYLSA Team

Berlin

Mainz

Collaborators

Research

Emotional involvement is of pivotal importance when children learn to read, tell, and share stories. This crucial dimension of cultural literacy has received surprisingly little attention within literary studies, psychology, and digital humanities. Taking a large-scale, data-driven approach, we develop and validate sentiment analysis methods for computational literary studies that combine knowledge from literary studies and neurocognitive psychology.

Corpora

We collect data and build corpora on German children's literature: novels, fairy tales, poetry and more.

Sentiment Analysis

We are developing and applying methods for Sentiment Analysis of literary texts.

Empirical Studies

We conduct empirical studies with children investigating their emotional reactions to literature.

Corpora

childPoeDE

The childPoeDE corpus is a collection of 1082 German children's poems. The poems were taken from anthologies published between 1991 and 2019. Corpus metadata includes information about the author, the poem's length, data on case, punctuation, layout and rhyme, type-token ratio (TTR and MATTR) and lexical density. It also provides token-level metadata, namely word length and position, POS tags at different levels of granularity, and data on onomatopoeia and sonority. For more information, see our publication of the childPoeDE metadata on Zenodo and the data paper on childPoeDE in the Journal of Open Humanities Data.
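
As a rough illustration of the two lexical diversity measures named above, the following Python sketch computes TTR and MATTR for a list of tokens. It is not the code used to produce the childPoeDE metadata: the function names, the lowercased whitespace tokenization and the window size are assumptions made for this example only.

```python
from typing import Sequence


def ttr(tokens: Sequence[str]) -> float:
    """Type-token ratio: number of distinct tokens divided by total tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens: Sequence[str], window: int = 50) -> float:
    """Moving-average TTR: mean TTR over all sliding windows of a fixed size.

    The default window of 50 tokens is an assumption for this sketch.
    Texts no longer than the window fall back to plain TTR.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) <= window:
        return ttr(tokens)
    scores = [ttr(tokens[i:i + window]) for i in range(len(tokens) - window + 1)]
    return sum(scores) / len(scores)


# Toy example (hypothetical text, lowercased and split on whitespace).
tokens = "der hund und die katze und der vogel".split()
print(ttr(tokens))              # 6 types / 8 tokens
print(mattr(tokens, window=4))  # mean TTR over the 5 windows of 4 tokens
```

Plain TTR tends to drop as texts get longer, whereas MATTR averages over windows of equal size, which makes poems of different lengths easier to compare.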

childTale-A

Within childTale-A we annotated textually encoded emotions in a core set of the Grimms' Children's and Household Tales. In total we manually annotated 80 fairy tales with more than 5,000 sentences on the two dimensions valence and arousal as well as on the occurrence of the six basic emotions according to Ekman. Based on these annotations we introduce four aggregated measures for the analysis of textually encoded emotions: Average Valence, Emotional Potential, Emotional Arc, and Emotion Profile. We used these measures to analyze the childTale-A corpus with regard to the purported cruelty vs. optimism of fairy tales. On average, more than 50% of the sentences in a fairy tale are emotional, without a clear tendency toward negative sentiment, while emotion trajectory patterns vary. The findings underscore the role of emotions as plot-driving elements in fairy tales as a highly schematized historical genre. For more information, refer to our publication "A Fairy Tale Gold Standard" in the Zeitschrift für digitale Geisteswissenschaften and to the childTale-A data publication on Zenodo. We are currently collecting additional data to compare the textually encoded emotions with reader responses.
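
To give a sense of how sentence-level annotations can be aggregated, the sketch below computes a mean valence per tale and a simple segment-based arc. The exact definitions of Average Valence and Emotional Arc are given in the publication, not on this page, so the formulas, the function names, the number of segments and the toy valence scale used here are assumptions for illustration only.

```python
from statistics import mean
from typing import List, Sequence


def average_valence(valences: Sequence[float]) -> float:
    """Mean of the sentence-level valence ratings of one text."""
    return mean(valences)


def emotional_arc(valences: Sequence[float], n_segments: int = 5) -> List[float]:
    """Mean valence per consecutive segment of the text.

    Splitting the text into equal-sized segments is an assumption of this
    sketch, not necessarily the definition used for childTale-A.
    """
    n = len(valences)
    if n_segments <= 0 or n_segments > n:
        raise ValueError("n_segments must be between 1 and the number of sentences")
    # Integer boundaries guarantee that every segment holds at least one sentence.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [mean(valences[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


# Hypothetical ratings for a ten-sentence text (toy scale, not real corpus data).
valences = [-1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
print(average_valence(valences))
print(emotional_arc(valences, n_segments=5))
```

Normalizing every text to the same number of segments makes arcs of tales with different lengths comparable, at the cost of smoothing over short emotional turns.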

4Books

4Books is an annotated corpus of four children's and young adult books, namely "Oma!", schreit der Frieder by Gudrun Mebs, Jim Knopf und Lukas der Lokomotivführer by Michael Ende, Das Schicksal ist ein mieser Verräter by John Green and Harry Potter und der Halbblutprinz by Joanne K. Rowling. In the 4Books annotation study, 20 adults each read one of the four books in full and rated the emotional impact of every sentence in terms of valence and arousal. Additional ratings were collected after each chapter and at the end of the book. For more information, see our contribution to the IGEL conference:
Lüdtke, J. & Jacobs, A. M. (2023). On a rollercoaster with Frieder, Jim, Hazel and Harry: Identifying emotional arcs in reader responses to children and youth books. Talk at 19th IGEL conference, Monopoli, September 2023 (IGEL website).

childLex

For our research we also work with the childLex corpus created by Sascha Schröder et al. For more information, see:

Schroeder, S., Würzner, K.-M., Heister, J. et al. (2015). childLex: a lexical database of German read by children. Behavior Research Methods, 47, 1085–1094. DOI: https://doi.org/10.3758/s13428-014-0528-1.

Sentiment Analysis

Investigating emotions in children's and young adult literature is at the core of our project. To this end we use empirical studies as well as data-driven approaches. For the latter we apply and compare different approaches to Sentiment and Emotion Analysis. For Sentiment Analysis we use dictionary-based approaches like SentiArt, developed by Arthur Jacobs, as well as transformer-based approaches (BERT models). We are currently also exploring the potential of Large Language Models (LLMs) like GPT in this field. Traditionally, psychological concepts of sentiment and emotion have played little role in computational approaches. The CHYLSA project tries to fill this gap in two ways: by integrating psychological knowledge into the process of Sentiment Analysis, and by validating results in experimental studies that contrast data-based results with human evaluations.
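
The general idea behind a dictionary-based approach can be sketched in a few lines of Python. This is a generic toy example, not SentiArt and not our pipeline: the lexicon entries, their scores and the function name are invented for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical mini-lexicon mapping lowercased German words to valence scores
# in [-1, 1]. The values are made up for this example.
TOY_LEXICON: Dict[str, float] = {
    "glücklich": 0.8,
    "freude": 0.9,
    "lachen": 0.7,
    "traurig": -0.7,
    "angst": -0.8,
    "böse": -0.6,
}


def sentence_valence(sentence: str, lexicon: Dict[str, float]) -> Optional[float]:
    """Average the lexicon scores of all words in the sentence found in the lexicon.

    Returns None when no word is covered, so that "no information" is not
    confused with a neutral score of 0.
    """
    words = re.findall(r"\w+", sentence.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


print(sentence_valence("Der Junge war traurig und hatte Angst.", TOY_LEXICON))
print(sentence_valence("Das Haus steht am See.", TOY_LEXICON))  # None: no lexicon hit
```

The sketch matches surface forms only, so inflected forms such as "traurige" are missed unless the text is lemmatized first. Transformer-based models avoid this kind of lookup and classify the sentence in context, which is one reason the two families of methods are worth comparing against human ratings.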

Empirical Studies

In our empirical studies we investigate how children experience stories and poems. We invite 5- to 12-year-olds to read or listen to stories and poems that we draw from our corpora. The texts are carefully selected according to specific criteria, which is necessary for collecting well-controlled data. We are particularly interested in how children perceive the presented piece of literature in terms of valence (positive and negative feelings), liking, and immersion. Furthermore, we study the development of emotional vocabulary, which grows rapidly throughout childhood. We have completed a study on how preschoolers perceive stories with and without poetic justice. Currently, we are running a study on poems and are looking for 9-year-old German-speaking participants who would enjoy listening to a few poems and sharing their experience with us.

Want to participate in our study?

For our Toya study we are currently looking for 9-year-old children whose native language is German.
Reward: 20€ book voucher.


Publications

Journal Articles

  • Herrmann, B. & Lüdtke, J. (2023). A Fairy Tale Gold Standard. Annotation and Analysis of Emotions in the Children's and Household Tales by the Brothers Grimm. Zeitschrift für digitale Geisteswissenschaften (ZfdG). DOI: https://doi.org/10.17175/2023_005.
  • Jacobs, A. M., Herrmann, B., Lauer, G., Lüdtke, J., & Schroeder, S. (2020). Sentiment Analysis of Children and Youth Literature: Is There a Pollyanna Effect? Frontiers in Psychology, 11. DOI: https://doi.org/10.3389/fpsyg.2020.574746.
  • Lehmann, M., Heumann, A., Kuijpers, M. M., Lauer, G. & Lüdtke, J. (2023). The ChildPoeDE Corpus: 1082 German Children’s Poems for Computational and Experimental Studies on Poetry Reception. Journal of Open Humanities Data. DOI: https://doi.org/10.5334/johd.102.
  • Luther, L. (2023). Positive Emotional Reaction to Poetic Justice already established at a Preschool Age. (Publication in progress.)

Conference Contributions

  • Lüdtke, J. & Jacobs, A. M. (2023). On a rollercoaster with Frieder, Jim, Hazel and Harry: Identifying emotional arcs in reader responses to children and youth books. Talk at 19th IGEL conference, Monopoli, September 2023 (IGEL website).
  • Rebora, S., Lehmann, M. & Heumann, A. (2023). Sentiment Analysis of German children's and young adult fiction. Can dictionary-based approaches keep up with Transformer-based models? Talk at 19th IGEL conference, Monopoli, September 2023 (IGEL website).
  • Rebora, S., Lehmann, M., Heumann, A., Ding, W. & Lauer, G. (2023). Comparing ChatGPT to Human Raters and Sentiment Analysis Tools for German Children's Literature. Proceedings of the Computational Humanities Research Conference, Paris, December 2023 (CEUR Vol-3558).

Data Publications

  • Herrmann, B. & Lüdtke, J. (2023). childTale-A: A corpus of eighty fairy tales from the 7th edition by the Brothers Grimm, manually annotated for textually encoded emotions. DOI: https://doi.org/10.5281/zenodo.7737329.
  • Lehmann, M., Heumann, A., Kuijpers, M. M., Lauer, G. & Lüdtke, J. (2023). childPoeDE: A corpus of German Children's Poems for Computational and Experimental Studies - Metadata. DOI: https://doi.org/10.5281/zenodo.7936860.

Contact

Address - Mainz

JGU Mainz
Department of Book Studies
Jakob-Welder-Weg 18
55128 Mainz

Address - Berlin

FU Berlin
Department of Experimental and Cognitive Neuropsychology
Habelschwerdter Allee 45
14195 Berlin

Email

For email addresses of individual team members, refer to CHYLSA Team.