Introduction to quantitative text analysis
This course provides an introduction to quantitative text analysis in the Digital Humanities, covering corpus building, text mining, stylometry, topic modelling, and machine learning approaches.
Instructor: Sarah Lang
Course Overview
The course introduces methods of quantitative text analysis and their application within the Digital Humanities. Students learn how to formulate research questions that can be investigated computationally, construct and prepare textual corpora, and interpret quantitative results in a humanities context.
Using literary texts as case studies, the course explores concepts such as corpus building, representativeness, collocations, keyword analysis, stylometry, topic modelling, and text visualisation. Particular emphasis is placed on the relationship between distant reading and close reading, highlighting how computational approaches can complement traditional literary analysis.
This course was taught at the University of Passau in the summer semester of 2019 as Atmosphäre der Angst (Gothic Horror), in English “Atmosphere of Fear”, and again in the summer semester of 2022 as Schreibmaschinen: Anwendungen von Deep Learning und Text Mining verstehen (“Writing Machines: Understanding Applications of Deep Learning and Text Mining”). The 2019 iteration focused on Gothic horror literature and asked whether the “atmosphere of fear” can be analysed quantitatively. The 2022 iteration added recent developments in text mining, machine learning, and deep learning, while keeping a practical focus on established text analysis tools that require no prior programming experience.