Machine Learning for Digital Scholarly Editions

This course explores the application of machine learning methods, including natural language processing, computer vision, and OCR, in the context of digital scholarly editions.

Instructors: Martina Scholger, Sarah Lang

Course Overview

The course introduces students to machine learning methods relevant to the creation and analysis of digital scholarly editions. It covers both classical approaches, such as regression, decision trees, and support vector machines, and contemporary deep learning methods based on neural networks and transformer architectures.

Particular attention is given to applications in natural language processing, computer vision, OCR, and handwritten text recognition. Through practical work with humanities data, students learn how machine learning techniques can support research in fields such as history, literary studies, art history, and cultural heritage. The course also addresses the opportunities and limitations of AI in humanities research, including questions of bias, transparency, and responsible use.

This course was taught at the University of Graz in Summer Semester 2024 and Winter Semester 2024/2025 together with Martina Scholger.