Critical and Data-Centric AI

I argue that a Critical Computational Humanities needs to adopt Data-Centric AI (DCAI) as a paradigm. It also needs stronger standards for good scholarly practice in the age of AI.

Recently, my research has expanded to include critical AI. This includes articulating critical concerns regarding the use of large language models in digital and computational humanities (Lang, 2026), as well as addressing issues such as carbon reporting and the environmental impact of computational research. I have also explored the lure of plausibility in using LLMs for OCR in non-Western languages such as Arabic (Lang et al., 2026). Through this work, I am developing what I would like to call a critical computational humanities that articulates principles for good scholarly practice in contexts that use digital, computational, and AI methods. I aim to articulate a disciplinary ethics beyond compliance, providing a framework for critically engaging with the epistemic and ethical challenges posed by data-driven research.

References

2026

  1. Critical Concerns for Using LLMs in the (Computational) Humanities and Beyond
    Sarah Lang
    In Understanding Science with Large Language Models?, 2026
  2. Confabulated Transliterations? Managing the Lure of Plausibility in LLM-Detected Arabic Terms in an Early Modern Lexicon
    Sarah Lang, Jonas Müller-Laackmann, Hazem Lashen, et al.
    In Critical Approaches to Automated Text Recognition, 2026
    Forthcoming