Alchemical dictionaries and early modern lexicography

This project investigates the knowledge representation practices in early modern dictionaries through multiple computational approaches.

The project builds on a dataset of alchemical dictionaries published as a data paper (Lang, 2025) and proceeds through a series of case studies. One case study examines how Arabic knowledge and Arabic terminology are represented in early modern lexicography and related knowledge-organisation resources (Lang, Mahootian & Lashen, 2026). It seeks to contribute to a more global historiography of alchemy by exploring the inclusion of non-Western forms of knowledge in early modern systems of knowledge organisation. Another study uses word embeddings to trace the potential origins of terms that Ruland lists as headwords in his dictionary (Kaše & Lang, 2026). Drawing on a representative corpus of books from the preceding century that may have served as sources, the study aims to generate new insights into the compilation of the dictionary, a process about which very little is known. A third article critically reflects on the lure of plausibility when using large language models to investigate terms in a non-Western language, namely Arabic (Lang et al., 2026). It examines both the opportunities and limitations of these methods and highlights the risks of relying on technologies that are not equally robust across linguistic traditions, especially for non-Western languages.

References

2026

  1. Mediating Alchemical Language across Terminologies and Cultures in Ruland’s Lexicon Alchemiae: A Data-Driven Study of Arabic Terms
    Sarah Lang, Farzad Mahootian, and Hazem Lashen
    Ambix, 2026
    Forthcoming
  2. Contextual Word Embeddings for Paracelsian Lexicography: Tangled Terminologies and Their Origins in Ruland’s Alchemical Dictionary
    Vojtěch Kaše and Sarah Lang
    Ambix, 2026
    Forthcoming
  3. Confabulated Transliterations? Managing the Lure of Plausibility in LLM-Detected Arabic Terms in an Early Modern Lexicon
    Sarah Lang, Jonas Müller-Laackmann, Hazem Lashen, et al.
    In Critical Approaches to Automated Text Recognition, 2026
    Forthcoming

2025

  1. Towards a Data-Driven History of Lexicography: Two Alchemical Dictionaries in TEI-XML
    Sarah Lang
    Journal of Open Humanities Data, 2025