Introduction

The ERC-funded ERASMOS project at KU Leuven, coordinated by Prof. Raf Van Rooy, investigates bilingualism in Renaissance texts of the "long 15th century" (1397 - 1536). The project presents itself with a workshop and invites scholars to showcase the state of the art in computationally oriented research on language- and code-mixing in documents ranging from Antiquity to Early Modern times!

The workshop took place on November 27th (14:00 - 18:00) and 28th (9:30 - 13:00) at KU Leuven in hybrid format. It was associated with the Days of computational approaches to Ancient Greek and Latin, with which it shared a joint social dinner on Thursday, November 28th (non-presenters could take part for a fee).

Slides

The slides of the collective ERASMOS+ team presentation are available on Zenodo.

Program of the workshop

Day 1 - Wednesday, November 27th

Room: MSI1 00.20 (and online)

  • 14:00 - 15:10: Introduction

    Collective presentation of ERASMOS featuring BREPOLS

    1. Introduction to the workshop and the ERASMOS+ project / Raf Van Rooy (14:00 - 14:15)
    2. The Erasmian part / Manou Vermeire & Mariia Timoshchuk (14:15 - 14:20)
    3. The philological part / Alessandro Bonvini & Isabelle Maes (14:20 - 14:25)
    4. The computational part / Flavio Massimiliano Cecchini (14:25 - 14:35)
    5. BREPOLS, ERASMOS and Erasmus / Yannick Anné & Tim Denecker (14:35 - 14:55)
    6. Time for discussion (14:55 - 15:10)

  • 15:10 - 15:40: Coffee break

  • 15:40 - 16:50: First session > Questions of style / Chair: Isabelle Maes

    1. Pieter Beullens & Benjamin Nagy with "Computational Stylometry and the Latin Sextus Empiricus"

    2. Tim Van De Cruys with "Multilingual NLP"

  • 16:50 - 18:00: Second session > Tokens and words / Chair: Flavio Massimiliano Cecchini

    1. Wouter Mercelis with "Word-level language identification"

    2. Federica Gamba with "Integrating Greek in Renaissance Latin: A Universal Dependency Approach to Sabellicus' De Latinae Linguae Reparatione" (remote)

  • From 18:00 onwards: social drink at Carlisse 🥂

Day 2 - Thursday, November 28th

Room: MSI1 02.28 (and online)

  • 9:30 - 10:40: First session > Renaissance code-switching / Chair: Raf Van Rooy

    1. Martin Volk with "Code-Switching: Profiling and Classifying the Language Mix in 16th-Century Letters"

    2. Stefan Weise & Jennifer Bunselmeier with "Rhodomanologia - Measuring the Work of a Multilingual Renaissance Poet between Philology and Digital Humanities"

  • 10:40 - 11:10: Coffee break

  • 11:10 - 12:20: Second session > Putting resources together / Chair: Alessandro Bonvini

    1. Francesco Mambrini with "LiLa – Linking Latin: Semantic Web technologies for Latin linguistic resources"

    2. Margherita Fantoli & Alek Keersmaekers with "Translation alignment for mathematical writings"

  • 12:20 - 13:00: Final discussion, remarks and farewell

    End of the workshop

  • In the evening: joint social dinner at Gloria 🍽️

Abstracts

  • Yannick Anné & Tim Denecker

    BREPOLS (Belgium)

    BREPOLS, ERASMOS and Erasmus

    The Centre 'Traditio Litterarum Occidentalium' (CTLO) is Brepols' in-house Digital Classics lab, developing and producing the full-text and dictionary databases available on www.brepolis.net. Recently, the CTLO has partnered with the ERASMOS project to digitize the works of Erasmus. During this presentation, we will detail how this collaboration has been conceived, and how the data exchanged between Brepols and the ERASMOS project will be integrated into the broader context of Latin literature through the Library of Latin Texts (LLT). Additionally, we will demonstrate how the digital version of Erasmus’ oeuvre will be enriched by the forthcoming upgrade of the LLT’s (meta)data structures and by the new connections with the Brepolis Latin Lemmas, as featured in the Database of Latin Dictionaries.

  • Pieter Beullens & Benjamin Nagy

    KU Leuven (Belgium) & Universiteit Antwerpen (Belgium)

    Computational Stylometry and the Latin Sextus Empiricus

    The study of translators of Greek philosophical and technical texts from the 12th and 13th centuries has a long tradition of using stylometric features for the identification of individual translators. Through a meticulous comparison of the use of function words and other semantic and grammatical elements in the Latin versions with their Greek counterparts, scholars like Minio-Paluello and Bossier have been able to determine the individual stylistic ‘fingerprint’ of different translators. The identification of these individuals provided valuable insight into the context in which the process of cultural transfer took place.

    In our presentation, we aim to evaluate the established attributions through the application of computational stylometric techniques (i.e., the quantitative study of writing style), ultimately comparing two anonymously transmitted Latin translations of the works of Sextus Empiricus with translations by known translators. We want to investigate whether these anonymous texts were produced by one or two translators, and whether the translations can be assigned to one of the known translators.

    Our initial methodology involved the establishment of a reference corpus of texts, comprising works whose translators are either explicitly identified in the manuscripts preserving them or for whom the attribution is widely accepted. The selected texts were translated by six prominent scholars from the 12th century (Burgundio of Pisa, James of Venice, Henricus Aristippus, and an unnamed translator) and the 13th century (William of Moerbeke and Bartholomew of Messina). The treatises encompass a wide range of topics, including philosophy, medicine, astrology, and theology, allowing us to assess the potential impact of genre characteristics on the translations. Moreover, the corpus includes instances of Latin versions of the same Greek texts made by different translators, offering an opportunity to precisely evaluate the salient features that display variation, even when the source text is the same. Such a comparison allows for a clearer differentiation between the translator’s personal style and the characteristics inherent to the source text.

    Our study introduces two specific stylometric methods for analysing translation style. The first method uses a pre-determined list of function words that have been found to be indicative of Greek-to-Latin translation style in prior philological research. We conduct a statistically informed study to determine to what extent individual translators can be differentiated from each other using only this list of words. The second method, using a mathematical approach from game theory (‘Shapley Values’), is employed to identify the most distinctive features for each author. This information can be used to gain interpretable insight into the characteristics that define the translation style of individual authors.
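
    As a rough illustration of the first method, the sketch below computes a function-word profile for a text and a simple distance between two profiles. The marker list, the function names and the choice of Manhattan distance are assumptions made for this example only; they are not the word list or the statistics used in the study.

```python
import re
from collections import Counter

# Hypothetical marker list, chosen for illustration only.
FUNCTION_WORDS = ["autem", "enim", "quidem", "igitur", "vero"]

def profile(text, markers=FUNCTION_WORDS):
    """Return the frequency of each marker word per 1,000 tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [1000 * counts[w] / total for w in markers]

def manhattan(p, q):
    """Sum of absolute differences between two profiles of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

    Two translations with similar profiles yield a small distance; whether such a distance actually separates translators is the kind of question the statistical study addresses.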

  • Tim Van De Cruys

    KU Leuven (Belgium)

    Multilingual NLP

    An overview of the basics and the challenges of large language models applied to multilingual Natural Language Processing.

  • Wouter Mercelis

    KU Leuven (Belgium)

    Word-level language identification

    This talk introduces an experimental approach to word-level language identification, a necessary task for the automatic detection of code-switching, as code-switching is not limited to sentence boundaries. The experiment investigates whether it is possible to train a word-level language identification classifier in a very low-resource setting, since annotated data, especially at the word level, are scarce. Furthermore, the talk will address whether traditional sub-word classifiers suffice in the low-resource setting, rather than relying on character-based models. For the pretrained language model at the base of these tasks, a trilingual English-Latin-Ancient Greek model is used. This enables experiments not only regarding the detection of modern and classical languages, but also experiments with languages in different alphabets (e.g. Ancient Greek in the Greek alphabet and in the Latin alphabet). The goal is to further improve automatic code-switching detection in a humanist context by fine-tuning it to the word level, rather than the sentence level.
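
    To make the task concrete, here is a minimal rule-based baseline that labels each word by the Unicode script of its letters. It is not the classifier presented in the talk, and the function names are hypothetical; it only shows what word-level output looks like and why a trained model is needed.

```python
import unicodedata

def script_of(word):
    """Label a word 'greek', 'latin' or 'other' by the script of its letters."""
    scripts = set()
    for ch in word:
        if ch.isalpha():  # skips digits, punctuation and combining marks
            name = unicodedata.name(ch, "")
            if name.startswith("GREEK"):
                scripts.add("greek")
            elif name.startswith("LATIN"):
                scripts.add("latin")
            else:
                scripts.add("other")
    # Mixed-script or letterless words fall back to 'other'.
    return scripts.pop() if len(scripts) == 1 else "other"

def tag_words(sentence):
    """Split on whitespace and pair each word with its script label."""
    return [(w, script_of(w)) for w in sentence.split()]
```

    Such a baseline labels a transliterated Greek word like "logos" as Latin, and it cannot tell English from Latin at all, which are precisely the cases the abstract mentions.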

  • Federica Gamba

    ÚFAL - Karlova Univerzita (Czech Republic)

    Integrating Greek in Renaissance Latin: A Universal Dependency Approach to Sabellicus' De Latinae Linguae Reparatione

    This talk explores the annotation of the 15th-century treatise De Latinae Linguae Reparatione by Sabellicus using the Universal Dependencies (UD) formalism. The text provides a glimpse into the Latin-Greek bilingualism typical of the Renaissance, prompting a discussion on how Greek words embedded in a predominantly Latin text should be annotated in the treebank. We will discuss the creation of the treebank, focusing on how to preserve the syntactic integrity of the Latin text while smoothly integrating the Greek elements, and considering how similar challenges have been addressed in other Latin treebanks. Sabellicus’s work thus highlights the complexities of annotating Renaissance multilingual texts, where even his relatively modest use of Greek within Latin reflects the broader cultural and linguistic exchanges of the period.

  • Martin Volk

    Universität Zürich (Switzerland)

    Code-Switching: Profiling and Classifying the Language Mix in 16th-Century Letters

    We describe our efforts to profile and classify code-switching between Early New High German and Latin in a large corpus of 16th-century letters. The Early Modern period was a time of thriving multilingualism, and code-switching was widespread in written and spoken language. To understand the diverse use of code-switching in the letter collection of the Swiss reformer Heinrich Bullinger (1504-1575), we employ a combination of computational tools and manual analysis. We present our method for automated language identification and code-switching profiling. We focus on a group of correspondents with a high degree of code-switching to analyse individual author profiles and distribution patterns.

  • Stefan Weise & Jennifer Bunselmeier

    Bergische Universität Wuppertal (Germany)

    Rhodomanologia - Measuring the Work of a Multilingual Renaissance Poet between Philology and Digital Humanities

    Laurentius Rhodoman (1545–1606) was a German humanist poet, teacher and Hellenist. He left behind an important oeuvre of Greek and Latin poetry as well as some minor poems in German. The first half of his poetic output, written before 1589, is now being explored in the research project “Rhodomanologia”, which aims at a complete digital edition. One of the advantages of the digital edition is its high potential for statistical analysis and for linking with other data. In this paper we present some of the current possibilities and future perspectives of this edition as well as their use for philological interpretation. With regard to the workshop’s focus on code-switching, we will especially concentrate on Rhodoman’s bilingual poetry.

  • Francesco Mambrini

    Università Cattolica del Sacro Cuore di Milano (Italy)

    LiLa – Linking Latin: Semantic Web technologies for Latin linguistic resources

    In this presentation, we provide a brief overview of the aims, methods, and achievements of LiLa – Linking Latin. LiLa aims to create a robust and interoperable platform for Latin linguistic resources using the principles of Linked Open Data (LOD). Its core mission is to bridge the gap between the different linguistic and philological resources available for Latin, enabling interoperability between datasets and improving their accessibility. The central feature of the LiLa Knowledge Base is its “Lemma Bank,” a collection of over 200,000 Latin lemmas. This lexically focused database forms the backbone of LiLa, ensuring that different linguistic resources, such as corpora and dictionaries, are interlinked through shared canonical forms. We consider the NLP task of lemmatization as the core practice that ensures the interconnection between different types of resources (corpora, lexicons, and software). Currently, the lemmas are used to link 14 lexicons with more than 166,000 lexical entries and over 500 texts, totaling 1,048,576 tokens. All the entries in these resources are described according to the most widespread standards of Linguistic Linked Open Data. Finally, LiLa provides services that allow users to access this wealth of interconnected information and create new interoperable annotations on texts.

  • Margherita Fantoli & Alek Keersmaekers

    KU Leuven (Belgium)

    Translation alignment for mathematical writings

    The writings of Archimedes have been a fundamental source for scientists and mathematicians across centuries. While the Greek versions had limited circulation in the western world, translations played a crucial role in disseminating these mathematical discoveries. Notably, Jacopo da San Cassiano’s translations around 1450, likely commissioned by Pope Nicholas V, played a key role in the rediscovery of Archimedes’ works among Humanists and were instrumental in the editio princeps of the Greek texts in 1544.

    In this talk, we will use NLP tools to align and compare the Greek original texts with the Medieval and 15th-century translations by William of Moerbeke and Jacopo da San Cassiano. Our goal is to evaluate whether automatic alignment can aid in the study of (Neo-)Latin translations of Ancient Greek texts and to compare the translation choices of different authors. Additionally, we will analyze the texts at the syntactic level using a small set of syntactically annotated sentences.

    This study addresses the unique challenges posed by the adaptation of Latin to translate Greek mathematical texts, which are written entirely in natural language rather than symbolic notation.


Co-funded by the European Union (ERC, ERASMOS, 101116087). Views and opinions expressed are, however, those of the authors and presenters only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.