Abstract - Combining Digital Fragments of Medieval Manuscripts for Creating Scribal Profiles

Bristol Conference, 16 September 2021

Niccolò N. Cappelletto, Estelle Guéville and David J. Wrisley

Keywords: Handwritten Text Recognition (HTR); medieval Latin Bibles; Paris Bibles; pasts of digitization; digital humanities; fragmentology


We do not typically think of past eras of digitization as agents of manuscript fragmentation. Nor do we think of digitizers as physically breaking manuscripts apart and dispersing them across the world; rather, they are creators of “digital fragments” (Burrows, 2017).

Yet database initiatives have indeed collected digital images of manuscripts in various states of fragmentation. These include “Mandragore”, the database of illuminations of the Bibliothèque nationale de France; the French Virtual Library of Medieval Manuscripts, “BVMM”; the manuscript collection “Digital Scriptorium” in the United States; the online database of the “Index of Medieval Art” at Princeton University; and the Dutch digital repository of manuscripts “Middeleeuwse Verluchte Handschriften”.

This phenomenon of digital fragmentation most likely stems from pragmatic, infrastructural constraints in the early stages of digitization, an era which has been called the “digital past” (Salmi, 2021). In this period, spanning the late twentieth and early twenty-first centuries, specific pages of manuscripts were deemed interesting enough, on account of illumination, color or other outstanding features, to justify the costs and labor of imaging (Mouren, 2012). To be fair, many libraries around the world have also digitized full manuscripts and made them available to wider audiences, but at a global scale many inequalities remain.

The Paris Bible Project uses these fragmentary pasts of digitization to its advantage. Seeking to understand the production and diffusion of medieval Latin Bibles in Europe, we have turned to handwritten text recognition (HTR) technologies that use machine learning for automatic transcription. “Paris” Bibles are known for their supposed uniformity, but in our research we have identified numerous ways in which such documents are full of variance (word order, interpolations, orthography, abbreviations, etc.) (Guéville and Wrisley, 2020). The end goal of our project is to use this evidence from digital manuscript fragments to identify patterns and so create profiles for scribes from particular places and times, imputing provenance on the basis of evidence from transcription.
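To make the idea of a scribal profile concrete, the following is a minimal sketch of one way such a profile could be computed from a transcription; it is not the project's actual pipeline. It assumes a profile is simply the relative frequency of a handful of competing spellings, and it compares two profiles by cosine similarity. The feature list, the function names and the sample strings are all hypothetical and invented for illustration.

```python
import math
from collections import Counter

# Hypothetical feature set: a few competing spellings of common Latin words.
# These are illustrative choices only, not the features used by the project.
FEATURES = ["michi", "mihi", "nichil", "nihil", "et", "&"]


def scribal_profile(transcription, features=FEATURES):
    """Return the relative frequency of each feature in a transcription.

    Tokenization is a plain whitespace split, so punctuation attached to a
    word is not handled; a real workflow would need proper normalization.
    """
    tokens = transcription.lower().split()
    counts = Counter(token for token in tokens if token in features)
    total = sum(counts.values())
    if total == 0:
        # No feature occurs at all: return an all-zero profile.
        return {feature: 0.0 for feature in features}
    return {feature: counts[feature] / total for feature in features}


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity of two profiles built over the same features."""
    dot = sum(profile_a[f] * profile_b[f] for f in profile_a)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy strings, NOT real transcriptions of any manuscript.
fragment_a = "dixit michi et nichil respondit"
fragment_b = "dixit mihi & nihil respondit"

similarity = cosine_similarity(
    scribal_profile(fragment_a), scribal_profile(fragment_b)
)
print(similarity)
```

Under the same assumptions, several fragments could be pooled into one composite profile by joining their transcriptions with spaces before calling `scribal_profile`. Whether relative spelling frequencies are a good proxy for a scribe's habits is an open question that this sketch does not settle.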

Gaining access to the thousands of “Paris” Bibles in global collections is unlikely in the near future, but we are using dozens of full- or partial-page fragments of New and Old Testament books from this tradition to create “fake,” composite manuscripts for fine-tuning our methodology. In turn, our project has an impact on the understanding both of physical fragments of the Paris Bible and of codices where localization and metadata are lacking. It also has the potential to contribute to a broader computational codicology that supports the larger project of fragmentology (Davis, 2018; Flüeler and Duba, 2018) for the lost book.

For any inquiries, suggestions or requests, the team can be reached at parisbible@gmail.com.

Niccolò N. Cappelletto, Estelle Guéville and David Joseph Wrisley

Further readings and presentations:

You can also consult the bibliography of the project on Zotero.

Suggested citation

Cappelletto, Niccolò Acram, Guéville, Estelle, and Wrisley, David Joseph. (16 September 2021). Abstract - Combining Digital Fragments of Medieval Manuscripts for Creating Scribal Profiles. Paris Bible Project. https://doi.org/10.5281/zenodo.8040632

This post is published under a CC BY-NC-SA 4.0 International license.