PBP Correct-a-thon Besançon 2023 #2: the case of BM Besançon Ms 5
Fig. 1. Bibliothèque Municipale de Besançon, ms 5
Introduction
This post was written in the context of the Paris Bible Correct-a-thon which took place in Besançon, France in January 2023.
In this post, we describe how we revised the automatically generated transcriptions of two text columns in BM Besançon Ms 5. These transcriptions were produced with the Paris Bible Project model for Transkribus (PBP 2.1).
Fig. 2. The front cover and spine of BM Besançon, ms 5, Photo by David Joseph Wrisley
Physical description of the BM Besançon Ms 5
The pages we transcribed came from a Bible held by the Besançon Municipal Library, described in the catalog as Ms 5 - Biblia sacra, ex translatine S. Hieronymi, cum epistola ad Paulinum, prologhi et capitulis. It is a rather large volume measuring 369x260mm and weighing close to 4 kilograms. The cover and the later binding are in leather on wooden boards sealed with bolts (some of which are missing on the upper board), and are in a rather degraded condition. The manuscript’s vellum, uneven in its original density, is somewhat warped, giving a wavy appearance to the book’s spine. These deformations are very likely what distorted the text lines over time.
We started transcribing at fol. 6v, at the beginning of Genesis. This page opens with a rubric, an inscription in red characters at the beginning of the text. It is followed by a long illumination made for the lettrine (decorated initial) of the first sentence, In principio creauit. The decoration is a repetitive floral pattern, apparently rinceaux, combined with two additional branches, one at the top of the text and one at the bottom, both starting with griffins, half lion and half bird. Mémoire Vive Besançon.
The revision of the transcriptions
During the correct-a-thon, we compared the transcription produced by the model against an edition of the Vulgate of Genesis available online. On the one hand, we noticed that the model identified most of the words correctly, and that the most abundant abbreviations were ꝺ́s (deus), ꝙ (quod), eſt (est), ꝰ (us), fcm̄ (factum), eēt (esset), t́rā (terram), qꝫ (que):
Fig. 3. Word cloud of the most common abbreviations in the transcription of Genesis from BM Besançon 5. Visualized in Voyant.
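A frequency list like the one behind the word cloud can also be approximated with a few lines of Python. The following is only a minimal sketch, not the procedure we used (the figure was made in Voyant). It assumes that a token counts as abbreviated when it carries a combining mark or one of a small, hand-picked set of abbreviation signs; the set `ABBREVIATION_SIGNS` and the function names are our own illustrative choices, and the set is not exhaustive.

```python
import unicodedata
from collections import Counter

# Assumed, non-exhaustive set of standalone abbreviation signs.
ABBREVIATION_SIGNS = {
    "\ua759",  # ꝙ q with diagonal stroke (quod)
    "\ua76b",  # ꝫ et sign, as in qꝫ (que)
    "\ua770",  # ꝰ modifier letter us
    "\ua751",  # ꝑ p with stroke through descender
    "\u204a",  # ⁊ Tironian et
    "\u1e9c",  # ẜ long s with diagonal stroke
    "\u02dc",  # ˜ small tilde
}

def is_abbreviated(token):
    """True if the token has a combining mark or a listed abbreviation sign."""
    decomposed = unicodedata.normalize("NFD", token)
    return any(
        ch in ABBREVIATION_SIGNS or unicodedata.combining(ch)
        for ch in decomposed
    )

def count_abbreviations(text):
    """Count abbreviated tokens, ignoring punctuation at token edges."""
    tokens = (t.strip(".,;:") for t in text.split())
    return Counter(
        unicodedata.normalize("NFC", t)
        for t in tokens
        if t and is_abbreviated(t)
    )

sample = "Dixitq\ua76b \ua77a\u0301s. Fiat lux. Et uidit \ua77a\u0301s lucem"
for token, count in count_abbreviations(sample).most_common():
    print(count, token)
```

Special letter forms such as ꝺ, ſ and ꝛ are deliberately not treated as abbreviations here, since they only change the shape of a letter.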
On the other hand, the model often misread clusters containing an “r” as an “n”: the word “ariꝺā” was transcribed as “inda”, “mrēm” as “inrēm”, and “tigris” as “tigns”. Additionally, we observed “i” and “u” swapping places, as in “aꝺuitoꝛiuꝫ” appearing as “aꝺiutoꝛiuꝫ” in the transcription.
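Recurring confusions of this kind can be tallied automatically once corrected and uncorrected versions of the same words are available. The sketch below is an illustration under our own assumptions, not part of the correct-a-thon workflow: it uses `difflib.SequenceMatcher` from the Python standard library to align each corrected word with the model output and counts the character spans that differ. The function name `confusions` and the pair list are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def confusions(pairs):
    """Tally (corrected span, model span) differences over word pairs."""
    tally = Counter()
    for truth, predicted in pairs:
        matcher = SequenceMatcher(None, truth, predicted, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                # Spans may be empty on one side (pure insertion or deletion).
                tally[(truth[i1:i2], predicted[j1:j2])] += 1
    return tally

# One pair taken from the examples above: corrected form, then model output.
pairs = [("tigris", "tigns")]
for (expected, got), count in confusions(pairs).most_common():
    print(f"{expected!r} read as {got!r}: {count}")
```

Run over a whole page, such a tally would show whether the "r" clusters are really the dominant source of error or only the most visible one.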
Among the difficulties related to abbreviations, two words required a symbol that was absent from the Transkribus virtual keyboard but was found in the extended version provided by the Expert Client installation: ẜm (secundum) and ẜpens (serpens), both written with ẜ, the long s with an extra line in the middle. ẜ is not a very frequent abbreviation, but it seemed significant to us because it appears in the word referring to the representation of evil.
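When a symbol is missing from a virtual keyboard, knowing its Unicode code point makes it easier to find in a character map or to paste from elsewhere. As a small sketch, assuming the character in our transcription is U+1E9C, Python's standard `unicodedata` module can list the code point and official name of each character in a word. The helper name `describe` is our own.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for ch in text
    ]

# The abbreviation for "secundum": long s with stroke, followed by m.
for code_point, name in describe("\u1e9cm"):
    print(code_point, name)
```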
Regarding the generation of text regions, our team encountered a major problem: the text region for the second column of every page overlapped the first, so the software was unable to process them. For that reason we focused our efforts on revising the first columns of pages 9 and 10, which gave us the amount of text required for the correct-a-thon. This inconvenience also draws attention to the importance of double-checking each stage in this kind of workflow.
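An overlap like this can be caught before transcription by checking the region outlines. The sketch below assumes that each region outline is available as a string of `x,y` pairs separated by spaces, as in the `points` attribute of a PAGE XML export; the coordinates are invented for illustration and do not come from our pages. It reduces each outline to a bounding box, which is a simplification for regions that are not rectangular.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for an 'x,y x,y ...' string."""
    coords = [tuple(int(v) for v in pair.split(",")) for pair in points.split()]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

def overlap_area(box_a, box_b):
    """Area shared by two bounding boxes, 0 if they do not intersect."""
    width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    return max(width, 0) * max(height, 0)

# Hypothetical outlines of two columns whose regions overlap slightly.
column_1 = "100,100 900,100 900,2000 100,2000"
column_2 = "850,100 1700,100 1700,2000 850,2000"

shared = overlap_area(bounding_box(column_1), bounding_box(column_2))
if shared > 0:
    print(f"Regions overlap by {shared} square pixels; fix the layout first.")
```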
Fig. 4. BM Besançon 5 visualized in the Transkribus Expert Client with the pop up virtual keyboard.
Learning to use the Expert Client version of Transkribus also expanded our range of action: we could edit text regions, extend the above-mentioned virtual keyboard, and use the export button to create txt, pdf, or TEI XML files, a function that increases interoperability.
Analytical Observations
Comparing the text of the manuscript with that of the editions of the Vulgate reveals the significant role that abbreviations and special letter forms play in the manuscript text. Many of the medieval Latin abbreviations can be found in our manuscript (Ms. Besançon 5).
Fig. 5. BM Besançon 5, Genesis first column, fol. 6v.
Transcription | Edited Vulgate Bible |
---|---|
Expliciunt capitula libri geneſis Incipit liber geneſis. N pꝛincipio creauit ꝺeus celū et t́rā. T́ra aut̄ erat inanis ⁊ ua- cua. et tenebꝛe erant ſuꝑ faciē ābyſſi; et ſic̄ ꝺei fėbat˜ ſuꝑ aq˜s. Dixitqꝫ ꝺ́s. Fiat lux. Et ftā eſt lux. Et uiꝺit ꝺ́s lucem ꝙ eēt bōa; et ꝺiuiſit lucē a tenebꝛis. Appella uitqꝫ lucē ꝺiem; et tenebꝛas noctē. Fcm̄ qꝫ ē ueſꝑe et mane; ꝺies unꝰ. Dix̄ quoqꝫ ꝺ́s. | In principio creavit Deus cælum et terram. 2 Terra autem erat inanis et vacua, et tenebræ erant super faciem abyssi: et spiritus Dei ferebatur super aquas. 3 Dixitque Deus: Fiat lux. Et facta est lux. 4 Et vidit Deus lucem quod esset bona: et divisit lucem a tenebris. 5 Appellavitque lucem Diem, et tenebras Noctem: factumque est vespere et mane, dies unus. 6 Dixit quoque Deus. |
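
The relationship between the two columns of the table can be illustrated with a small lookup-based expansion. This is a sketch under our own assumptions, not a tool used in the project: it covers only the abbreviations listed earlier in this post, treats ꝺ, ſ and ꝛ as plain d, s and r, and uses hypothetical names (`LETTER_FORMS`, `WHOLE_WORDS`, `SUFFIXES`, `expand`). A real expansion would need a much larger table and context-sensitive rules.

```python
import unicodedata

# Special letter forms mapped to their modern equivalents.
LETTER_FORMS = {"\ua77a": "d", "\u017f": "s", "\ua75b": "r"}

# Whole-word abbreviations, written in decomposed (NFD) form.
WHOLE_WORDS = {
    "d\u0301s": "deus",
    "\ua759": "quod",
    "fcm\u0304": "factum",
    "ee\u0304t": "esset",
    "t\u0301ra\u0304": "terram",
}

# Abbreviations that only replace the end of a word.
SUFFIXES = {"q\ua76b": "que", "\ua770": "us"}

def expand(token):
    """Expand one token; unknown tokens are returned with letters normalized."""
    text = unicodedata.normalize("NFD", token)
    for old, new in LETTER_FORMS.items():
        text = text.replace(old, new)
    core = text.rstrip(".,;:")
    tail = text[len(core):]  # trailing punctuation, re-attached below
    if core in WHOLE_WORDS:
        return WHOLE_WORDS[core] + tail
    for abbreviation, full in SUFFIXES.items():
        if core.endswith(abbreviation):
            return core[: -len(abbreviation)] + full + tail
    return core + tail

def expand_line(line):
    """Expand every whitespace-separated token in a line."""
    return " ".join(expand(token) for token in line.split())

print(expand_line("Dixitq\ua76b \ua77a\u0301s. Fiat lux."))
```

Tokens the table does not know, such as words with a macron standing for a missing m or n, are left unexpanded rather than guessed.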
Conclusions
Being part of this project to develop an HTR model for abbreviated medieval Latin has made us realize how difficult it is to create and improve a model. This international collaborative project gave us hands-on practice with Transkribus. It also helped us understand how scribes worked within the manuscript tradition, and how the preservation and diffusion of knowledge have been important preoccupations across the centuries.
Finally, working with AI-related software was an opportunity to start learning and exploring a completely new range of possibilities for the forms of textual analysis we might encounter in the future.
Team
Kateri Soulard, Sonaj Kailas, David Macchi.
Suggested Citation
Soulard, Kateri; Kailas, Sonaj; Macchi, David. (19 May 2023). PBP Correct-a-thon Besançon: Besançon Ms 5. Paris Bible Project. https://doi.org/10.5281/zenodo.8040632
This post is published with a CC BY-NC-SA 4.0 International license.