Collation Formulas

Books in codex format—i.e. collections of sheets of any material, folded double and fastened together at the spine, and usually protected by covers—are formed of pages connected at the spine and not of floating pages in sequences, as portrayed in most digital projects. At the core of the technology of the codex lies the gathering (or quire): a group of folded or single leaves which can be used either singly or with other gatherings to create a textblock.

Gatherings sewn together to form a codex (spine view of UPenn LJS 236)

The gathering is in fact the ultimate working unit of the codex. The gathering structure of books in codex format has been recognized as an important aspect to be recorded and described in catalogues of both manuscripts and early printed books, and this is traditionally accomplished through collation formulas.

Collation formulas describe the sequence of bifolia (and singletons) within book gatherings. All formulas contain the same basic information, but this may be presented in a variety of ways, and their decoding in relation to the physical appearance of the object that they describe can prove challenging.
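The relationship between leaves and bifolia can be made concrete with a small sketch. It assumes a regular gathering, that is, one made only of nested bifolia with no singletons, in which leaf i is joined across the fold to leaf n + 1 - i. The function name `conjugate_pairs` is illustrative and does not come from any existing tool.

```python
def conjugate_pairs(leaves: int) -> list[tuple[int, int]]:
    """Return the conjugate leaves (bifolia) of a regular gathering.

    Assumes every leaf belongs to a bifolium nested inside the previous
    one, so the leaf count must be a positive even number.
    """
    if leaves <= 0 or leaves % 2 != 0:
        raise ValueError("a regular gathering has a positive, even number of leaves")
    # Outermost bifolium first: leaf 1 with leaf n, leaf 2 with leaf n-1, ...
    return [(i, leaves + 1 - i) for i in range(1, leaves // 2 + 1)]


# A quaternion (four bifolia, eight leaves).
print(conjugate_pairs(8))
```

Gatherings with added or missing leaves break this symmetry, which is the situation that collation formulas have to record explicitly.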

The following examples show different styles of collation formulas:

[1]   i, 1-9 (8), 10 (6), 11-20 (8), 21 (7), i

[2]   I-III⁸, IV¹⁰, V-IX⁸

[3]   IV(32), IV-1(40), 9 IV(120), IV-4

[4]   1-4⁸, 5², 6⁴⁻¹, 7-10¹⁰

[5]   2°: πA⁶(πA1+1, πA5+1.2), A-2B⁶, 2C², a-g⁶, χ2g⁸, h-v⁶, x⁴, “gg3.4”(±“gg3”), ¶-2¶⁶, 3¶¹, 2a-2f⁶, 2g², “Gg⁶”, 2h⁶, 2k-3b⁶

Of these, the first four illustrate different patterns of collation formulas used for manuscripts, whilst the last shows a bibliographical description of the gathering assembly of a printed book.

Formulas describing manuscripts and printed books share the same aim: representing the gathering structure of a book in codex format. There are, however, some fundamental differences between the two schools. In manuscript studies, collation formulas represent book structures exactly as they are, whilst bibliographical formulas represent the ideal copy of the printed book, and not the state of specific exemplars. In addition, manuscript studies, unlike the bibliographical description of printed books, lack a standard for drafting collation formulas that is approved and employed by all scholars. As can be seen in examples [1] to [4] above, some schemas use Roman numerals to signal the sequence of gatherings, whilst others prefer Arabic numerals; some use superscripts, and some show the number of pages in a group. For readers unfamiliar with a specific schema, interpreting a manuscript collation formula can be problematic. Nonetheless, for the most part, both bibliographical collation formulas and the various styles employed in manuscript studies share a set of information units that are necessary to describe the arrangement of the sheets within textblocks.
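To illustrate how those shared information units can be pulled out of one particular schema, here is a minimal sketch that reads formulas written in the style of example [1]. It rests on an assumed reading of that style: a lower-case Roman numeral is a count of flyleaves, and an entry such as "1-9 (8)" means that gatherings 1 to 9 each contain 8 leaves. The names `parse_formula` and `roman_to_int` are hypothetical, and the other styles would each need their own parser.

```python
import re

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}
# "1-9 (8)" or "10 (6)": gathering number or range, then leaves per gathering.
TOKEN = re.compile(r"(\d+)(?:-(\d+))?\s*\((\d+)\)")


def roman_to_int(numeral: str) -> int:
    """Convert a lower-case Roman numeral such as 'ii' or 'iv' to an integer."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        # A smaller value before a larger one is subtracted (iv = 4).
        total += -value if ROMAN.get(nxt, 0) > value else value
    return total


def parse_formula(formula: str) -> tuple[list[int], dict[int, int]]:
    """Return (flyleaf groups, {gathering number: leaf count})."""
    flyleaves: list[int] = []
    leaves_by_gathering: dict[int, int] = {}
    for token in (part.strip() for part in formula.split(",")):
        match = TOKEN.fullmatch(token)
        if match:
            first = int(match.group(1))
            last = int(match.group(2) or first)
            for number in range(first, last + 1):
                leaves_by_gathering[number] = int(match.group(3))
        elif token and set(token) <= set(ROMAN):
            flyleaves.append(roman_to_int(token))
        else:
            raise ValueError(f"unrecognised token: {token!r}")
    return flyleaves, leaves_by_gathering


flyleaves, gatherings = parse_formula("i, 1-9 (8), 10 (6), 11-20 (8), 21 (7), i")
print(len(gatherings), "gatherings,", sum(gatherings.values()), "leaves,", sum(flyleaves), "flyleaves")
```

Under these assumptions the formula in example [1] describes 21 gatherings with 165 leaves in total, plus one flyleaf at each end.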

Visualizing Manuscripts

Digitized medieval manuscripts are typically viewed through single-page or facing-page interfaces, which lack the cues offered by a physical book, such as its size, its thickness, and details of the parchment or paper. Indeed, even facing-page interfaces do not usually show a picture of a book opening at all; rather, they show composites made of two images: one of the left-side page and another of the right-side page. These images would have been taken at different times. Typically all the pages on one side are photographed first (e.g. all the rectos), then those on the other side, and file names or structural metadata are then used to order the files correctly in post-processing.
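The reordering step described above can be sketched as follows. This is only an illustration under stated assumptions: the two input lists stand for images already sorted by folio number, the labels ("1r", "1v", and so on) are invented, and the book is assumed to read left to right, so that a verso sits on the left of an opening and the next recto on the right.

```python
from typing import Optional


def reading_order(rectos: list[str], versos: list[str]) -> list[str]:
    """Interleave recto and verso images captured in two separate passes."""
    if len(rectos) != len(versos):
        raise ValueError("every leaf needs one recto and one verso image")
    ordered: list[str] = []
    for recto, verso in zip(rectos, versos):
        ordered.extend([recto, verso])
    return ordered


def openings(ordered: list[str]) -> list[tuple[Optional[str], Optional[str]]]:
    """Pair images into (left, right) composites for a facing-page view."""
    if not ordered:
        return []
    # The first recto has nothing facing it on the left.
    spreads: list[tuple[Optional[str], Optional[str]]] = [(None, ordered[0])]
    # The verso of leaf n faces the recto of leaf n + 1.
    for i in range(1, len(ordered) - 1, 2):
        spreads.append((ordered[i], ordered[i + 1]))
    # The last verso has nothing facing it on the right.
    spreads.append((ordered[-1], None))
    return spreads


pages = reading_order(["1r", "2r", "3r"], ["1v", "2v", "3v"])
for left, right in openings(pages):
    print(left, "|", right)
```

Each pair produced this way is a composite of two separately captured images, not a photograph of the opening itself, and nothing in the sequence records which leaves are physically joined.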

Screenshot from BiblioPhilly, Free Library of Philadelphia Lewis E 89, illustrating a typical page-turning view in a manuscript viewer

Most digital libraries provide some information on the pages depicted, as well as views other than single-page or facing-page: all provide information on the folio number and the side (recto or verso) shown; some indicate the quire number, and some offer a variety of viewing modes, including pages of thumbnails or thumbnails presented filmstrip-style across the bottom of a page. For the most part, however, the focus of these resources remains on the page rather than on the physical object.

In VisColl, we first model the collation of manuscripts in an XML format and then process that model in various ways, currently providing both diagrams and formulas, but potentially in other novel ways as well.
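The idea of deriving a formula from a structural model can be sketched in a few lines. The XML below is a deliberately simplified, hypothetical model (a `manuscript` element containing `quire` elements, each holding `leaf` elements); it is not the VisColl schema, and `formula_from_model` is an invented name. The output imitates the style of example [1], grouping consecutive gatherings that have the same number of leaves.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def quire(number: int, leaves: int) -> str:
    """Build one hypothetical <quire> element holding empty <leaf/> children."""
    return f'<quire n="{number}">' + "<leaf/>" * leaves + "</quire>"


# Hypothetical model: two gatherings of eight leaves, then one of six.
MODEL = "<manuscript>" + quire(1, 8) + quire(2, 8) + quire(3, 6) + "</manuscript>"


def formula_from_model(xml_text: str) -> str:
    """Derive a style-[1] formula from the simplified model above."""
    root = ET.fromstring(xml_text)
    counts = [(int(q.get("n")), len(q.findall("leaf"))) for q in root.findall("quire")]
    parts = []
    # Consecutive gatherings with the same leaf count collapse into a range.
    for leaves, group in groupby(counts, key=lambda item: item[1]):
        numbers = [number for number, _ in group]
        span = str(numbers[0]) if len(numbers) == 1 else f"{numbers[0]}-{numbers[-1]}"
        parts.append(f"{span} ({leaves})")
    return ", ".join(parts)


print(formula_from_model(MODEL))
```

Because the formula is computed from the model rather than typed by hand, the same model could in principle feed other outputs, such as diagrams, without being re-entered.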

Screenshot of a visualization from VisColl 1.0

For instance, in addition to visualizing the physical structure of a manuscript, VisColl 2.0, currently under development, enables users to create taxonomies describing the content of the manuscript and other elements. The system then links those taxonomies to the physical structure, producing a more robust and descriptive visualization than is possible in the current system. It will also support more complex visualizations than VisColl 1.0 can produce.

Examples of elaborate gatherings. To the left, a diagram for gathering 7 of Marciana, Gr.VII, 22 (=1466). To the right, a diagram for gathering 1 of BAV, Ferr. 208. Note the different graphic representations used for the attachment methods and for signalling uncertainties.

Integration of Collation Visualization

Collation visualization need not be treated only as something separate from traditional forms of manuscript viewing online. We recently integrated VisColl 1.0 collation visualizations into the Bibliotheca Philadelphiensis project: we created collation models as part of our cataloging process, generated collation formulas from the models for inclusion in the catalog records, and integrated collation diagrams into the interface.


Andrist, Patrick, Paul Canart, and Marilena Maniaci. 2013. La syntaxe du codex: essai de codicologie structurale, 12-44. Bibliologia 34. Turnhout: Brepols.

Boudalis, Georgios. 2018. The Codex and Crafts in Late Antiquity. New York: Bard Graduate Center.

Campagnolo, Alberto. 2015. “Transforming Structured Descriptions to Visual Representations. An Automated Visualization of Historical Bookbinding Structures.” Ph.D. Thesis, London: University of the Arts London.

Campagnolo, Alberto, Dot Porter, Erin Connelly, Doug Emery, and Dennis Mullen. 2018. “Virtually Disbinding Codices: Visualisation of the Construction of Codex Textblocks.” In Care and Conservation of Manuscripts 16. Proceedings of the Sixteenth International Seminar Held at the University of Copenhagen 13th–15th April 2016, edited by Matthew James Driscoll, 77–90. Copenhagen: Museum Tusculanum Press; University of Copenhagen.

Corbach, Almuth. 2013. Table 1, pp. 27-33 in ‘Der Bernward-Psalter Im Wandel Der Zeiten. Eine Studie Zu Ausstattung Und Funktion’. In Der Bernward-Psalter Im Wandel Der Zeiten: Eine Studie Zu Ausstattung Und Funktion, by Monika E. Müller, 263–382. Wolfenbütteler Mittelalter-Studien 23. Wiesbaden: Harrassowitz.

Davis, Lisa Fagin. “Reconstructing the Beauvais Missal: Quire Visualizations.” (accessed 24/3/2020)

Dorofeeva, Anna. 2019. “Visualizing Codicologically and Textually Complex Manuscripts.” Manuscript Studies: A Journal of the Schoenberg Institute for Manuscript Studies 4 (2): 334–60.

Harnett, Benjamin. 2017. “The Diffusion of the Codex.” Classical Antiquity 36 (2): 183–235.

McDowell, Jesse. “An Ideal Collation of LJS 101.” Blog. The Schoenberg Institute for Manuscript Studies (blog). November 16, 2015.

Németh, András. 2015. Table 6, pp. 309-312, in ‘Layers of Restorations: Vat. Gr. 73 Transformed in the Tenth, Fourteenth, and Nineteenth Centuries’. Miscellanea Bibliothecae Apostolicae Vaticanae XXI: 281–330.

Porter, Dot, Alberto Campagnolo, and Erin Connelly. 2017. “VisColl: A New Collation Tool for Manuscript Studies.” In Kodikologie & Paläographie Im Digitalen Zeitalter 4 | Codicology & Palaeography in the Digital Age 4, edited by Hannah Busch, Franz Fischer, and Patrick Sahle, 81–100. Schriften Des Instituts Für Dokumentologie Und Editorik 11. Norderstedt: Books on Demand GmbH.

Zappella, Giuseppina. 1996. Manuale Del Libro Antico, 197-233. Milano: Editrice Bibliografica.