# What you really need to know about Digital Scholarly Editing

Today’s post is a short introduction to digital scholarly editing. I will explain some basic principles (so mostly theory) and point you to a few resources you will need to get started in a more practical fashion. I’m teaching a class on digital scholarly editing this term, so I thought I could use the opportunity to write an intro post on this important topic.

# How does a Digital Edition relate to an analogue scholarly edition?

Unlike analogue scholarly editions, digital editions are not limited to text, and they overcome the limitations of print by following what we call a digital paradigm rather than an analogue one. This means that a digital edition cannot be given in print without loss of content or functionality. A retrodigitized edition (an existing analogue edition which is digitized and made available online) therefore doesn't qualify as a digital edition, because it still follows the analogue paradigm. Ergo: It’s not about the storage medium. A text online isn’t automatically a digital edition!

Digital editions can be distinguished from digital archives by the degree of critical, scholarly engagement with the texts. If it’s just a text corpus or a set of facsimiles, maybe with some good metadata, that still doesn’t make it a digital edition. For a digital resource to be considered a digital edition, it first needs to satisfy the criteria of a scholarly edition (critical engagement with the source text beyond mere representation of it) and then make use of the unique opportunities offered by the digital medium.

Sahle 2016 defines: “A scholarly edition is the critical representation of historic documents.” Representation means recording a document, which can happen on multiple levels: visually as an image reproduction or facsimile, textually in the form of a transcription, and also as metadata. Representation, however, isn’t the same thing as mere presentation. It also involves critical engagement (such as textual or historical criticism), i.e. making a text easier to understand by offering additional information to contextualize or interpret it. The edition should be of scholarly quality, which is visible not only in professional-looking output but also in the process of how it was created: transcription rules were applied consistently, etc. (see the paragraph on quality criteria below).

Ideally, a digital scholarly edition should also implement the FAIR criteria. It’s findable, i.e. listed in library catalogs with a persistent identifier. It’s accessible, i.e. free for all users and easy to use. It’s interoperable because it uses a standardized data (encoding/exchange) format such as TEI, which allows the data to be reused by others. That leads to the last point, reusability: the data can be downloaded and the licenses allow for reuse. This also means that the data creation process needs to be documented well enough to make actual reuse realistic, because others can understand what the hell you were thinking when you made those encoding decisions.

Also, you might want to learn more about minimal computing and minimal editions in the DHQ special issue on minimal computing (see Resources).

## The single source principle and why it matters

In digital scholarly editing (DSE), the single source principle means that all representations of the data are derived from one single source file. In DSE, this is usually an XML file following the Text Encoding Initiative (TEI) standard. This is a very efficient format for long-term archiving because XML is text-based, thus small (as in “not data-intensive” the way image data is, for example), and both machine and human readable. Other formats, such as HTML for websites or LaTeX for print, focus on how to render the information encoded in the XML in an output format. Accordingly, if we wanted to create a web version of our edition, for instance, we would use the transformation language XSLT to transform our XML source file into HTML output, which can then be rendered by a web browser and also put online for others to view.

A common misunderstanding among people doing this for the first time is to think that the .html file is automatically published to the web, because it opens in your browser once you click it. However, the address bar shows that it’s actually just a local file on your computer. So don’t panic 😉

So anyway, TEI-XML is for semantically describing your data in a lot of detail. For example, italicized text may be encoded as `<hi rend="italic">my italicized text</hi>`. If you display the XML file, it will only ever show this code, not actually display italicized text. In HTML, an element such as `<i>my italicized text</i>` will actually be rendered in italics, as will, for example, `\emph{my italicized text}` in LaTeX. This is the principle of the separation of form and content which is inherent in XML markup. Our TEI-XML just describes our data (possibly noting a lot more detail than will eventually be shown on the website we create), but the design information needs to come from elsewhere.
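To make the idea of "one source, several outputs" a bit more tangible, here is a minimal sketch in Python. In a real edition you would normally do this with XSLT; Python is only swapped in here to keep the example short and runnable. The snippet, the function names `render`, `to_html` and `to_latex`, and the simplifications (no TEI namespace, no escaping of LaTeX special characters, only `<hi rend="italic">` is handled) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mini source: a single paragraph without the TEI namespace.
TEI_SNIPPET = '<p>This is <hi rend="italic">my italicized text</hi> in a paragraph.</p>'


def render(node, wrap_italic, escape_text):
    """Recursively turn an element's text and children into one string."""
    parts = [escape_text(node.text or "")]
    for child in node:
        inner = render(child, wrap_italic, escape_text)
        # Only this one encoding is interpreted; everything else passes through.
        if child.tag == "hi" and child.get("rend") == "italic":
            inner = wrap_italic(inner)
        parts.append(inner)
        parts.append(escape_text(child.tail or ""))
    return "".join(parts)


def to_html(xml_string):
    """Derive an HTML paragraph from the source."""
    root = ET.fromstring(xml_string)
    return "<p>" + render(root, lambda s: "<i>" + s + "</i>", escape) + "</p>"


def to_latex(xml_string):
    """Derive a LaTeX line from the same source (special characters not escaped)."""
    root = ET.fromstring(xml_string)
    return render(root, lambda s: "\\emph{" + s + "}", lambda s: s)


print(to_html(TEI_SNIPPET))
print(to_latex(TEI_SNIPPET))
```

The source string is never touched: the decision about how italics look lives entirely in the two small output functions, which is the separation of form and content in miniature.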

Following the single source principle also means that you only need to archive the base data long-term, not all of its representations as well. Those can be generated automatically from the source data.

# Quality criteria for digital scholarly editions

In order to lend credibility to digital scholarly editions and set them up in a way that will intrinsically ensure their high quality, there are a number of best practices to follow. In fact, there is a scholarly journal, RIDE, dedicated solely to reviewing digital scholarly editions according to a set of criteria published by the journal. The reviewing criteria are available in a number of languages (see Resources).

Some of those are, for example, that your data is archived sustainably for the long term (an institutional repository is better than a commercial venue such as GitHub, which could theoretically be gone tomorrow). Also, the fact that you have used TEI-XML is already such a criterion, because using a standardized format ensures that your data is interoperable with other people’s data. For that to be possible, your data should ideally be downloadable so others can reuse it, and also licensed in a way that allows for this to happen. It’s also important that as many as possible of the decisions you made during the editing process are documented in detail (such as encoding decisions, transcription guidelines that you hopefully followed consistently, revisions of the digital file, etc.). You should fill out the `<teiHeader>` metadata in appropriate detail and provide a citation suggestion (ideally including a persistent, stable identifier) for your edition so that other scholars can refer to it in the future.
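If you want a quick sanity check of your own header, something like the following Python sketch could help. It only tests whether a handful of elements are present; it says nothing about whether their content is any good. The element names (`titleStmt`, `publicationStmt`, `availability`, `licence`, `idno`, `sourceDesc`, `editorialDecl`, `revisionDesc`) are real TEI elements, but the selection in `CHECKS`, the function name `missing_header_fields` and the sample document are my own assumptions, not an official checklist from the TEI or from RIDE.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical checklist: label -> path relative to <teiHeader>.
CHECKS = {
    "title": "tei:fileDesc/tei:titleStmt/tei:title",
    "licence": "tei:fileDesc/tei:publicationStmt/tei:availability/tei:licence",
    "identifier": "tei:fileDesc/tei:publicationStmt/tei:idno",
    "source description": "tei:fileDesc/tei:sourceDesc",
    "editorial declaration": "tei:encodingDesc/tei:editorialDecl",
    "revision history": "tei:revisionDesc/tei:change",
}

# Made-up sample with a deliberately thin header.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A sample edition</title></titleStmt>
      <publicationStmt><p>Unpublished draft</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Text goes here.</p></body></text>
</TEI>"""


def missing_header_fields(tei_xml):
    """Return the labels from CHECKS whose element is absent from the header."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return list(CHECKS)
    return [
        label
        for label, path in CHECKS.items()
        if header.find(path, TEI_NS) is None
    ]


print(missing_header_fields(SAMPLE))
```

For the sample above, the licence, the identifier, the editorial declaration and the revision history are reported as missing, which are exactly the spots where reuse and citation would get difficult for others.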

But these are just a few examples. There are many more aspects to this. If you want to dive in deeper, read the reviewing guidelines and some of the reviews in the RIDE journal, and maybe practice your skills by writing a review of a resource from your field (but if you actually want it published, contact the editors before investing a lot of work).

# Next steps

If you want to create a digital edition in practice, here are a few things you’ll need to do:

1. Learn XML and TEI. The TEI has lots of specialized options for encoding digital scholarly editions (such as critical apparatus and so on). Get started here: A shamelessly short intro to XML for DH beginners (includes TEI)
2. Decide if and how your digital edition is supposed to be published and where it will be archived (ideally it should be an institutionally backed long-term archive to ensure your data’s sustainability).
3. Pick a publication tool (I might make a post about some of them at some point in the future) or learn to transform your data into the desired output format. For print via LaTeX, see for example “Enough reledmac to be dangerous: Scholarly Editing with LaTeX & XSLT”; for a website, the output would be HTML (I’ll follow up with a basics intro on that as well). To learn how such transformations work, you might want to look at the First ever LaTeX Ninja workshop at Harvard: “Beyond TEI: Digital Editions with XPath and XSLT for the Web and in LaTeX”. That blog post has all the teaching materials, so you can use it for self-directed learning.

I guess there would be much more to say about digital scholarly editions but this was supposed to be a short intro. So this is what I leave you with today.

And remember: Don’t panic!

See you soon,

the Ninja

# Resources

1. On Minimal Computing: Roopika Risam & Alex Gil, “Introduction: The Questions of Minimal Computing” in Digital Humanities Quarterly Special Issue 16/2 (2022). http://www.digitalhumanities.org/dhq/vol/16/2/000646/000646.html
2. Distinction between digital archive and digital edition: Patrick Sahle. “Digitales Archiv und Digitale Edition. Anmerkungen zur Begriffsklärung”. In: Literatur und Literaturwissenschaft auf dem Weg zu den neuen Medien. Ed. by Michael Stolz. Zürich, 2007, pp. 64–84.
3. Must read intro on “What is a digital edition?”: Patrick Sahle. “What is a Scholarly Digital Edition?” In: Digital Scholarly Editing: Theories and Practices. Ed. by Matthew J. Driscoll and Elena Pierazzo. Cambridge: Open Book Publishers, 2016, pp. 19–39. URL: https://books.openedition.org/obp/3381.
4. Reviewing criteria by the RIDE journal: Patrick Sahle, in collaboration with Georg Vogeler and the members of the IDE. “Criteria for Reviewing Scholarly Digital Editions”, version 1.1, June 2014 (version 1.0, September 2012 – January 2014). Major contributions to the English version by Misha Broughton, James Cummings, Franz Fischer, Philipp Steinkrüger, Walter Scholger. https://www.i-d-e.de/publikationen/weitereschriften/criteria-version-1-1/ (German version 1.1: http://www.i-d-e.de/publikationen/weitereschriften/kriterien-version-1-1/)
5. Practical approach to learning XSLT for making digital scholarly editions (includes an intro to “What is a digital scholarly edition?” as discussed in this post): First ever LaTeX Ninja workshop at Harvard: “Beyond TEI: Digital Editions with XPath and XSLT for the Web and in LaTeX”