Teaching Text Encoding In and Out of the Classroom

1. Panel Topic

Text encoding is one of the founding practices of the digital humanities, and it offers a productive way for our students and colleagues to begin engaging with DH. And yet, there is relatively little discussion about text encoding pedagogy at the annual Digital Humanities conference, the Text Encoding Initiative (TEI) conference, or in the secondary literature (see Allés-Torrent / Riande 2019; Apostolo et al. 2019; Brooks 2017; Cummings 2018; Cummings 2023; Estill 2016; Flanders et al. 2016; Flanders et al. 2019; Fritze et al. 2019; Fukushima / Bourrier 2019; Giovannetti / Tomasi 2022; Kaethler 2019; Neuber 2015). We believe that one reason for this absence is that the teaching of text encoding happens in different environments—undergraduate classrooms, graduate seminars, the training of research assistants, and workshops of varying lengths, intensities, and audiences—to different ends—literary analysis, text recovery, data mining, translation studies, digital editorial production, cultural reclamation, social justice—and with radically different kinds of texts—literary manuscripts, historical manifests, musical compositions, structured data, and more.

To address this lacuna, we have begun work on a book that foregrounds the pedagogy of text encoding, and this roundtable will allow some of the volume’s contributors to present their arguments and hear feedback from the community. Five panelists will each address teaching and learning text encoding in different contexts. Julia Flanders will consider the role that the design and customization of encoding schemas play in student engagement with TEI. James Cummings will argue that it can be beneficial to deemphasize XML as the centerpiece of encoding education and to focus from the start on the TEI conceptual model itself. Martina Scholger, who frequently teaches workshops for early-stage researchers at the Institute for Documentology and Scholarly Editing, will reflect on best practices for both pedagogy and curriculum development while also examining how the social and community-building aspects of such workshops are critical components of their pedagogy. Kiyonori Nagasaki will describe how Japanese scholars had to develop an academic environment in which TEI could be taught. And finally, Brian Croxall will consider how students’ contributions to a long-term encoding project inculcate in them the kinds of habits of mind required for textual scholarship.

Our plan for the session is as follows. First, each presenter will speak for five to seven minutes, outlining the key argument of their chapter for the volume. Second, the presider will lead a lightning round, with each panelist having a maximum of 60 seconds to respond to a question. The lightning round will have a maximum of two questions, which will be shared with panelists ahead of time. Our potential questions include the following:

What was the thing you had to teach yourself in order to teach text encoding?

How has teaching text encoding changed your relationship to the digital humanities?

How has teaching text encoding changed your relationship to your discipline? 

How does text encoding change your students’ lives/studies?

Third, the remainder of the time will be devoted to framed discussion among the panelists and the audience. 

2. Individual Papers

2.1. James Cummings

Many introductory TEI workshops adopt a similar approach to teaching text encoding. A brief introduction to markup and the rules of XML is followed by an exercise to reinforce the concepts of well-formedness and validity. The TEI requirements of basic document structure and required metadata are then introduced before moving on to other specialised modules of the TEI vocabulary. Often such workshops conclude with demonstrations of how to customise and publish TEI documents.
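
The distinction between well-formedness and validity that such an exercise reinforces can be illustrated with a minimal Python sketch. It uses only the standard library's xml.etree.ElementTree. The function names and sample documents are hypothetical, and the structural check is only a toy stand-in for genuine validation, which would rely on a schema generated from a TEI customisation and a validating parser.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; element names below are qualified with it.
TEI_NS = "http://www.tei-c.org/ns/1.0"

def is_well_formed(xml_text: str) -> bool:
    """Return True if the text obeys the basic syntax rules of XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def has_minimal_tei_shape(xml_text: str) -> bool:
    """Toy stand-in for validation (an assumption, not a real TEI validator):
    the root must be <TEI>, its first child <teiHeader>, followed by a <text>."""
    root = ET.fromstring(xml_text)
    children = [child.tag for child in root]
    return (
        root.tag == f"{{{TEI_NS}}}TEI"
        and children[:1] == [f"{{{TEI_NS}}}teiHeader"]
        and f"{{{TEI_NS}}}text" in children[1:]
    )

# Hypothetical sample 1: a mismatched closing tag, so not well-formed.
BROKEN = (
    f'<TEI xmlns="{TEI_NS}"><text><body><p>Hello</body></text></TEI>'
)

# Hypothetical sample 2: well-formed XML, but the required header is missing.
NO_HEADER = (
    f'<TEI xmlns="{TEI_NS}"><text><body><p>Hello</p></body></text></TEI>'
)

# Hypothetical sample 3: well-formed, with a header followed by a text.
MINIMAL = (
    f'<TEI xmlns="{TEI_NS}">'
    "<teiHeader><fileDesc>"
    "<titleStmt><title>Sample</title></titleStmt>"
    "<publicationStmt><p>Unpublished</p></publicationStmt>"
    "<sourceDesc><p>Born digital</p></sourceDesc>"
    "</fileDesc></teiHeader>"
    "<text><body><p>Hello</p></body></text>"
    "</TEI>"
)
```

The second sample is the instructive one: it parses without error yet lacks the metadata that TEI requires, which is the gap between "well-formed" and "valid" that introductory exercises aim to make visible.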

While this general approach is common and effective, there are some unintended consequences. Participants trained in this manner sometimes focus on basic TEI structures and the provision of elements within the XML hierarchies, but have only a basic understanding of the underlying framework as a whole. As a result, they may find fault with straightforward aspects, such as the lack of availability of certain elements in some locations or the over-availability of elements elsewhere that are not needed for their specific goals.

An alternative approach to teaching TEI involves separating the expression of TEI in XML from the conceptual modelling underlying the TEI framework. Workshops following this approach place greater emphasis on the "why" and "what" of encoding tasks. They would teach the customisation of the generalised TEI framework upfront, after introducing the TEI abstract model and framework, but crucially before introducing XML markup. This approach seeks to develop a comprehensive understanding of the TEI framework, while also catering to users whose primary engagement with TEI documents is through content editing interfaces rather than the XML serialisation.

2.2. Martina Scholger

Despite the ongoing consolidation of digital humanities in university curricula, the demand for extra-curricular training and education remains high. Since 2008, the Institute for Documentology and Scholarly Editing (IDE), an association of German and Austrian scholars, has regularly organized one-week schools on methods and technologies of digital scholarly editions. In addition to basic technologies such as XML, XSL, XQuery, and Python, these focus on particularly edition-relevant chapters of the Text Encoding Initiative (e.g., encoding text criticism with apparatuses, manuscript description, handling primary sources, integrating facsimiles, and standard data) as well as general web technologies and XML publishing tools. The school’s curriculum is always adapted to the respective target group, which includes a range of learners from undergraduate and graduate students who are supplementing their regular university program to post-degree scholars who have specific needs related to their research programs.

In this presentation, I will first discuss how IDE teaches text encoding and then report on what participants learn, drawing on surveys of all previous IDE school participants. Were the participants able to apply their acquired knowledge in their professional careers? Which content was useful, and which less so? These and other questions will be asked in a new edition of a survey first conducted in 2019. The aim is to find out to what extent the IDE Schools have contributed to the dissemination of competencies in text encoding and the professionalization of the digital humanities in the German-speaking world and beyond.

2.3. Julia Flanders

The growth in DH curricular programs has created a diversified space for teaching text encoding, not only as a technical skill but also as a space for critical thinking about data representation. Text encoding may occupy an entire course, a single unit in a course that covers other DH topics, or a unit in a course largely focused on a humanities subject area. It may also play a central role in a thesis or dissertation that is framed around a digital edition, digital archive, or text analysis project. The choice of XML languages is a key curricular design issue. While the TEI has obvious salience given its prominence in DH, it is not the only choice; in some contexts, teaching schema creation may offer pedagogical advantages. And even if one chooses to feature the TEI, that decision in turn poses further questions. Does it make sense to use one of the publicly circulated customizations or a discipline-specific customization or a course-specific customization? How could one involve students in the creation of the customization? What would be involved in taking a comparative approach involving multiple customizations? How might that pedagogical process look different as part of a thesis or dissertation project, and how might an advisor help the student make the schema design process legible as a key element of the research? This paper will explore the pedagogical impact of these design choices and the different kinds of courses and other pedagogical situations where they may be particularly appropriate.

2.4. Kiyonori Nagasaki

Japan is a country where the digital humanities are thriving, yet the adoption of TEI was markedly delayed. This unfortunately resulted in a gap in practice between textual scholars and DH. For a healthy development of the humanities in the digital age, text encoding education is vital, and Japan is now making rapid efforts to improve the situation. This presentation discusses the paths that Japanese educators had to follow to bring text encoding into their classrooms.

The reasons for this delay were linguistic and academic: because Japanese characters could not initially be handled by computers, few researchers were incentivized to pursue text encoding; furthermore, academic policymakers saw the creation of digital research data, including text encoding, as the job of publishers or printing companies, rather than of humanities researchers. The first of these problems was relatively easy to overcome: enhancements to Unicode made working in Japanese much more feasible. The second problem required a more complex solution: the creation of a community of scholars who valued encoding. First, a group of Japanese scholars created examples of encoding using Japanese primary texts; second, they translated the TEI Guidelines into Japanese; third, they taught text encoding to their colleagues; and fourth, they advocated for encoding to be regularly included in academic training.

By discussing how text encoding and its practices had to be adapted to be taught in Japan, I hope to provide signposts for others who want to begin teaching text encoding in their own linguistic areas.

2.5. Brian Croxall

For several years, the students in one of my courses have contributed to a long-running text encoding project: a digital edition of Charles Schulz’s comic Peanuts. One of the benefits of encoding the comics is the way in which it causes individual students to read more slowly and closely. Altering the velocity of their reading alters what they see in a strip, helping them find places where it departs from norms established by its siblings. 

But such departures are difficult to encode since they have not previously been encountered. Working in a class is beneficial, as we collectively describe what happens in the strip and determine how to encode it. We frequently turn to the schemas we are using, both Comic Book Markup Language (CBML) and TEI, and to the project guidelines that my students have collaboratively authored.

In this presentation, I explore what it means for a class to write its own guidelines for a multi-year encoding project. How do such guidelines interact with established schemas? How do you decide which features should be tracked (settings, the shape of speech balloons) and which should be ignored (the clothes characters are wearing)? How do such guidelines allow for the collective to change its mind and track something that had previously been ignored, and how do you go back to the work that was done years previously to incorporate the new decision? To what degree does encoding perforce provide an introduction to the habits of mind that are necessary for textual scholarship?

Appendix A

Bibliography
  1. Allés-Torrent, Susanna / Riande, Gimena Del Rio (2019): ‘The Switchover: Teaching and Learning the Text Encoding Initiative in Spanish’, Journal of the Text Encoding Initiative [Preprint], 12 <https://doi.org/10.4000/jtei.2994> [13.05.2024].
  2. Apostolo, Stefano / Börner, Ingo / Hechtl, Angelika (2019): ‘Collaborative Encoding of Text Genesis: A Pedagogical Approach for Teaching Genetic Encoding with the TEI’, Journal of the Text Encoding Initiative [Preprint], 12 <https://doi.org/10.4000/jtei.2926> [13.05.2024].
  3. Brooks, Mackenzie (2017): ‘Teaching TEI to undergraduates: A case study in a digital humanities curriculum’, College & Undergraduate Libraries, 24, 2–4: pp. 467–481 <https://doi.org/10.1080/10691316.2017.1326331> [13.05.2024].
  4. Cummings, James (2023): ‘The Present and Future of Encoding Text(s)’, in J. O’Sullivan (ed.) The Bloomsbury Handbook to the Digital Humanities. London: Bloomsbury Academic, pp. 147–157 <https://eprints.ncl.ac.uk/288895> [13.05.2024].
  5. Cummings, James (2018): ‘A world of difference: Myths and misconceptions about the TEI’, Digital Scholarship in the Humanities [Preprint] <https://doi.org/10.1093/llc/fqy071> [13.05.2024].
  6. Estill, Laura (2016): ‘Encoding the Edge: Manuscript Marginalia and the TEI’, Digital Literary Studies, 1, 1 <https://doi.org/10.18113/P8dls1159715> [13.05.2024].
  7. Flanders, Julia et al. (2019): ‘TEI Pedagogy and TAPAS Classroom’, Journal of the Text Encoding Initiative [Preprint], 12 <https://doi.org/10.4000/jtei.2144> [13.05.2024].
  8. Flanders, Julia / Bauman, Syd / Connell, Sarah (2016): ‘Text Encoding’, in Constance Crompton, Richard J. Lane, and Ray Siemens (eds) Doing digital humanities: practice, training, research. 1st edition. New York, NY: Routledge, pp. 104–122.
  9. Fritze, Christiane et al. (2019): ‘010 Jahre IDE-Schools – Erfahrungen und Entwicklungen in der außeruniversitären DH-Ausbildung’ <https://doi.org/10.5281/ZENODO.4622233> [13.05.2024].
  10. Fukushima, Kailey / Bourrier, Karen (2019): ‘Inside Digital Dinah Craik: Feminist Pedagogy, Cognitive Apprenticeship, and the TEI’, Journal of the Text Encoding Initiative [Preprint], 12 <https://doi.org/10.4000/jtei.2185> [13.05.2024].
  11. Giovannetti, Francesca / Tomasi, Francesca (2022): ‘Linked data from TEI (LIFT): A Teaching Tool for TEI to Linked Data Transformation’, Digital Humanities Quarterly, 16, 2 <https://www.digitalhumanities.org/dhq/vol/16/2/000605/000605.html> [13.05.2024].
  12. Kaethler, Mark (2019): ‘The TEI Assignment in the Literature Classroom: Making a Lord Mayor’s Show in University and College Classrooms’, Journal of the Text Encoding Initiative [Preprint], 12 <https://doi.org/10.4000/jtei.1804> [13.05.2024].
  13. Neuber, [Frederike] (2015): ‘Spring in Graz - Sunshine and X-technologies’ <https://doi.org/10.58079/NS3K> [13.05.2024].
Diane K. Jakacki (dkj004@bucknell.edu), Bucknell University, United States of America; Brian Croxall (brian.croxall@byu.edu), Brigham Young University, United States of America; James Cummings (James.Cummings@newcastle.ac.uk), Newcastle University, United Kingdom; Julia Flanders (j.flanders@northeastern.edu), Northeastern University, United States of America; Kiyonori Nagasaki (nagasaki@dhii.jp), International Institute for Digital Humanities, Japan; Martina Scholger (martina.scholger@uni-graz.at), University of Graz, Austria