Closing the gap: a user-centered approach to developing text analysis pedagogy

Text analysis has the potential to revolutionize research across disciplines. However, a major hurdle for those eager to use it is that the coding skills required for text mining can take years to develop. This gap between the no-code and code learning experiences has been recognized by many as a pain point in text analysis pedagogy (Dombrowski 2023; Underwood 2018). At Constellate, a platform from the non-profit ITHAKA for teaching, learning, and performing text analysis, users have recently been seeking site features that help them explore text analysis before learning to code (internal testing, 2023). To address this need, Constellate is developing “pre-code” tutorials that demonstrate different text analysis techniques using Streamlit, an open-source framework for building data applications in the programming language Python. We hypothesize that, to get ready for coding, users need accessible introductions to text analysis and programming concepts that are built on user-centered design principles (Norman / Draper 1986). Thus, we plan to introduce key concepts in text analysis and Python programming through low-pressure, pre-code tutorials before directing users to code-required features, such as Constellate’s cloud-based lab.

Constellate is developing a template for pre-code tutorials to familiarize users with common text mining vocabulary and processing techniques. Each pre-code tutorial will scaffold an introduction to a text analysis concept, including a case study of its use and guidance on how to process an input, create an output, and download the output for further use. These pre-code tutorials will also introduce Python elements without requiring users to write code. For example, Streamlit supports the use of pseudo-code, which can introduce a Python concept such as a “for loop” by letting a user visualize its iterative nature without becoming overwhelmed by Python syntax. We hypothesize that integrating pseudo-code into the workflow will also allow users to transition more smoothly into the code tutorials run in Constellate’s cloud-based lab. To reinforce each concept through practice, users will be able to choose among several inputs: their own text file, multiple text files, or a curated Constellate dataset. By offering multiple avenues for practicing a text analysis concept, we expect users to become more conceptually confident in transforming textual data at different scales.
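As an illustration of the kind of Python concept a pre-code tutorial might build toward, the sketch below shows a “for loop” that visits each word of a text in turn and keeps a running count. It is a minimal example under our own assumptions, not code taken from the Constellate tutorials; the function name `count_words` and the sample sentence are hypothetical.

```python
def count_words(text):
    """Count how often each word appears in a text (hypothetical helper)."""
    counts = {}
    # The for loop repeats the indented step once for every word in the text.
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


sample = "The whale and the sea and the sky"
for word, count in count_words(sample).items():
    print(word, count)
```

In pseudo-code, the same idea reads “for each word in the text, add one to that word’s tally,” which is the iterative pattern a learner would meet before seeing the syntax.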

Our design approach borrows from user-centered design to center the perspective of a learner who is new to text analysis by using clear vocabulary, consistent interaction patterns, and accessible testing. We plan to test a prototype of our tutorial on concordance analysis, a method for creating a list of all the occurrences of a particular word within a text. We will interview 5-10 users to identify usability issues and gauge user satisfaction with the template prior to development (Nielsen / Landauer 1993). We will recruit a diverse group of interviewees to support an accessible experience for users with different technology preferences (e.g., mouse, keyboard, or screen reader), roles, and skill levels. We will document how successfully users are able to locate and navigate the pre-code tutorial in order to understand how they want the tutorials to be organized within the rest of the site’s information architecture. We will use the evidence gathered from user testing to iterate quickly on our design, saving time during development by grounding decisions in a well-evidenced user perspective.
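To make the concordance concept concrete, the following sketch lists every occurrence of a keyword together with a few characters of surrounding context. It is a minimal illustration using only Python’s standard library, not the implementation behind the Constellate tutorial; the function name `concordance`, the `width` parameter, and the sample text are assumptions made for this example.

```python
import re


def concordance(text, keyword, width=30):
    """Return (left context, match, right context) for each keyword occurrence.

    Hypothetical helper: matches whole words only and ignores case.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for match in pattern.finditer(text):
        start, end = match.span()
        left = text[max(0, start - width):start]   # context before the match
        right = text[end:end + width]              # context after the match
        rows.append((left, match.group(0), right))
    return rows


sample = "The whale surfaced. Ahab watched the whale, and the Whale watched back."
for left, hit, right in concordance(sample, "whale", width=15):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>15} [{hit}] {right}")
```

Aligning the keyword in a single column, as the print statement does, is the conventional keyword-in-context layout and makes patterns in the surrounding words easier to scan.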

In this poster session, we will illustrate the design principles behind development, review top findings from user testing, and demonstrate Constellate’s first pre-code tutorial on concordance analysis. We look forward to interacting with attendees to exchange ideas and gather additional feedback on the tutorials. Constellate’s pre-code tutorials are intended to harness the power of user-centered design to close the gap between no-code and coding learners. In doing so, we hope to inform a replicable pedagogical approach to accelerating the adoption of text analysis techniques by users across the humanities.

Appendix A

Bibliography
  1. Dombrowski, Quinn (2023): “Does Coding Matter for Doing Digital Humanities?”, in: O’Sullivan, James (ed.): The Bloomsbury Handbook to the Digital Humanities. New York: Bloomsbury Academic, 137-146.
  2. Nielsen, Jakob / Landauer, Thomas K. (1993): “A mathematical model of the finding of usability problems”, in: Proceedings of the INTERACT'93 and CHI'93 Conference on Human Factors in Computing Systems, 206-213.
  3. Norman, Donald A. / Draper, Stephen W. (1986): User Centered System Design: New Perspectives on Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
  4. Underwood, Ted (2018): “A broader purpose: Studying culture with numbers is not a special subfield of ‘DH’; it’s a way to integrate different aspects of a liberal education”, in: The Stone and the Shell <https://web.archive.org/web/20240524112324/https://tedunderwood.com/2018/01/04/a-broader-purpose/>.
Zhuo Chen (zhuo.chen@ithaka.org), Constellate/ITHAKA and Grace Cope (grace.cope@ithaka.org), Constellate/ITHAKA