Dealing with Uncertainty in Digital History Research - Reflections on the Research Data Process

In the natural sciences, the term ‘uncertainty’ typically denotes something predictable and even measurable, often referring to statistical probabilities, whereas in the humanities it is used in a more varied and less standardized manner. Following Gigerenzer et al. (2019:615), the first perspective is called "risk", which refers to "know[ing] all possible alternatives", in opposition to uncertainty, when "we do not know the probabilities."

The term uncertainty may refer to unknown aspects within the historical discourse as well as to contradictions between or within different sources. Often it simply acts as a synonym for the 'unknown', without any statement about its probability or about whether this unclarity is resolvable at all. It is also used to express ambiguity, for instance within a data set or among different sources. Uncertainty in the humanities thus seems to be neither predictable nor scalable. Nevertheless, in the realm of Digital History, uncertainty appears in all phases of the research process, as this lack of information or context is particularly perceptible in historical data and sources. Despite a growing awareness and recognition of this challenge, especially with regard to temporal or spatial data, methods and practices for handling or managing uncertainty in Digital History research are not yet established. As one crucial coping strategy, visualization will be proposed, on a data level, both as an information-spatializing structural element and as an epistemic tool for the encoding and communication of uncertainty. Visualization is also strongly related to the digital dimension, as Moretti and Sobchuk state: “If there is one feature that immediately distinguishes the digital humanities (DH) from the ‘other’ humanities, data visualization has to be it.” (2019:86).

This proposed panel aims to point out the relevance of the phenomenon of uncertainty especially for historical research and to discuss strategies to deal with this issue along the research data process in the following steps:

  1. Conceptual & theoretical foundations: This section will define the concept of uncertainty in general and its relevance in Digital History research in order to build a theoretical framework and give an introduction to the specificities of Digital History as a discipline.
  2. Data collection & curation: The first step of the research process is the collection of data, either through information retrieval by means of database queries or through traditional approaches such as archival work. Uncertainty at this stage refers to incomplete, imprecise, ambiguous, biased, controversial or contradictory historical data.
  3. Data processing & modeling: In this phase, strategies for modeling uncertainty in data are elaborated in order to make the data more accessible to both human users and machine processing.
  4. Visualizing, encoding & representing uncertainty: The representation of uncertainty in DH tools and interfaces creates many challenges, including the need to find trade-offs between too little and too much visualization complexity.
  5. Communicating uncertainty: In the last section the question of how to visually communicate historical uncertainty is dealt with by focusing on techniques that allow for ‘experiencing’ uncertainty rather than only quantifying it.

Individual talks

Navigating Uncertainty through Modelling in Digital History (Silke Schwandt)

Dealing with uncertainty is one of the main historiographical practices, since historians construct stories based on oftentimes uncertain evidence. While this seems to be common knowledge, questions of uncertainty do not play a central role in methodological reflections. This changes with the advent of digital methods, which make it necessary to reflect explicitly on presuppositions and formalizations that can be described as models of knowledge (Schwandt 2022).

Modeling is an important practice that is needed to facilitate the use of computational methods. Digital humanities itself has been described as “a practice of representation, a form of modeling or . . . mimicry” (Unsworth 2000).

The models of knowledge in digital history refer back to the theories and traditions of historical research, while digital historians also develop models for integrating the formalized thinking that is present in digital society: “Because computer simulation requires developing an explicit and precise model of a phenomenon being simulated, thinking of how cultural processes can be simulated can help us to develop more explicit and detailed theories of cultural processes.” (Manovich 2020). Historiography is all about dealing with uncertain evidence that we need to weave into plausible stories about the past. Essentially, we try to deal with this uncertainty by reducing it, designing parameters for historiographical modeling. Thinking in terms of models helps to classify sources and weigh them according to our interpretations of the past.

Formalizing Uncertainty in Digital History: Benefits and Practical Limits of Theoretical Approaches (Torsten Hiltmann)

In classical hermeneutical historical studies, narrative and interpretative approaches are predominant from the outset. In contrast, digital and, more specifically, computational history initially require formalization and subsequent modeling to achieve processable and interpretable representations of information (Hiltmann 2023). Thus, while analog approaches permit uncertainties to be vaguely addressed or selectively avoided, in the digital world everything, including uncertainties, must be precisely captured and modeled in a manner applicable to all potential cases. To accomplish this, a deeper theoretical engagement with uncertainty is necessary, encompassing the capture of various uncertainties and the distinction between, for example, vagueness and ambiguity or different forms of decidability (Piotrowski 2019). In historical research this becomes even more complex, since it is necessary to incorporate factors such as context and multiperspectivity, leading us to question the kind of truth (and its uncertainty) we are modeling. Theorization can therefore lead to a more accurate (digital) representation of historical knowledge and, through enhanced processing and scalability, potentially reveal uncertainties that have remained hidden so far. However, it is crucial to also discuss the extent to which the formalization of uncertainties remains practical and purposeful.

Multimodal data in the History of Education - Visual and semantic search interfaces as a strategy to deal with uncertainty (Linda Freyberg)

Data always documents and represents a certain cultural and historical origin and is symbolically embedded in this context; it therefore goes beyond its mere content. To understand this embedding, an analysis of these representation processes on a sign level is proposed, following Peirce's sign theory, in particular the symbolic and iconic sign (CP 1.369) as well as the role of vagueness in the (diagrammatic) reasoning process (CP 2.444). Data of the History of Education, in particular the resources of BBF | Research Library for the History of Education of DIPF | Leibniz Institute for Research and Information in Education, will serve as an example for the variety of multimodal sources and the levels of uncertainty in and between them.

The library holds a diversity of sources (images, texts, audiovisual media and 3D objects) from the 15th century until today, ranging from estates and autograph collections to curricula, students’ drawings and pupils' newspapers, and busts of famous educators. The vast majority of the archival materials in particular are not digitized yet. Based on the results of a co-creation workshop conducted in 2023, the idea of visual search interfaces is presented: algorithms provide a precise semantic contextualization based on similarity, while visual expressions allow a certain degree of vagueness in order to represent uncertainty.

Modelling Uncertainty in Cultural Data (Florian Kräutli)

Expressions of uncertainty are widespread in descriptions of cultural artefacts in digital collections. Cataloguers use various means of representing uncertainties: symbols such as "?" or "[]", expressions of doubt or uncertainty such as "maybe" or "possibly", or verbal indicators of quantity such as "ca." or "around", the latter particularly when assigning dates.

The need for these improvised solutions arises from the fact that existing database applications for collections management are not inherently designed to capture uncertainties. Cataloguers resort to makeshift solutions that are interpretable by humans, but inaccessible to machine processing. In addition, those qualifiers merely communicate the fact that a piece of data is uncertain, but not the reason for the uncertainty. Whether an assigned date is, for example, derived from external sources, the art historical interpretation of a cataloguer, or an illegible inscription is not captured in the data.

This paper therefore asks: Can we represent uncertainties in data, along with their reasons, in a way that is interpretable for both humans and machines? Using Linked Open Data and the CIDOC-CRM ontology, we showcase practical models for dealing with uncertainties both in existing datasets and when entering new data. The focus is not just on indicating uncertainty but on providing context, making the data more accessible to both human users and machine processing.
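The gap between human-readable qualifiers and machine-processable data can be illustrated with a minimal sketch. It is not the CIDOC-CRM or Linked Open Data modelling showcased in the paper; it is a plain Python illustration under stated assumptions. The names `DateStatement`, `parse_date` and `QUALIFIER_WORDS`, and the rule that a date is a run of three or four digits, are hypothetical choices made for this example. The sketch turns a free-text catalogue date into a record that keeps the qualifier explicit and leaves room for the reason, which the free text alone does not contain and therefore has to be supplied separately.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of verbal qualifiers, taken from the examples in the text.
QUALIFIER_WORDS = ("ca.", "around", "maybe", "possibly")


@dataclass
class DateStatement:
    raw: str                  # the cataloguer's original string, kept verbatim
    year: Optional[int]       # first 3- or 4-digit number found, if any
    certain: bool             # True only if a year was found and no qualifier is present
    qualifier: Optional[str]  # "ca.", "around", "maybe", "possibly", "[]" or "?"
    reason: Optional[str]     # why the date is uncertain; not derivable from the string


def parse_date(raw: str, reason: Optional[str] = None) -> DateStatement:
    """Split a free-text date into a year, an explicit qualifier and a reason."""
    text = raw.strip()
    lowered = text.lower()

    qualifier = None
    # Verbal qualifiers are assumed to precede the date, e.g. "ca. 1650".
    for word in QUALIFIER_WORDS:
        if lowered.startswith(word):
            qualifier = word
            break
    # Square brackets around the whole value, e.g. "[1702]".
    if qualifier is None and text.startswith("[") and text.endswith("]"):
        qualifier = "[]"
    # A question mark anywhere in the value, e.g. "1588?".
    if qualifier is None and "?" in text:
        qualifier = "?"

    match = re.search(r"\d{3,4}", text)
    year = int(match.group()) if match else None

    return DateStatement(
        raw=raw,
        year=year,
        certain=qualifier is None and year is not None,
        qualifier=qualifier,
        reason=reason,
    )


if __name__ == "__main__":
    for value in ("1845", "ca. 1650", "[1702]", "1588?"):
        print(parse_date(value))
    print(parse_date("possibly 1490", reason="illegible inscription"))
```

The `reason` field is deliberately an argument rather than something the parser infers: as noted above, qualifiers in existing records communicate only that a value is uncertain, so the reason can be recorded when new data is entered but cannot be recovered from legacy strings.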

Uncertainty visualization (Florian Windhager)

Given the characteristic panoply of uncertainties in humanities research data and topics, data visualization has to walk a fine line between encoding too little and too much of them. This paper will make the case for a multi-layered and user-centered approach to uncertainty visualization and DH interface design. To that end, it will focus on the reference subject of visual(ization) complexity and on the need to keep it within certain limits, especially with regard to casual or non-expert users of visualizations. While we consider the representation of data uncertainty necessary to increase the number and trust of expert users of DH tools in the arts and humanities, the increase in representation complexity also raises 'visual literacy' and learning demands, which adversely affect the same group of experts and even more so non-expert audiences (Windhager et al., 2019a, 2019b).

(Visual) Uncertainty communication, including narration (Georgia Panagiotidou)

Data visualization is becoming a common research tool in humanistic inquiry. Nevertheless, besides its potential to enable new research directions, visualization is also critiqued for its epistemological incompatibility with humanistic principles as well as for its tendency to communicate data as objective and infallible. In our prior work examining humanistic digital scholarship, we identified that most uncertainty originates from missing, incomplete or conflicting data as well as from the process of datafication. We also found that such uncertainty is for the most part communicated only in textual form and is largely excluded from the corresponding visualizations. In this talk I will discuss how, while there is considerable research on visualizing uncertainty, the state of the art in uncertainty communication still mostly centers on probabilistic error. I will argue that, for the purpose of historical visualization research, more techniques that allow for ‘experiencing’ uncertainty rather than only quantifying it are necessary, so as to unmask visualization as a seemingly objective medium.

Appendix A

Bibliography
  1. Boyd Davis, Stephen / Vane, Olivia / Kräutli, Florian (2021): “Can I Believe What I See? Data Visualization and Trust in the Humanities”, in: Interdisciplinary Science Reviews 46 (4): 522–46. https://doi.org/10.1080/03080188.2021.1872874.
  2. Drucker, Johanna (2011): “Humanities Approaches to Graphical Display”, in: Digital Humanities Quarterly 5, 3.
  3. Hiltmann, Torsten (2023): “(Epistemologische) Grundlagen in der Anwendung digitaler Methoden”, in: Antenhofer, Christina / Kühberger, Christoph / Strohmeyer, Arno (eds.): Digital Humanities in der Geschichtswissenschaft, Stuttgart: 43-59 [in print].
  4. Kennedy, Helen / Hill, Rosemary Lucy / Aiello, Giorgia / Allen, William (2016): “The Work That Visualisation Conventions Do”, in: Information, Communication & Society 19, 6 (2 June 2016): 715–35.
  5. Panagiotidou, Georgia / Lamqaddam, Houda / Poblome, Jeroen / Brosens, Koenraad / Verbert, Katrien / Vande Moere, Andrew (2022): "Communicating uncertainty in digital humanities visualization research", in: IEEE Transactions on Visualization and Computer Graphics (2022).
  6. Manovich, Lev (2020): Cultural Analytics. Cambridge, MA: MIT Press.
  7. Moretti, Franco / Sobchuk, Oleg (2019): “Hidden in Plain Sight. Data Visualization and Digital Humanities”, in: New Left Review 118 (2019): 86-115.
  8. Peirce, Charles Sanders (1931-1935) (1958): The Collected Papers of Charles Sanders Peirce. Vols. I-VI, Hartshorne, Charles / Weiss, Paul (eds.), Cambridge, MA: Harvard University Press, 1931-1935; Vols. VII-VIII, Burks, Arthur W. (ed.), Cambridge, MA: Harvard University Press, 1958. [referenced as CP X.XXX]
  9. Piotrowski, Michael (2019): “Accepting and Modeling Uncertainty”, in: Kuczera, Andreas / Wübbena, Thorsten / Kollatz, Thomas (eds.): Die Modellierung des Zweifels: Schlüsselideen und -konzepte zur graphbasierten Modellierung von Unsicherheiten. Zeitschrift für digitale Geisteswissenschaften (Wolfenbüttel, 2019) < http://dx.doi.org/10.17175/sb004_006a > [15.06.2024].
  10. Priem, Karin / Fendler, Lynn (2019): “Shifting epistemologies for discipline and rigor in educational research: Challenges and opportunities from digital humanities”, in: European Educational Research Journal 18, 5: < https://doi.org/10.1177/1474904118820433 > [15.06.2024].
  11. Schwandt, Silke (2022): “Opening the Black Box of Interpretation: Digital History Practices as Models of Knowledge”, in: History and Theory, 61: 77-85. <https://doi.org/10.1111/hith.12281> [15.06.2024].
  12. Windhager, Florian / Salisu, Saminu / Mayr, Eva (2019a): "Exhibiting uncertainty: Visualizing data quality indicators for cultural collections", in: Informatics 6, 3: 29. MDPI.
  13. Windhager, Florian / Salisu, Saminu / Schreder, Günther / Mayr, Eva (2019b): "Uncertainty of what and for whom-And does anyone care? Propositions for cultural collection visualization", in: 4th IEEE Workshop on Visualization for the Digital Humanities (VIS4DH). Vancouver, Canada: 1-5.
  14. Unsworth, John (2000): “What Is Humanities Computing (and What Is Not)?” (lecture, Distinguished Speakers Series, Maryland Institute for Technology in the Humanities, University of Maryland, College Park, MD, 5 October 2000), < https://johnunsworth.name/mith.00.html > [15.06.2024].
Linda Freyberg (l.freyberg@dipf.de), DIPF | Leibniz Institute for Research and Information in Education, Germany and Silke Schwandt (silke.schwandt@uni-bielefeld.de), Bielefeld University, Germany and Florian Windhager (florian.windhager@donau-uni.ac.at), Universität für Weiterbildung Krems, Austria and Georgia Panagiotidou (georgia.panagiotidou@kcl.ac.uk), King's College London, University College London, United Kingdom and Florian Kräutli (florian.kraeutli@uzh.ch), University of Zurich, Switzerland and Torsten Hiltmann (torsten.hiltmann@hu-berlin.de), Humboldt University Berlin, Germany