An Imagined Geography of Empire: Mining cultural representations of the American colonial state during the St. Louis 1904 World's Fair

This digital history project uses text analysis and distant reading to assess how local newspapers produced their own discursive representations of the U.S. and the world in response to the ideologies of American colonialism and exceptionalism embedded on the grounds of the St. Louis 1904 World's Fair. 1 (Anderson 2006; Douglas 1989: xviii) The project understands the Louisiana Purchase Exposition as a complex microcosm of early-twentieth-century modernity, marked by ritualistic competition, contradictions, and tense power relations between geopolitical entities. It calls for closer scholarly attention to how newspapers, as intermediaries of fair makers’ ideological messages and visitors’ spatial experiences, engaged with and interpreted the language of empire and American colonialism at the fair.

Newspapers relied on the ways in which multiple audiences perceived and experienced the fair exhibits in order to write their stories and produce complex representations of participating cultures and the modernizing world. By attending to the cultural commentary about the fair through digital methodologies, the project argues that, in response to the power relations and discursive negotiations embedded on the fairgrounds, newspapers contributed to an "imagined geography" of the modernizing world centered around the United States as an emerging, exceptional colonial power at the turn of the century. 2 (Anderson 2006; Blevins 2014: 122-147; Lefebvre 1991; Said 1979) They did so, first, by printing placenames of the United States and the Philippines more often than those of any other geopolitical entity participating at the fair (see Figure 1). Second, they fostered conversations about the Philippine exhibit as a centerpiece of the exposition and characterized Filipino people as a nation under American tutelage and guidance towards civilization.

Figure 1: "An Imagined Geography of Modernity" with extracted placenames from newspaper data. Visualization built in Tableau by the author.

The data in this project derives from newspaper clippings retrieved from the digital database Newspapers.com. The clippings were stored as JPG files, then OCR'ed and processed as plain text in RStudio. The collection combined random and proportional sampling, so the textual data is proportionally distributed across the three newspapers selected for collection (The St. Louis Republic, St. Louis Post-Dispatch, and St. Louis Globe-Democrat). The data is also proportionally distributed across the seven months of the fair using a fixed interval of 15 days. With a raw count of 196,336 words, the corpus serves as a substantial sample within a larger process of data collection, whose potentially similar patterns remain to be explored in further research. Employing named entity recognition (NER) to extract the most frequent placenames in the corpus and using word embedding models (WEM) to explore the semantic relationships between words like “savage” and “civilization” reveals how conversations about the world’s fair in local newspapers contributed to the symbolic legitimation of the American occupation and colonial control over the Philippines.
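The fixed-interval part of the sampling design described above can be illustrated with a short sketch. This is not the project's own code (the author worked in RStudio); it is a minimal Python illustration under stated assumptions. The opening and closing dates of the fair (April 30 and December 1, 1904) are not given in this abstract and are supplied here as an assumption, and the function name `sampling_dates` and the pairing of every date with every newspaper are hypothetical choices made for the example. The random component of the sampling is not modeled.

```python
from datetime import date, timedelta

# Assumed fair dates (not stated in the abstract).
FAIR_OPEN = date(1904, 4, 30)
FAIR_CLOSE = date(1904, 12, 1)

# The three newspapers named in the abstract.
NEWSPAPERS = [
    "The St. Louis Republic",
    "St. Louis Post-Dispatch",
    "St. Louis Globe-Democrat",
]


def sampling_dates(start, end, step_days=15):
    """Return dates from start to end (inclusive), spaced step_days apart."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=step_days)
    return dates


# One sampling slot per newspaper per date, so each paper is
# represented equally at every point in the fair's run.
dates = sampling_dates(FAIR_OPEN, FAIR_CLOSE)
schedule = [(paper, day) for day in dates for paper in NEWSPAPERS]

print(len(dates), "sampling dates;", len(schedule), "paper/date slots")
```

Under these assumptions the schedule yields the same number of slots for each newspaper, which is one simple way to keep a corpus proportionally distributed across titles and across the months of the fair.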

Figure 2: Word embedding model unveils how local newspapers relied on particular notions of race, civilization, and progress to generate discursive representations of the American colonial state. Graph built in RStudio by the author.

Methodological choices and interventions at multiple levels of the research process (the data, the code, and the analysis) were necessary to mitigate OCR errors, algorithmic bias, and limitations of the data. In the words of Sharon Leon, when it comes to humanistic inquiry, most data sets cannot “stand on their own without clear and thorough documentation that accounts for the many decision points along the way” (Leon 2019: 10-11). Further, Stéfan Sinclair and Geoffrey Rockwell have argued in favor of the interpretive responsibility of humanists and historians engaging with text analysis and quantitative methodologies. They remind us that computational tools do not produce meaning; they are rather meant to “facilitate the augmented hermeneutic cycle” (Sinclair / Rockwell 2016: 345). In this sense, without human intervention based on thorough knowledge of the input data and its historical context, the automated process of extracting named entities, for instance, would have risked misrepresenting particular geopolitical entities that participated at the fair. Beyond presenting the preliminary findings of this project, I hope to raise some of the methodological concerns regarding text mining for historical analysis that informed the scope of my research questions and the core argument of this project. I am currently collecting more data to expand the analysis and include other world’s fairs at the turn of the century.
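One form such human intervention might take is a hand-curated lookup that merges variant or OCR-damaged spellings of a placename before counting. The sketch below is illustrative only and does not reproduce the project's workflow: the variant table, the sample entity list, and the function names are all hypothetical, and the step shown is a manual normalization applied to already-extracted entities, not NER itself.

```python
from collections import Counter

# Hypothetical, hand-curated table mapping variant spellings
# (including plausible OCR errors) to one canonical placename.
CANONICAL = {
    "philippines": "Philippines",
    "philippine islands": "Philippines",
    "phillipines": "Philippines",
    "united states": "United States",
    "u.s.": "United States",
    "u. s.": "United States",
}


def normalize(entity):
    """Return the canonical placename, or None if the entity is unknown."""
    key = " ".join(entity.lower().split())  # lowercase, collapse whitespace
    return CANONICAL.get(key)


def count_placenames(entities):
    """Count canonical placenames; collect unknown strings for manual review."""
    counts = Counter()
    unresolved = []
    for entity in entities:
        canonical = normalize(entity)
        if canonical is None:
            unresolved.append(entity)
        else:
            counts[canonical] += 1
    return counts, unresolved


# Hypothetical output of an automated extraction step.
extracted = ["Philippine Islands", "U.S.", "Phillipines", "Japan", "United  States"]
counts, unresolved = count_placenames(extracted)
print(counts)
print(unresolved)
```

Keeping unresolved strings in a separate list, instead of discarding them, leaves a record of the decision points that a researcher with knowledge of the historical context still has to settle by hand.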

Appendix A

Bibliography
  1. Afable, Patricia O. (2004) “Journeys from Bontoc to the Western Fairs, 1904-1915: The ‘Nikimalika’ and Their Interpreters,” in: Philippine Studies 52, 4: 445-473.
  2. Allwood, John. (1977) The Great Exhibitions. London: Studio Vista.
  3. Anderson, Benedict. (2006) Imagined Communities: Reflections on the Origin and Spread of Nationalism. New York: Verso.
  4. Arnold, Taylor / Tilton, Lauren. (2015) Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text. Quantitative Methods in the Humanities and Social Sciences. Heidelberg: Springer International Publishing. DOI: 10.1007/978-3-319-20702-5.
  5. Bender, Thomas. (2006) A Nation among Nations: America’s Place in World History. New York: Hill and Wang.
  6. Benedict, Burton et al. (1983) The Anthropology of World’s Fairs: San Francisco’s Panama-Pacific International Exposition, 1915. Berkeley: Scolar Press.
  7. Blevins, Cameron. (2014) “Space, Nation, and the Triumph of Region: A View of the World from Houston,” in: Journal of American History 101, 1: 122–147. DOI: 10.1093/jahist/jau184.
  8. Blount, James H. (1973) The American Occupation of the Philippines, 1898-1912. First published in 1912. New York: Oriole Editions.
  9. Clevenger, Martha. (1996) Indescribably Grand: Diaries and Letters from the 1904 World’s Fair. St. Louis: Missouri Historical Society Press.
  10. Douglas, Susan J. (1989) Inventing American broadcasting, 1899-1922. Baltimore: Johns Hopkins University Press.
  11. Findling, John / Pelle, Kimberly. (1990) Historical Dictionary of World’s Fairs and Expositions, 1851-1988. Westport: Greenwood Press.
  12. Gilbert, James. (2009) Whose Fair? Experience, memory, and the history of the great St. Louis Exposition. Chicago: The University of Chicago Press.
  13. Go, Julian / Foster, Anne L. (eds.) (2003) The American Colonial State in the Philippines: Global Perspectives. American Encounters/Global Interactions. Durham: Duke University Press.
  14. Greenhalgh, Paul. (1988) Ephemeral Vistas: The Expositions Universelles, Great Exhibitions and World’s Fairs, 1851-1939. Manchester: Manchester University Press.
  15. Grunder, Garel A. / Livezey, William E. (1951) The Philippines and the United States. Westport: Greenwood Press, Publishers.
  16. Guldi, Jo. (2023) The Dangerous Art of Text Mining: A Methodology for Digital History. Cambridge University Press.
  17. Hart, Jim Allee. (1961) A history of the St. Louis Globe-Democrat. Columbia: University of Missouri Press.
  18. Hullman, Jessica / Diakopoulos, Nicholas. (2011) “Visualization Rhetoric: Framing Effects in Narrative Visualization,” in: IEEE Transactions on Visualization and Computer Graphics 17, 12: 2231-2240.
  19. Jurafsky, Daniel / Martin, James H. (2008) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2nd edition. Upper Saddle River: Prentice Hall.
  20. Klein, Lauren F. et al. (2021) “Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers” in: Journal of Cultural Analytics 6, 1: 1-43.
  21. Kramer, Paul A. (2006) The Blood of Government: Race, Empire, the United States, & the Philippines. Chapel Hill, NC: University of North Carolina Press.
  22. Leon, Sharon. (2019) “The Peril and Promise of Historians as Data Creators: Perspective, Structure, and the Problem of Representation,” in: [Bracket] (blog) https://www.6floors.org/bracket/2019/11/24/the-peril-and-promise-of-historians-as-data-creators-perspective-structure-and-the-problem-of-representation/ [06.13.2024]
  23. Ehrmann, Maud et al. (2023) “Named Entity Recognition and Classification on Historical Documents: A Survey,” in: ACM Computing Surveys 56, 2: 1-47.
  24. Moretti, Franco. (2013) Distant Reading. London; New York: Verso.
  25. Rafael, Vicente L. (2000) White Love: And Other Events in Filipino History. American Encounters/Global Interactions. Durham: Duke University Press.
  26. Rosenberg, Emily S. / Fitzpatrick, Shanon (eds.) (2014) Body and Nation: The global realm of U.S. body politics in the twentieth century. Durham; London: Duke University Press.
  27. Ross, Charles G. (1949) The story of the St. Louis Post-Dispatch. St. Louis: Post-Dispatch?.
  28. Ryan, Yann. (2021) A Short Guide to Historical Newspaper Data, Using R. <https://bookdown.org/yann_ryan/r-for-newspaper-data/#> [06.13.2024]
  29. Rydell, Robert W. (2002) All the World’s a Fair: Visions of Empire at American International Expositions, 1876-1916. Chicago: University of Chicago Press.
  30. Rydell, Robert W. et al. (2000) Fair America: World’s Fairs in the United States. Washington, D.C.: Smithsonian Institution Press.
  31. Schmidt, Benjamin. (2015) “Vector Space Models for the Digital Humanities” <https://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html> [06.13.2024]
  32. Schudson, Michael. (1981) Discovering the News: A Social History of American Newspapers. New York, NY: Basic Books.
  33. Sinclair, Stéfan / Rockwell, Geoffrey. (2016) “Text Analysis and Visualization: Making Meaning Count,” in: Schreibman, Susan et al. (eds.): A New Companion to Digital Humanities. Hoboken: John Wiley & Sons, Ltd.
  34. Smith, David A. et al. (2015) “Computational Methods for Uncovering Reprinted Texts in Antebellum Newspapers,” in: American Literary History 27, 3: E1-15.
  35. Vergara, Benito Manalo. (1995) Displaying Filipinos: Photography and Colonialism in Early 20th Century Philippines. Quezon City: University of the Philippines Press.
  36. Zelizer, Barbie (ed.). (2008) Explorations in Communication and History. Shaping Inquiry in Culture, Communication and Media Studies. London; New York: Routledge.
Notes
1.

Here, the understanding of newspapers as mediators that both influence and are informed by competing values, attitudes, and ideology in the discursive dimension relies on the work of Susan J. Douglas. The mediation resulted in new discursive representations of the United States and the world, and as per Benedict Anderson’s framework, contributed to shaping “imagined communities” and their modern geography.

2.

This project relies on Cameron Blevins’ terminology and framework to understand how newspapers “print, and thereby privilege, certain places over others.” Blevins relied on Henri Lefebvre’s notion of space as a social construct and Edward Said’s idea of “imaginative geographies.” He also took into consideration Benedict Anderson’s work on imagined communities.

Lucas Avelar (ldavela@clemson.edu), Clemson University, United States of America