Discovering Pan-ecologicalness: A Data-driven Exploration of Ecological Discourse Transformation in 20th Century US Novel Corpus

There has been abundant research on the concept (Levin, 2011) and chronology (Westling, 2014) of ecoliterature after the 1960s. Puchner (2022) combines ecocriticism with the analysis of earlier literary history, suggesting that the boundary of ecological discourse extended to other novelistic genres. In this project, we analyze the patterns and transformation of ecological discourse in the 20th century through the lens of computational criticism, aiming to demonstrate how ecological discourse entered the subgenres and to pinpoint where they are consistent and where they deviate.

Methodology

It has long been noted that criticism and fiction writing interacted increasingly during the 20th century (Eagleton, 2008). We designed the experiment to reveal this interaction via computational methods, with access to the US Novel Corpus constructed by the Textual Optics Lab of the University of Chicago*. The procedures (see Figure 1) are divided into two sections: exploring the interaction between ecocriticism and fiction writing, and revealing the transformation of ecological discourse.

Figure 1. Methodology procedures

The first section consists of four steps. First, five corpora (ecofiction, mystery, science fiction, gothic, and ecocriticism articles) were constructed, totaling 11,860,863 words. Each fiction corpus contained 50 representative novels of its subgenre, selected on the basis of The Norton Book of Nature Writing and the Cambridge Companion to Literature series, while the ecocriticism corpus held 60 articles from Keywords for Environmental Studies. Second, 556 words were selected as the lexical representation of ecological discourse via word vectors: the six words with the highest TF-IDF scores in the ecocriticism corpus (agrarian, environmental, urban, ecology, humanity and nature) were chosen as “seed words” and used to extract semantically related words from the corpora. Third, correlations among the TF-IDF sequences of the five corpora were examined, with each sequence representing the distinctive profile of ecological discourse in a given style. Fourth, the top 50 TF-IDF words in each sequence were analyzed to explore how the affinity arose.
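The seed-word expansion in the second step can be illustrated with a minimal Python sketch. It assumes that each word already has a vector and that a word joins the cohort when its cosine similarity to any seed reaches a cutoff. The toy vectors, the 0.8 threshold, and the function names are hypothetical; the abstract does not specify the embedding model or the similarity criterion actually used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def expand_seeds(vectors, seeds, threshold=0.8):
    """Return the seeds plus every word similar enough to at least one seed.

    vectors: dict mapping word -> list of floats (assumed precomputed).
    threshold: similarity cutoff; 0.8 is an illustrative assumption.
    """
    known_seeds = [s for s in seeds if s in vectors]
    cohort = set(known_seeds)
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            cohort.add(word)
    return sorted(cohort)

# Toy 3-dimensional vectors, invented for illustration only.
toy_vectors = {
    "nature": [0.9, 0.1, 0.0],
    "ecology": [0.8, 0.2, 0.1],
    "forest": [0.85, 0.15, 0.05],
    "detective": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(expand_seeds(toy_vectors, ["nature", "ecology"]))
```

In practice the vectors would come from a model trained on the corpora, and the cutoff would be tuned until the cohort reaches a workable size.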

The second section consists of three steps. First, a formula was constructed to measure the ecologicalness of fiction writing. Second, the ecologicalness of 7,641 20th-century US novels was calculated via the formula. We not only measured changes in the overall trend of ecological discourse across the 20th century, but also scrutinized the texts that caused deviation, so as to extract the words and paragraphs where the deviation occurred. Third, the word cohort was refined based on the results of the correlation test, and the ecologicalness of 20th-century US novels was compared under the different word cohorts.

Results

First, the correlations among the TF-IDF sequences of ecocriticism articles, ecofiction, mystery, science fiction and gothic are shown in Figure 2.

Figure 2. Correlations among ecocriticism articles, ecofiction, mystery, science fiction and gothic

Figure 2 shows that the TF-IDF sequences of ecofiction (0.147), mystery (0.137) and science fiction (0.118) correlate significantly with that of ecocriticism, while gothic (0.040) shows no significant correlation. This supports the view that the 556-word cohort is a viable representation of ecological discourse. However, ecofiction leads mystery and science fiction by only about 0.01-0.03, and the correlations among the subgenres themselves exceed the correlation between ecofiction and ecocriticism. The results suggest that there may be a literary pattern, which we define as “Pan-ecologicalness”, shared among ecocriticism, ecofiction, mystery and science fiction writing, and strong enough to transcend boundaries between genres and styles.
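To make the correlation step concrete, the following Python sketch computes one TF-IDF sequence per corpus over a fixed word cohort and then correlates two sequences. It is a sketch under stated assumptions, not the procedure used in the study: each corpus is treated as a single document, the IDF is smoothed so that words found in every corpus keep a nonzero weight, and Pearson's coefficient is used because the abstract does not name the correlation measure. All function names and the toy corpora are hypothetical.

```python
import math
from collections import Counter

def tfidf_sequences(corpora, cohort):
    """Map each corpus name to a list of TF-IDF scores, one per cohort word.

    corpora: dict mapping name -> list of tokens (one document per corpus).
    cohort: ordered list of words defining the positions in each sequence.
    """
    n_docs = len(corpora)
    counts = {name: Counter(tokens) for name, tokens in corpora.items()}
    sequences = {}
    for name, tokens in corpora.items():
        total = len(tokens) or 1
        seq = []
        for word in cohort:
            tf = counts[name][word] / total
            df = sum(1 for c in counts.values() if c[word] > 0)
            # Smoothed IDF; the smoothing scheme is an assumption.
            idf = math.log((1 + n_docs) / (1 + df)) + 1
            seq.append(tf * idf)
        sequences[name] = seq
    return sequences

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Toy corpora, invented for illustration only.
toy_corpora = {
    "ecocriticism": "nature ecology nature climate".split(),
    "ecofiction": "nature river climate river".split(),
    "gothic": "castle ghost nature castle".split(),
}
toy_cohort = ["nature", "ecology", "climate"]

if __name__ == "__main__":
    seqs = tfidf_sequences(toy_corpora, toy_cohort)
    for name in ("ecofiction", "gothic"):
        print(name, round(pearson(seqs["ecocriticism"], seqs[name]), 3))
```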

Second, to explore “Pan-ecologicalness”, we analyzed the top 50 TF-IDF words in each group (see Figure 3) to examine the diction used by the different subgenres (Chen, 2020).

Figure 3. Top 30 TF-IDF words in each subgenre

The result shows that all subgenres share vocabulary of human reason (e.g., human, knowledge, sense, reason), social constructs (e.g., society, history), signifiers of nature (e.g., nature, world, earth) and ecological ethics (e.g., power, effect, action, equally). Apart from common novelistic diction, ecological discourse shares a grander perception of the world (e.g., development, relationship, conservation) while also focusing on various aspects of ecology (e.g., creature, biology, specie, climate, origin), bridging the gap between human society and the nonhuman environment. The findings imply that a general concern for mankind and its biological surroundings can be detected at the sociocultural level, which is the core of contemporary ecoliterature (Buell, 2011). It can therefore be inferred that the constituent genres of ecological discourse display a distinctive pattern of “Pan-ecologicalness” by sharing a lexicon and thus shaping modern ecological ethics.

Third, the “Ecologicalness” formula (see Figure 4) uses the type-token ratio (Ma et al., 2019) to estimate the occurrence and richness of the ecological lexicon, defined as “unique words” (Sinykin et al., 2019), so as to represent both frequency and semantic patterns.

Figure 4. “Ecologicalness” formula
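Because the formula itself appears only in Figure 4, the Python sketch below does not reproduce it. It merely illustrates the two ingredients the text names, under our own assumptions: occurrence is taken as the share of a novel's tokens that belong to the ecological word cohort, and richness as the type-token ratio within those ecological tokens. The function name, the definitions, and the decision to report the two values separately rather than combine them are all hypothetical.

```python
def ecological_components(tokens, cohort):
    """Return (occurrence, richness) for one tokenized text.

    occurrence: ecological tokens / all tokens (assumed definition).
    richness: distinct ecological words / ecological tokens,
              i.e. a type-token ratio over the cohort hits (assumed definition).
    """
    cohort = set(cohort)
    eco_tokens = [t for t in tokens if t in cohort]
    occurrence = len(eco_tokens) / len(tokens) if tokens else 0.0
    richness = len(set(eco_tokens)) / len(eco_tokens) if eco_tokens else 0.0
    return occurrence, richness

if __name__ == "__main__":
    # Toy sentence and cohort, invented for illustration only.
    sample = "the nature of the earth and nature".split()
    print(ecological_components(sample, {"nature", "earth", "climate"}))
```

Any single score would have to weight these two components in some way; that weighting is defined by the formula in Figure 4 and is not guessed at here.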

Fourth, the ecologicalness of each individual novel was measured via the formula. In addition to the overall fitted trend, we calculated the average ecologicalness and standard deviation for each year to pin down outliers within the corpus. The scatter plot and the variance are illustrated in Figures 5 and 6 respectively.
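The yearly statistics and the outlier boundaries can be sketched in Python as follows. This is a minimal illustration assuming that a novel counts as an outlier when its score lies more than k standard deviations from the mean of its publication year. The value k = 2, the use of the population standard deviation, the record layout, and the function name are assumptions; the abstract does not state the boundary actually used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def yearly_outliers(records, k=2.0):
    """Compute per-year (mean, std) and flag novels far from their year's mean.

    records: list of (title, year, score) tuples.
    k: width of the boundary in standard deviations (illustrative assumption).
    Returns (stats, outliers), where stats maps year -> (mean, std).
    """
    by_year = defaultdict(list)
    for _title, year, score in records:
        by_year[year].append(score)
    stats = {year: (mean(s), pstdev(s)) for year, s in by_year.items()}
    outliers = [
        (title, year, score)
        for title, year, score in records
        if stats[year][1] > 0
        and abs(score - stats[year][0]) > k * stats[year][1]
    ]
    return stats, outliers

if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    toy = [(f"novel_{i}", 1960, 0.1) for i in range(9)] + [("eco_novel", 1960, 0.9)]
    stats, flagged = yearly_outliers(toy)
    print(stats[1960], [title for title, _, _ in flagged])
```

The per-year variance discussed alongside Figure 6 is simply the square of the standard deviation returned in `stats`.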

Figure 5. Ecologicalness scatters, overall fitting trend and boundaries of deviation in 20th century US novels

Figure 5 shows that although the overall ecologicalness of 20th-century novels decreases over the period, the outliers with exceptionally high ecologicalness show the opposite trend, rising dramatically after 1960. The result suggests that the gap between a small number of highly ecological novels and the rest of the texts widens over the period, forming a strong ecological divergence within 20th-century US novels.

Figure 6. Variance of ecologicalness of 20th century US novels

Figure 6 illustrates that the variance of annual ecologicalness decreased throughout the 20th century. The decline in variance shows that the ecologicalness of individual novels gradually approaches the annual average, suggesting that genre stability increases over the period.

Finally, to cross-validate the results and to better understand “Pan-ecologicalness”, ecologicalness was recalculated using the refined word cohort. The texts were ranked in descending order under both word cohorts, and the disparity between the two rankings was plotted as a scatter plot (see Figure 7).

Figure 7. Disparity of text ranking using Ecologicalness and Pan-ecologicalness word cohorts

It can be inferred that reducing the word cohort has the greatest effect on novels ranked at around 4000 to 5000, around the median of 20th-century US novels. This is understandable because this peripheral ecological vocabulary constitutes the boundary between ecological discourse and other related subgenres, making it less noteworthy in texts with high ecologicalness and more prevalent in median texts that are only mildly related. Close reading of these texts yields a deeper understanding of the lexical and semantic constituents of “Pan-ecologicalness”.
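The ranking comparison can be sketched in Python as below. The sketch assumes that each novel has one score under the full word cohort and one under the refined cohort, ranks the novels in descending order under each, and reports the rank shift per novel. The function names, the tie-breaking by title, and the toy scores are hypothetical.

```python
def rank_descending(scores):
    """Map each title to its 1-based rank, highest score first.

    Ties are broken alphabetically by title (an arbitrary assumption).
    """
    ordered = sorted(scores, key=lambda title: (-scores[title], title))
    return {title: position + 1 for position, title in enumerate(ordered)}

def rank_disparity(full_scores, refined_scores):
    """Return refined rank minus full rank for every title scored under both cohorts."""
    full_rank = rank_descending(full_scores)
    refined_rank = rank_descending(refined_scores)
    return {
        title: refined_rank[title] - full_rank[title]
        for title in full_rank
        if title in refined_rank
    }

if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    full = {"A": 0.9, "B": 0.5, "C": 0.1}
    refined = {"A": 0.8, "B": 0.1, "C": 0.3}
    print(rank_disparity(full, refined))
```

Plotting each novel's full-cohort rank against the absolute value of its shift would give a scatter of the kind described for Figure 7.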

The “Pan-ecologicalness” discovered in ecofiction, mystery and science fiction suggests that ecological discourse interacts not merely with ecofiction, but with a wider range of literary writing. It offers a new perspective for describing complex, blurry literary relationships in the 20th century, and challenges our thinking on how, where and why the concept circulates (Long and So, 2016). In future studies, we hope to explore why ecocriticism turns out to be more influential than counterparts such as mystery and science fiction criticism, despite the stability in the life cycles of all three genres (Underwood, 2016).

Acknowledgement *

The authors are very grateful to Professor Hoyt Long at the Textual Optics Lab of the University of Chicago, who encouraged us to pursue this research and authorized access to the US Novel Corpus.

Buell, Lawrence (2011): “Ecocriticism: some emerging trends”, in: Qui Parle 19, 2: 87-115.

Chen, Song (2020): “Writing for local government schools: authors and themes in Song-dynasty school inscriptions”, in: Journal of Chinese History 4, 2: 305-346.

Clark, Timothy (2011): The Cambridge Introduction to Literature and the Environment. Cambridge: Cambridge University Press.

Eagleton, Terry (2008): Literary Theory: An Introduction. Minneapolis: University of Minnesota Press.

Levin, Johnathan (2011): “Contemporary Ecofiction”, in: Cassuto, L. / Eby, C. V. / Reiss, B. (eds.): The Cambridge History of the American Novel. Cambridge: Cambridge University Press 1122-1136.

Long, Hoyt / So, Richard Jean (2016): “Literary pattern recognition: modernism between close reading and machine learning”, in: Critical Inquiry 42, 2: 235-267.

Ma, Chuangxin / Liang, Shehui / Chen, Xiaohe (2019): “Study on correlation coefficient and characteristic words of the Pre-Qin schools”, in: Journal of Chinese Information Processing 33, 12: 129-134.

McGurl, Mark (2011): “The Novel, Mass Culture, Mass Media”, in: Cassuto, L. / Eby, C. V. / Reiss, B. (eds.): The Cambridge History of the American Novel. Cambridge: Cambridge University Press 686-699.

Puchner, Martin (2022): “Preamble: Literature for a Changing Planet”, in: Puchner, M. (ed.): Literature for a Changing Planet. Princeton: Princeton University Press 1-11.

Rigby, Kate (2011): “Confronting Catastrophe: Ecocriticism in a Warming World”, in: Westling, L. (ed.): The Cambridge Companion to Literature and the Environment. Cambridge: Cambridge University Press 212-225.

Sinykin, Daniel / So, Richard Jean / Young, Jessica (2019): “Economics, race and the postwar US novel: A Quantitative Literary History”, in: American Literary History 31, 4: 775-804.

Underwood, Ted (2016): “The Life Cycles of Genres”, in: Cultural Analytics: 1-25, <https://culturalanalytics.org/article/11061.pdf> [10.22148/16.005].

Westling, Louise (2013): The Cambridge Companion to Literature and the Environment. Cambridge: Cambridge University Press.

Jiying Kang (kangjy21@mails.tsinghua.edu.cn), Tsinghua University, China and Wei Zhao (xiye1002@163.com), Chinese Academy of Social Sciences, China and Yufeng Han (hanyf21@mails.tsinghua.edu.cn), Tsinghua University, China