Unwrapping the Past: Digital Tools for Studying Hidden Texts
Chair: Ireland, Katherine

Document Classification to Select Appropriate Dictionaries for Morphological Analysis of Japanese Old Documents

Usui, Hisao (1); Komiya, Kanako (1); Ogiso, Toshinobu (2)

1: Tokyo University of Agriculture and Technology, Japan; 2: National Institute for Japanese Language and Linguistics, Japan

When analysing an unknown old text, even experts sometimes find it difficult to determine the period from which the document dates. For such cases, we propose using a document classification system to select an appropriate dictionary for morphological analysis. We show that the proposed method significantly improves the performance of morphological analysis.


CATMuS - Medieval: Consistent Approaches to Transcribing ManuScripts: A generalized set of guidelines and models for Latin scripts from Middle Ages (8th-16th)

Pinche, Ariane (1,11); Clérice, Thibault (2); Chagué, Alix (2,4,6); Camps, Jean-Baptiste (3,7); Vlachou-Efstathiou, Malamatenia (4); Gille Levenson, Matthias (1,3); Brisville-Fertin, Olivier (1,5); Boschetti, Federico (8,9); Fischer, Franz (9); Gervers, Michael (10); Boutreux, Agnès (10); Manton, Avery (10); Gabay, Simon (12); O'Connor, Patricia (3); Haverals, Wouter (13); Kestemont, Mike (14); Vandyck, Caroline (14)

1: CIHAM–UMR 5648, Lyon, France; 2: ALMAnaCH - Automatic Language Modelling and Analysis & Computational Humanities, Inria, Paris, France; 3: ÉNC - École nationale des chartes, Paris, France; 4: EPHE - École Pratique des Hautes Études, Paris, France; 5: ENS de Lyon, France; 6: UdeM - Université de Montréal, Montréal, Canada; 7: CJM - Centre Jean Mabillon, Paris, France; 8: ILC-CNR; 9: VeDPH - Venice Centre for Digital and Public Humanities, Ca’ Foscari, Venice, Italy; 10: UToronto - Department of History, University of Toronto, Canada; 11: CNRS, France; 12: UNIGE - Université de Genève, Switzerland; 13: Princeton University, New Jersey; 14: Antwerp University, Belgium

The Consistent Approaches to Transcribing Manuscripts (CATMuS) aims to merge transcription practices for HTR across documents from the Middle Ages to contemporary times, in order to help design general model(s). This paper focuses specifically on the medieval period and on early print from the 15th and 16th centuries (incunabula and print in gothic typefaces).


Digital Humanities Approach to Comparing Tang and Song Poetry: Revealing Thematic Evolution Through Multiple Runs of LDA Topic Modeling

Lee, MinHeng; Liu, Chao-Lin; Shen, Hsin-Po

National Chengchi University, Taiwan

This study applies LDA topic modeling to 'Quan Tang Shi' and 'Quan Song Shi' to explore thematic differences in Chinese poetry. Using digital humanities techniques, we reveal distinct themes that reflect the spiritual pursuits of Tang poetry and the societal focus of Song poetry, enhancing our understanding of poetic evolution through computational analysis.