Layout Analysis Dataset with Segmonto
Clérice, Thibault; Janès, Juliette; Scheithauer, Hugo; Bénière, Sarah; Romary, Laurent; Sagot, Benoit
Inria, France
This paper introduces semantic guidelines for layout analysis, which advance text structuration beyond sequential denoising, and a benchmark image set for training data extraction models. A detailed description of the guidelines is planned; in the meantime, the paper previews their potential and performance using 3000 training samples.
Text2MapAnnotations: An Automatic Framework of Generating Map Annotations from Textual Descriptions
Shao, Hanning (1); Yuan, Xiaoru (1,2)
1: Key Laboratory of Machine Perception (Ministry of Education), and School of AI, Peking University, China; 2: National Engineering Laboratory for Big Data Analysis Technology and Application, Peking University, China
In this work, we propose a general framework for crafting annotated maps. Based on this framework, we further propose an automatic method that generates map annotations from given textual descriptions by leveraging large language models.
The Application of Large Language Models and Prompt Engineering in the Recognition of Geographic Entities in Ming Shi Lu
Ou, You-Chen (2); Chan, Ya-Chi (1,3); Tsai*, Richard Tzong-Han (1,2)
1: Center for GIS, Research Center for Humanities and Social Sciences, Academia Sinica, Taiwan; 2: Department of Computer Science and Information Engineering, National Central University, Taiwan; 3: Institute for Sustainable Heritage, University College London, United Kingdom
This study applies Large Language Models to Named Entity Recognition in classical Chinese, focusing on the Ming Shi Lu corpus. It incorporates specialized prompt engineering and knowledge-based correction, demonstrating GPT-4's effectiveness in digital text analysis, a contribution to the field of digital humanities.