Legal Aspects of Generative AI and their impact on Digital Humanities

At least since the launch of ChatGPT in November 2022, Artificial Intelligence (AI), and generative AI in particular, has been at the very center of attention for research and pedagogical activities, including in the DH community. AI tools are being used to generate multimedia and multimodal content, ranging from plain text to images, videos and code. AI-generated outputs are increasingly difficult to distinguish from human-created content and are therefore of increasing commercial value, sometimes entering into (unfair) competition with human creators. AI-generated data are also used in research, both as a subject of study and, at times, as a substitute for human-generated data.
At the same time, the research and education community is struggling with how and to what extent AI can and may be used responsibly to support and improve our work, and how the contributions of AI can and must be labelled to ensure good scientific practice.

However, generative AI does not fit into the existing legal framework, which causes unavoidable friction between its stakeholders, developers and users. Efforts to regulate AI have been launched on both sides of the Atlantic. In the European Union, for example, the AI Act has been proposed and will probably be applicable legislation by the time this panel takes place. In the US and Canada, a series of public consultations on the issue has been launched, which may lead to revisions of copyright law and other rules.

This panel is intended to inform the DH community about the latest developments in the field of law as applied to AI, and to discuss their potential impact on the humanities, especially with regard to governance and the responsible use of AI in research and education. The panel features experts from both sides of the Atlantic, whose presentations will address issues of international relevance.

1. Kim Nayyer: AI, big data, and knowledge appropriation and protection

The use of AI tools and processes in DH explorations raises complex legal uncertainties. Potential legal minefields may implicate DH activities both as outputs of AI processes and as inputs in the creation and use of AI tools. Governments have invested in examining these issues to surface practical concerns and to identify potential solutions such as regulatory measures or industry-facilitated guidelines. This paper explores some of the legal issues raised, to greater and lesser extents, by regulators, researchers, scholars, technology providers, and others. The paper focuses on legal propositions of potential interest to DH researchers and on the consultation and policy processes launched in the U.S. and Canada. These include the legal foundations of infringement assertions where copyright-protected works may form part of the training data for AI models or for the machine learning tools and processes that DH researchers use in their research. Likewise, researchers may generate text, images, or other data that may, without the researcher's knowledge or agreement, contribute to the development of LLMs or diffusion models used for a range of output generation. Less discussed, and equally concerning, is that researchers' iterative and knowledge-driven use of various tools may contribute to the building and refinement of the AI processes underlying them. This paper examines questions about copyright and about uses and processes that entail data and knowledge appropriation less easily describable in existing intellectual property categories. The paper explores these questions with reference to the disparate equities and power positions in current conversations about copyright subsistence, protection, and fair dealing, against a backdrop of digital capitalism and data commodification.

2. Walter Scholger: Assessing the Impact of the EU AI Act on Digital Research and Education

As research and education - in and beyond the Digital Humanities - increasingly rely on digital sources, methods and publications, the European Union AI Act will emerge as a pivotal regulatory framework with profound impact. 

Since the application of AI in education is considered a "high-risk" scenario under the framework proposed by the AI Act (COM/2021/206), assessing the Act's effects on digital teaching methodologies and its influence on the development and deployment of AI tools for educational purposes will be a key focus. Restrictions and regulations will address quality assurance and good academic practice, but may also impact the creative and innovative application of AI systems. Beyond legal regulations, the AI Act and related governance efforts - e.g. the Council of Europe Framework Convention on Artificial Intelligence and Human Rights, Democracy and the Rule of Law (CM(2024)52) - will address ethical, societal and philosophical questions.

The Act's provisions and stipulations concerning research ethics, data protection and ownership, and intellectual property rights will be examined - especially in light of the seemingly broadly formulated exemption from prohibitions for research, as long as it is conducted "for legitimate purposes" and "in accordance with recognized ethical standards for scientific research" (COM/2021/206).

3. Paweł Kamocki: AI and the presumption of authorship

The presumption of authorship (Art. 15 of the Berne Convention) is a fundamental principle of copyright law. According to this presumption, the person whose name is indicated on a work "in the usual manner" is, in the absence of proof to the contrary, to be regarded as its author. The aim of this principle is to facilitate proof of authorship, which would often otherwise be difficult, and to place authors in a favorable position in copyright infringement lawsuits by shifting the burden of proof of authorship onto the defendant.

However, the presumption produces rather undesirable effects when applied to texts generated by AI tools, especially when combined with a presumption of originality. Such texts are virtually indistinguishable from human-written texts. Because they lack human authorship, they should not be protected by copyright, as they fail to meet the originality criterion. Nevertheless, it is enough for a user of such a tool to sign his or her name on the output to benefit from a strong presumption of authorship, and thus be enabled to institute copyright infringement proceedings.

This presentation will discuss the practical implications of the presumption of authorship applied to AI-generated works, as well as the ethical and legal consequences of presenting these works in a way that may lead the audience to believe that they were in fact created by a human author (e.g., by signing them with one's name). Furthermore, the presentation will discuss potential solutions to the problem and their impact on the publishing and research sectors.

4. Koraljka Kuzman Šlogar: Above the Law: Ethical Imperatives in the Era of AI-Generated Content

In the era of global information interconnectivity, technological advancements in one country can have broader implications worldwide, and this is particularly evident and "tangible" in the field of AI development. Crucially, AI-generated content can have significant societal consequences, including influencing politics, public opinion, and culture. Biased algorithms have the potential to exacerbate existing social inequalities and biases, while, on the other hand, AI can also help address and resolve inequalities and disadvantages faced by its users. At the other end of the spectrum of ethical considerations, we can pose the question of authorship in the context of artificial intelligence. AI tools are increasingly involved in the content creation process, giving rise to numerous ethical questions that touch the core of creativity, authenticity, and researcher responsibility.

Laws are being put in place to regulate the legal aspects of using AI, but it is questionable whether they can cover all of the ethical challenges that arise. Therefore, the digital humanities community, by developing and promoting shared global ethical guidelines, has the opportunity (and responsibility) to help define standards of governance and conduct that contribute to shaping a positive and sustainable impact of AI-generated content on our society.

Another important consideration in the disciplinary field of Digital Humanities concerns the Act's provisions on data ownership, transparency, and accountability, with reference to their potential impact on collaborative research initiatives and cross-border developments and exchanges (especially with non-EU countries). The presentation will provide an overview of the European Union's AI regulations and attempt to describe their practical ramifications for Digital Humanities research and education.

Walter Scholger (walter.scholger@uni-graz.at), University of Graz, Austria; Paweł Kamocki (kamocki@ids-mannheim.de), Leibniz Institute for the German Language, Germany; Koraljka Kuzman Šlogar (koraljka@ief.hr), Institute of Ethnology and Folklore Research, Croatia; Kim Paula Nayyer (kpn32@cornell.edu), Cornell Law School, USA