The Fashion Calendar Research Database: Digitizing and Parsing 70 Years of American Fashion

Abstract: This paper will report on techniques and critical methodologies used in the Fashion Institute of Technology’s recently launched Fashion Calendar Research Database. The project is an innovative open-source research tool that enables experimental approaches to the understanding and study of the fashion and creative industries throughout the twentieth and early twenty-first centuries.

Paper proposal:

In November 2023, the Fashion Institute of Technology, State University of New York (FIT) launched the Fashion Calendar Research Database (FCRD), an open-source research database built from the digitization of three publications published by Ruth Finley (1920-2018). The digital humanities project integrates tools that enhance the discoverability and accessibility of, and quantify, seventy years of event-based data about the fashion and creative industries throughout the twentieth and early twenty-first centuries. The launch marked the culmination of a three-year digitization and digital humanities project titled “The Ruth Finley Collection: Digitizing 70 Years of the Fashion Calendar.” The project was supported by a “Digitizing Hidden Collections” grant from the Council on Library and Information Resources (CLIR), a grant program made possible by funding from the Mellon Foundation. This presentation will discuss the project design, report on technological challenges, and examine the development of code using artificial intelligence and machine learning, along with the potential biases and margins of error that arose, in combination with library science theories and methodologies such as critical cataloging.

The Ruth Finley Collection

The Ruth Finley Collection was donated to Special Collections and College Archives (SPARC) at the Gladys Marcus Library at FIT in 2015 and comprises three trade publications: Fashion Calendar (1941-2014), Home Furnishings Calendar (1947-1951) and Fashion International (1973-2008). The most noted of the three is Fashion Calendar, the central calendar and scheduling service for the American fashion and creative industries for over seven decades. When the semi-annual national fashion market weeks (New York Fashion Week, NYFW) were centralized by the Council of Fashion Designers of America (CFDA) in 1993, Fashion Calendar became the official calendar of NYFW. After her retirement in 2014, Finley sold the publication and the rights to schedule NYFW to the CFDA, which continues to run the CFDA Fashion Calendar, now digital only and published exclusively for the market week events.

Finley’s Fashion Calendar was a serialized weekly until 1983 and was published twice monthly thereafter. The collection includes almost 3,000 issues and amounts to upwards of 39,000 pages of material in a consistent text-based format. The Fashion Calendar is a particularly valuable resource for its impact, longevity, and role in structuring the polyrhythmic time system of the American fashion and creative industries. Listings in the calendar were submitted by subscribers and non-subscribers alike and were aimed at the press and at industry insiders. They reflected the many adjacent American industries, such as retailing, footwear, cosmetics, millinery, menswear, textiles and more. The Fashion Calendar also included the international fashion week calendars and trade events taking place around the world as the fashion industry became increasingly globalized in the late twentieth century. Importantly, it included the social calendar and reflected the culture of the fashion and creative communities in the United States and abroad. Finley’s calendar took a democratic approach: it did not operate as a gatekeeper to participation but sought to reflect the fashion schedule accurately and, significantly, worked as a service to help subscribers “clear dates” and avoid conflicting events during key market and presentation weeks.

What makes Fashion Calendar and its long-time publisher Ruth Finley important is that, unlike the calendars of European and other foreign fashion industries, the American calendar was run independently. After the CFDA acquired the publication, Finley donated her collection to FIT with the hope that it would be preserved and used by students and researchers.

The Fashion Calendar Research Database

Project co-PI Natalie Nudell is the foremost expert on the American Fashion Calendar. She wrote and produced a documentary on Finley and the American fashion industry titled Calendar Girl (2020) and is the author of the forthcoming historical monograph titled In American Fashion, Ruth Finley’s Fashion Calendar (Bloomsbury Visual Arts, September 2024). In 2020, Nudell began a collaborative project with Karen Trivette, Head of SPARC at FIT (co-PI), and Joseph Anderson, Digital Initiatives Librarian, to digitize the collection. The work grew into a more complex digital humanities project that aimed to apply digital tools to quantify the information in the publication and to expand the accessibility and discoverability of marginalized and traditionally underrepresented groups and their participation over time. To realize these goals, the data within the calendars (the who, what, when, where, how and any descriptive information) needed to be extracted and parsed so that additional identity attributions could be linked or tagged to the individual people or entities listed in the publication. This approach was informed by concepts of critical cataloging discussed by Emily Drabinski and by discussions about how the controlled vocabulary that the project team was creating could reflect contemporary discourses about identity, social justice, and representation (Drabinski, 2013). The project team sought to contribute to the growing body of critical digital fashion studies resources such as the Fashion and Race Database and Europeana.

After the project was awarded funding, all the issues in the collection were scanned and digitized by an experienced archival scanning vendor, and every page of the scanned material is now available in a page viewer (IIIF viewer) on the FCRD. The viewer allows users to flip through the pages of each issue, conduct full-text searches, download issues or pages, generate a bibliographic citation, and view the metadata and copyright information for each page in the collection.

The initial project design relied on standard Optical Character Recognition (OCR) software to capture the text within the publication and transform the listings in each issue into searchable data. Before the development of computer word processing programs, Fashion Calendar was typed on a typewriter and copied using a mimeograph machine. Because of this analog formatting and the overlapping columns, the OCR files were returned jumbled, which made data parsing difficult. The project team encountered numerous challenges in the database development stages, which raised theoretical and methodological questions that we were able to work through thanks to the integration of innovative AI technologies.

Because the final scanning costs came in under budget, the project team was able to divert the remaining funds, less than 10% of the overall budget, towards finding a solution to the extraction issue. The project team identified and hired a company, Explor.ai, based in Canada, to develop code to parse and extract the data in the collection. The Explor.ai team extracted text using specialized commercial OCR systems and devised an extraction method using unsupervised learning algorithms. This technique created visual clusters out of nearby elements (individual words), based on Euclidean distance. These groups could be viewed for validation. After the recursive merging of similar observations, the resulting clusters neatly divided documents into rows, columns, merged cells, etc. A classification method was then used to determine the characteristics (e.g. number of columns) of each page section in order to select the proper pre-trained model. The resulting paragraphs of connected text were then searched (mostly with simple regular-expression solutions) to structure the final information (such as dates, locations, presenters…) (Caron-Lizotte, 2022). The application of the AI code enabled almost 200,000 individual listings to be extracted and parsed with a high rate of accuracy.
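The clustering and pattern-matching steps described above can be illustrated with a minimal sketch. It is not the Explor.ai code, which is not reproduced in this proposal. It assumes OCR output in the form of words with page coordinates, substitutes a simple fixed-threshold single-linkage merge for the vendor's recursive merging, and uses invented sample words, an invented distance threshold, and hypothetical function names (cluster_words, parse_when).

```python
import math
import re
from collections import defaultdict

# Invented sample OCR output: (text, x, y) word positions in arbitrary page units.
WORDS = [
    ("Monday,", 10, 10), ("Jan.", 60, 10), ("6", 95, 10),
    ("10:00", 10, 24), ("a.m.", 50, 24),
    ("Spring", 300, 10), ("Millinery", 350, 10), ("Showing", 405, 10),
    ("Hotel", 300, 24), ("Pierre", 340, 24),
]

def cluster_words(words, max_dist):
    """Group words whose positions are chained together by Euclidean
    distances of at most max_dist (single-linkage, via union-find)."""
    parent = list(range(len(words)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            (_, x1, y1), (_, x2, y2) = words[i], words[j]
            if math.hypot(x1 - x2, y1 - y2) <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = defaultdict(list)
    for i, word in enumerate(words):
        groups[find(i)].append(word)

    # Order clusters left to right, and words in reading order (row, then column).
    ordered = sorted(groups.values(), key=lambda g: min(w[1] for w in g))
    return [" ".join(w[0] for w in sorted(g, key=lambda w: (w[2], w[1])))
            for g in ordered]

# Simple patterns for an abbreviated month plus day, and a clock time.
DATE_RE = re.compile(
    r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.?\s+(\d{1,2})\b")
TIME_RE = re.compile(r"\b(\d{1,2}:\d{2})\s*(a\.m\.|p\.m\.)", re.IGNORECASE)

def parse_when(text):
    """Pull a month, day and time out of one cluster of connected text."""
    date = DATE_RE.search(text)
    time = TIME_RE.search(text)
    return {
        "month": date.group(1) if date else None,
        "day": int(date.group(2)) if date else None,
        "time": f"{time.group(1)} {time.group(2).lower()}" if time else None,
    }

if __name__ == "__main__":
    for block in cluster_words(WORDS, max_dist=60):
        print(block, "->", parse_when(block))
```

In this toy page the two columns are more than 200 units apart, so a threshold of 60 keeps them in separate clusters. Real typewritten pages would need distances measured between word bounding boxes rather than single points, and a threshold tuned per layout.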

With the extracted data, our project team, which included trained student interns, normalized entities and individuals within the data and attributed identity information to people and organizations, including geographic location, gender, race, and sexual orientation, among others. This approach was informed by a feminist and global approach to digital humanities, which seeks to highlight the impact of groups that are unseen or underrepresented within the historical record (Earhart, 2023; Gairola, 2023; Risam, 2023). Thousands of named entities (brands, designers, etc.) were enhanced with identity and category attributions/tags based on primary and secondary source research. The project relied on several publicly available tools and applications for this portion, notably OpenRefine, Leaflet, Google Maps Platform, MongoDB Atlas, Mirador Viewer, and Chart.js. The consistent extracted and parsed data from the Fashion Calendar publications enabled the implementation of search-optimization tools, data visualizations through graphs, and mapping of the hundreds of thousands of locations and spaces listed in the publication. During the initial project phase, the project team used the Wikidata controlled vocabulary to process and interlink named entities in the data with their corresponding unique identifiers.
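One common way to normalize name variants before attributing tags is key-collision clustering, in the spirit of the fingerprint method offered by OpenRefine. The sketch below is an illustration only and is not the project's workflow: the function names (fingerprint, group_variants) and the sample name variants are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict

def fingerprint(name):
    """Reduce a name to a key that ignores case, accents, punctuation,
    word order and repeated words."""
    # Fold accented characters to their closest ASCII equivalents.
    folded = (unicodedata.normalize("NFKD", name)
              .encode("ascii", "ignore").decode("ascii"))
    # Lowercase and strip punctuation, keeping letters, digits and spaces.
    cleaned = re.sub(r"[^\w\s]", "", folded.lower())
    # Sort the unique tokens so that "Finley, Ruth" matches "Ruth Finley".
    return " ".join(sorted(set(cleaned.split())))

def group_variants(names):
    """Group spellings that share a fingerprint, for human review."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)

if __name__ == "__main__":
    # Invented variants of the kind an OCR-derived listing might contain.
    variants = ["Ruth Finley", "Finley, Ruth", "RUTH FINLEY", "Fashion Calendar"]
    for key, spellings in group_variants(variants).items():
        print(key, "<-", spellings)
```

Grouping of this kind only proposes candidates. Deciding that two spellings refer to the same person or brand, and linking the result to an external identifier or an identity attribution, still calls for the source-based research described above.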

The scanned material and the parsed data are available to users, making the database a valuable tool for pedagogical applications, including student research and digital/AI literacy education. Although the project design was realized thanks to new technologies like AI and to critical cataloging methodologies in data attribution, the project team will report on the challenges of transforming a text-based archive into a dynamic research database that enhances and transcends its original format.

The FCRD makes the entirety of the scanned material freely available; copyrighted material is available to users under a Creative Commons Attribution 4.0 license. All the data and code developed for the project are available to users on GitHub, and the data can be downloaded as JSON directly from the database. This open-source resource is maintained by the FIT Library and is hosted on FIT servers and, additionally, as a content collection on the Internet Archive.
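As an indication of how a JSON export could support quantitative questions, the following sketch counts listings per decade. The record layout is an assumption: the field names ("title", "date") and the three sample records are invented for the example and do not describe the actual FCRD schema.

```python
import json
from collections import Counter

# Invented sample records in a hypothetical layout, not real FCRD data.
SAMPLE_JSON = """
[
  {"title": "Example showing A", "date": "1947-03-10"},
  {"title": "Example showing B", "date": "1947-09-02"},
  {"title": "Example showing C", "date": "1985-11-04"}
]
"""

def listings_per_decade(records):
    """Count records by the decade of their ISO-formatted date."""
    counts = Counter()
    for record in records:
        year = int(record["date"][:4])
        counts[year - year % 10] += 1
    return dict(sorted(counts.items()))

if __name__ == "__main__":
    records = json.loads(SAMPLE_JSON)
    print(listings_per_decade(records))
```

The same pattern would extend to counts by location or by attributed category once the real field names are substituted.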

Appendix A

Bibliography
  1. Caron-Lizotte, Olivier (2022): “Description of AI Techniques for Calendar Project”, provided to the project PIs by Explor.ai for “The Ruth Finley Collection: Digitizing 70 Years of the Fashion Calendar.”
  2. Drabinski, Emily (2013): “Queering the Catalog: Queer Theory and the Politics of Correction”, in The Library Quarterly: Information, Community, Policy 83, 2: 94–111 < https://doi.org/10.1086/669547 > [29.05.2021].
  3. Drucker, Johanna (2011): “Humanities Approaches to Graphical Display”, in DHQ: Digital Humanities Quarterly 5, 1 < https://dhq-static.digitalhumanities.org/pdf/000091.pdf > [14.06.2024].
  4. Earhart, Amy E. (2023): “Feminist Digital Humanities”, in The Bloomsbury Handbook To the Digital Humanities. Edited by James O’Sullivan. London, UK: Bloomsbury Press: p. 75-82.
  5. Gairola, Rahul K. (2023): “Race, Otherness, and the Digital Humanities”, in The Bloomsbury Handbook To the Digital Humanities. Edited by James O’Sullivan. London, UK: Bloomsbury Press: p. 49-61.
  6. Risam, Roopika (2023): “Post-Colonial Digital Humanities Reconsidered”, in The Bloomsbury Handbook To the Digital Humanities. Edited by James O’Sullivan. London, UK: Bloomsbury Press: p. 41-48.
Natalie Nudell (natalie_nudell@fitnyc.edu), The Fashion Institute of Technology, United States of America and Joseph Anderson (joseph_anderson@fitnyc.edu), The Fashion Institute of Technology, United States of America