WiTTFind in Data-Limbo. Not dead and yet no longer alive







Abstract

In the summer of 2020, after a chain of unfortunate circumstances, the project group around WiTTFind, a search engine for the “Nachlass” (written heritage) of the philosopher Ludwig Wittgenstein, broke up. Not much is left of the once diverse collaborative structure, which leaves open questions about the future of the data, the applications and the web presentation.

Several factors complicate any effort to secure that future, including the problematic state of the documentation, the heterogeneous code with many isolated solutions, and the dependence on other institutions. All of this makes the project hard to maintain, let alone to complete unfinished parts of it or to develop it further. There are, of course, many reasons for this situation. The most important of them, however, is the organizational structure as an (almost) purely student project. Over the project's duration of more than 10 years, many different people therefore worked on it. Some contributed to WiTTFind as part of their course work for a single semester; a few stayed for years until they finished their studies.

Not every research project can obtain enough fully paid positions to guarantee a secure project framework. At the same time, a project like this is a unique and educational opportunity for the students involved to work on current interdisciplinary research. The fact that students do the project work does not devalue the project: the resulting tool has been and continues to be useful and important to the Wittgenstein scholarly community. In addition, WiTTFind research has been published in several articles and presented at relevant conferences (e.g. the DHd conferences).

The project produced relevant research that deserves to be preserved in the long term. This raises the question of how data from projects with a teaching focus can also be handled sustainably.

The situation was exacerbated many times over by the fact that the project came to a sudden and unplanned end. Do research projects need to have a backup plan that can be called upon in an emergency to “save what can be saved”? Is the existence of such a plan, with all that it entails — from funding to time and effort — realistic?

In our talk we explain what the current situation of the project is and how it came about. But we also want to discuss what lessons can be learned from it, and what a future for WiTTFind could look like.

1. Introduction

Ludwig Wittgenstein, despite having published only one significant book during his lifetime, is regarded as one of the most influential philosophers of the 20th century. Much of this acclaim stems from posthumous publications, compiled from his extensive "Nachlass," which includes manuscripts, typescripts, notebooks, diaries, and other materials. These writings provide invaluable insights into his philosophical development, yet access to them has historically been limited by both availability and cost. The digitization and open-access publication of the Wittgenstein Nachlass have removed significant barriers, allowing scholars and students alike to engage with Wittgenstein's work. However, access to these materials is not merely about availability; it is about usability.
WiTTFind, a search engine developed at the Centrum für Informations- und Sprachverarbeitung (CIS) at LMU Munich, emerged as a solution to this problem by making the Wittgenstein Nachlass not only accessible but also searchable. At the start of the project, the Nachlass was not yet fully open access: only 5,000 of the 20,000 pages were available. Through WiTTFind, users could sift through the philosopher’s vast body of work with ease, using computational tools such as word frequency analysis and contextual search to do research that would otherwise have been impossible. Yet, despite its success and value to the scholarly community, WiTTFind now finds itself in a precarious position. A combination of organizational, technical, and human factors has pushed the project into a state of uncertainty. This paper explores the reasons behind WiTTFind’s current situation, the challenges that led to its decline, and the lessons that can be learned from its trajectory.
 

2. What is WiTTFind?

 
WiTTFind was initiated in 2010 as a collaboration between CIS and the Wittgenstein Archive at the University of Bergen (WAB), in cooperation with other academic institutions such as the University of Cambridge. The aim of the project was to create a specialized search engine that could handle the unique characteristics of Wittgenstein’s Nachlass. With over 20,000 pages of handwritten and typed material, the Nachlass presents a considerable challenge for traditional methods of research, requiring sophisticated computational tools to parse and analyze the vast amount of text.
WiTTFind introduced two significant innovations: symmetrical autosuggestion and a lexicon-based search. Together, these features made it much easier to navigate the intricate linguistic nuances characteristic of Wittgenstein's oeuvre. One example of the project's technical accomplishments is the Double-Sided Facsimile Reader, which lets users view original manuscripts alongside their transcriptions, thus providing better context and readability. In recognition of its innovative contributions, WiTTFind received the EU Open Humanities Award in 2014 and later the Forschungspreis für exzellente Studierende from LMU. The project's work became even more relevant in 2017, when Wittgenstein's Nachlass was inscribed on the UNESCO Memory of the World Register. This underscored the global cultural importance of making these texts accessible and searchable.
The project's infrastructure was primarily based on linguistic and computational methods, including part-of-speech tagging, semantic annotation, and local grammars, in order to facilitate text searchability. The data pipeline, originating from raw XML data provided by WAB, was augmented with supplementary annotations prior to undergoing processing through a series of linguistic tools. This complex setup enabled WiTTFind to offer a diverse range of search options and computational analysis tools that made a substantial contribution to advancing Wittgenstein research.
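To make the idea of a lexicon-based search more concrete, the following is a minimal Python sketch, not WiTTFind's actual implementation. It assumes a toy XML layout (the `remark` element and its `id` attribute are invented for illustration), a tiny hand-written full-form lexicon, and invented example sentences rather than text from the Nachlass. A query lemma is expanded into its inflected forms, and every remark containing one of those forms is reported.

```python
import re
import xml.etree.ElementTree as ET

# Invented toy data: element names, ids and sentences are assumptions
# made for this sketch and do not reflect the WAB XML format.
SAMPLE_XML = """
<nachlass>
  <remark id="toy-1">Ich denke über die Sprache nach.</remark>
  <remark id="toy-2">Er dachte an das Spiel.</remark>
  <remark id="toy-3">Das Bild ist ein Modell.</remark>
</nachlass>
"""

# Hypothetical full-form lexicon: lemma -> set of inflected word forms.
LEXICON = {
    "denken": {"denken", "denke", "denkst", "denkt", "dachte", "gedacht"},
}


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def lexicon_search(root, lexicon, lemma):
    """Return (remark id, matched forms) for remarks containing any form of lemma.

    A lemma that is missing from the lexicon is searched as a literal word.
    """
    key = lemma.lower()
    forms = lexicon.get(key, {key})
    hits = []
    for remark in root.iter("remark"):
        tokens = tokenize(remark.text or "")
        matched = sorted(forms.intersection(tokens))
        if matched:
            hits.append((remark.get("id"), matched))
    return hits


root = ET.fromstring(SAMPLE_XML.strip())
for remark_id, matched in lexicon_search(root, LEXICON, "denken"):
    print(remark_id, matched)
```

In this sketch a search for "denken" also finds the remark containing "dachte", which a plain string search would miss. A real pipeline would add the part-of-speech tagging and semantic annotation mentioned above; those steps are left out here.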

3. Why Did WiTTFind End?

The sudden and untimely death of Dr. Maximilian Hadersbeck, the project’s founder, in the summer of 2020, marked a turning point for WiTTFind. Dr. Hadersbeck had been the driving force behind the project, providing leadership and technical expertise. With his passing, the project group, known as WAST (Wittgenstein Advanced Search Tools), began to disband, leaving WiTTFind in a vulnerable state. The lack of a clear successor or a structured transition plan compounded the issue, as no formal arrangements had been made for the project’s short-term or long-term future.
Several factors contributed to the disbandment of the project group. The student-driven nature of WiTTFind, while offering invaluable learning opportunities, proved to be a double-edged sword. Over the 10 years of the project’s existence, dozens of students contributed to the development of WiTTFind, many of them working on it as part of their course work. However, the transient nature of student involvement meant that the project suffered from a lack of continuity. Students would often work on WiTTFind for a semester or two, but their departure would leave critical pieces of the project unfinished or insufficiently documented. This, coupled with the absence of full-time, paid staff, left WiTTFind reliant on a constantly changing team with varying levels of expertise.
Complicating matters further was the interdisciplinary nature of the project. WiTTFind existed at the intersection of philosophy and computational linguistics, requiring close collaboration between researchers from both fields. However, this also led to tensions, as the research goals of the philosophy department and the technical goals of the computational team did not always align. While communication helped resolve many of these issues, the friction between different academic disciplines added another layer of complexity to an already challenging project structure.

4. The Challenges of a Student-Driven Project

WiTTFind’s reliance on student labor was both its strength and its weakness. On the one hand, the project provided students with hands-on experience in real-world research, allowing them to contribute meaningfully to a significant interdisciplinary endeavor. On the other hand, the high turnover of students, combined with the lack of a permanent staff, created substantial challenges for the project’s long-term sustainability. The codebase, which accumulated over a decade, became increasingly heterogeneous, as each cohort of students brought their own approaches and solutions. While the more experienced students, particularly the long-term student assistants, were able to manage the complexity by overseeing their own parts of the project, the constant flux of contributors inevitably led to issues of inconsistency and undocumented work.
One way the team mitigated these challenges was by maintaining two versions of WiTTFind: a “production” version and a “development” version. The former was a stable release that was kept free of experimental features, while the latter allowed students to test new ideas and tools. This separation helped ensure that the main site remained functional, even when student work introduced bugs or incomplete features. However, this approach was only a temporary solution to a more systemic issue: the absence of a cohesive, long-term development strategy.

5. Current State of WiTTFind: In Data-Limbo

As of 2023, WiTTFind remains online and functional, but it exists in a precarious state. The search engine is now hosted on a server provided by ITG/LRZ, and its maintenance falls on the shoulders of two former WAST members, both of whom are working on the project in their free time. While the site is operational, its future is uncertain, primarily due to several technical and organizational factors that have yet to be addressed.
The codebase, accumulated over more than a decade, is highly heterogeneous, with contributions from more than 10 different student developers. Many of the libraries and dependencies used in the project are now outdated, and there is no regular process for updating or maintaining them. In addition, the documentation is incomplete: much of the work done by students was never fully documented, leaving the current maintainers to navigate a complex and fragmented system.
Moreover, the collaborations that once supported the project, particularly with WAB and other institutions, have largely ceased. The dialogue with project partners has diminished, and questions about the legal aspects of data licensing and the future of WiTTFind’s cooperation with WAB remain unresolved. The project continues to serve its core user base, researchers who have no alternative for searching Wittgenstein’s Nachlass, but the lack of a clear long-term plan raises concerns about how much longer it can be sustained.

6. Lessons Learned and Preventive Measures

WiTTFind’s current situation highlights the need for clear organizational structures and long-term planning in interdisciplinary research projects, especially those driven by student labor. Several preventive measures could have been put in place to avoid the project’s decline. Firstly, the project would have benefited from a designated deputy structure, or at least one person who could step in to manage the project in the absence of its leader. Secondly, the project lacked centralized, accessible documentation that would have allowed new students and external collaborators to quickly understand the structure and workflow of WiTTFind. Such documentation exists for part of the code, but it is missing for the inner workings in particular: how to run the servers, how the deployment pipeline for the web services works, and which data is held in which database.
An emergency plan, including documented protocols for handling key technical and organizational aspects of the project, would have been invaluable. Such a plan could have outlined procedures for maintaining the site, updating code dependencies, and ensuring continued communication with project partners. Furthermore, establishing a long-term structural plan—one that defined clear roles, goals, and a sustainable project hierarchy involving long-term collaboration partners—could have provided the stability needed to sustain the project without relying solely on short-term student contributions.
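One way to keep such an emergency plan from silently going stale is to store it in a machine-readable form and check it automatically. The following Python sketch illustrates the idea under stated assumptions: the section names are hypothetical labels derived from the gaps described above, and the file paths in the example plan are placeholders, not files that exist in the WiTTFind repositories.

```python
# Hypothetical section names, derived from the gaps described in the text.
REQUIRED_SECTIONS = (
    "deputy_contact",        # who steps in if the project lead is absent
    "server_operation",      # how to run and restart the servers
    "deployment_pipeline",   # how the web services are deployed
    "databases",             # which data is held in which database
    "dependency_updates",    # how and when dependencies are updated
    "partner_contacts",      # who to talk to at partner institutions
)


def missing_sections(plan):
    """Return the required sections that are absent or empty in the plan."""
    missing = []
    for name in REQUIRED_SECTIONS:
        value = plan.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Placeholder plan: the values only point to where the documentation would live.
example_plan = {
    "deputy_contact": "to be named",
    "server_operation": "docs/servers.md",
    "deployment_pipeline": "",
    "databases": "docs/databases.md",
}

for section in missing_sections(example_plan):
    print("Emergency plan is missing:", section)
```

A check of this kind could run as part of regular project housekeeping, so that a missing or empty section is noticed while the people who know the answer are still available.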

7. Conclusion

WiTTFind’s journey from a vibrant, interdisciplinary project to its current state of uncertainty offers lessons for similar DH projects. While the project succeeded in creating a valuable tool for Wittgenstein scholars, its reliance on one person holding it together and its lack of emergency planning left it vulnerable. As the digital humanities field continues to grow, projects like WiTTFind must find ways to balance educational goals with the need for sustainability. By implementing clear leadership structures, creating robust documentation, and planning for long-term maintenance, future projects can avoid the pitfalls that led WiTTFind into data-limbo.

Bibliography

  • Hadersbeck/Pichler/Ullrich/Röhrer/Gangopadhyay 2018 = Hadersbeck, Maximilian / Pichler, Alois / Ullrich, Sabine / Röhrer, Ines / Gangopadhyay, Nivedita (2018): The FinderApp WiTTFind for Wittgenstein’s Nachlass, in: Archives of Data Science, Series A (Online First), vol. 5, no. 1, A14, 19 pp.
