In order to secure the large amount of data gained from more than 100,000 questionnaires, all the copies received up until 2015 have been scanned and archived. Around one terabyte of image data is now digitally accessible.
The editors of the Bavarian Dictionary are now aided in their daily work of excerption by a new, specifically tailored application: since April 2016 all relevant dialect data can be recorded and lemmatised using a fully web-based editing tool, which also makes the collected and edited material available online to the public.
LexHelfer - a Tailored Editing Tool
Currently this tool contains more than 450,000 single pages from 100,000 questionnaires. Semantic file names allow each scan to be assigned to a specific reference person. The semi-automatic recording of individual questions (snippets) has allowed the creation of more than 6.5 million snippets within a very short time and with sparing use of human resources. The snippets are found via search options and can be directly viewed or edited.
Since the completion of the editing tool’s first working version in the spring of 2016, about 3,800 lemmas and more than 36,000 individual verifications have been produced (as of November 2016), each one unambiguously assigning a lemma to the corresponding usage example in the questionnaire. The semantic and relational structure of all data provides much faster access to the original material as well as quicker lemmatisation.
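To illustrate how a semantic file name can tie a scan to a reference person, here is a minimal sketch. The naming pattern (`F<questionnaire>_P<person>_S<page>.tif`), the regular expression and the names `ScanRef` and `parse_scan_name` are assumptions made purely for illustration; the naming scheme actually used by the tool is not described here.

```python
import re
from typing import NamedTuple

# Hypothetical naming scheme, NOT the project's real one:
#   F<questionnaire number>_P<reference person id>_S<page number>.tif
SCAN_NAME = re.compile(
    r"^F(?P<questionnaire>\d+)_P(?P<person>\d+)_S(?P<page>\d+)\.tif$"
)


class ScanRef(NamedTuple):
    """The parts of a scan's file name that identify its origin."""
    questionnaire: int
    person: int
    page: int


def parse_scan_name(name: str) -> ScanRef:
    """Split a semantic file name into questionnaire, person and page."""
    match = SCAN_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised scan name: {name!r}")
    return ScanRef(
        questionnaire=int(match["questionnaire"]),
        person=int(match["person"]),
        page=int(match["page"]),
    )
```

Under these assumptions, `parse_scan_name("F012_P00345_S2.tif")` yields questionnaire 12, reference person 345 and page 2, so every scan can be grouped by person without consulting a separate lookup table.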
Search in the Database
To narrow a search down effectively, the tool offers a number of search functions. It is possible to search for a questionnaire and for the number of a question as well as for a region, a district or a place. Being able to search for an already linked lemma is of great use to both the scientific community and the public. Other options, such as sorting or hiding already edited snippets, are primarily important for the editors. The current editing status of relevant snippets can be shown and sorted alphabetically by settlement name (see figures 1 and 2).
Search results consist of snippets accompanied by file name and assigned lemmas as well as by options for editing the corresponding search result/snippet.
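The combination of search criteria described above can be sketched roughly as follows. The `Snippet` record, its field names and the `search` function are hypothetical and only mirror the criteria named in the text (questionnaire, question number, place, linked lemma); they do not reflect the tool's actual database schema.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class Snippet:
    """One recorded question from one scanned questionnaire (assumed shape)."""
    file_name: str
    questionnaire: int
    question: int
    place: str
    lemmas: List[str] = field(default_factory=list)


def search(
    snippets: Iterable[Snippet],
    questionnaire: Optional[int] = None,
    question: Optional[int] = None,
    place: Optional[str] = None,
    lemma: Optional[str] = None,
) -> List[Snippet]:
    """Return snippets matching every criterion given, sorted by place name.

    A criterion left as None is ignored, so filters can be freely combined.
    """
    hits = [
        s for s in snippets
        if (questionnaire is None or s.questionnaire == questionnaire)
        and (question is None or s.question == question)
        and (place is None or s.place == place)
        and (lemma is None or lemma in s.lemmas)
    ]
    return sorted(hits, key=lambda s: s.place.casefold())
```

Each hit carries its file name and assigned lemmas, which corresponds to what the result list displays alongside the snippet image.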
Since 2020 the project's publications have also included the digital headword index to the Bavarian Dictionary (see Publications). It presents all lemmas recorded in the dictionary alphabetically in index form. The index is set up as a relational database and allows both simple and combined search queries. Fuzzy searches and searches with wildcards are also possible, so that results can be obtained when regionally varying dialect forms are entered. Hits are presented as a list and show the corresponding headwords. Each hit also records references that make it possible to look the entry up in the printed volume and that link directly to the digitally stored passages in the online version of the Bavarian Dictionary.
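As a rough illustration of how wildcard and fuzzy lookups help with regionally varying dialect forms, the following sketch uses only Python's standard library. The function names, the similarity cutoff and the use of `fnmatch` and `difflib` are assumptions for demonstration; the index itself is a relational database whose actual search implementation is not described here.

```python
import difflib
import fnmatch
from typing import List, Sequence


def wildcard_search(headwords: Sequence[str], pattern: str) -> List[str]:
    """Match headwords against a pattern using * and ? as wildcards.

    Both sides are case-folded so that matching ignores upper/lower case.
    """
    folded = pattern.casefold()
    return [w for w in headwords if fnmatch.fnmatchcase(w.casefold(), folded)]


def fuzzy_search(
    headwords: Sequence[str], query: str, cutoff: float = 0.7, limit: int = 5
) -> List[str]:
    """Return up to `limit` headwords similar to `query`, best match first.

    `cutoff` is a similarity ratio between 0 and 1; the default is an
    arbitrary choice for this sketch.
    """
    return difflib.get_close_matches(query, headwords, n=limit, cutoff=cutoff)
```

With a toy index such as `["Apfel", "Apfelbaum", "Birne", "Erdapfel"]`, a wildcard pattern like `*apfel` finds headwords ending in that element, while a fuzzy query with a variant spelling can still reach the standard headword.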
The original questionnaires, some more than five decades old and very brittle, are now permanently secured. Scanning has permitted electronic safeguarding of the material on various data media, among them special archive servers of the Leibniz Supercomputing Centre (LRZ) in Munich. Losses can now be ruled out. The application of long-established IT standards facilitates the reusability of all data as well as regular backups. In this way, the data recorded today will remain available for many decades to come.
Filling in the Word Lists Online
As of 2017, the new Word Lists can be filled in online.