New digital-humanities methods for paleography and handwritten manuscript analysis
In this presentation I will address digital and e-Science methods for the analysis of scans of handwritten or complicated machine-printed manuscripts. Over the last few years we have developed two systems: Monk, for word retrieval, and GIWIS, for identification of the hand. We will start with an introduction to the general philosophy behind the methods: what is expected of image quality, including pre-processing steps, and what the limitations are. We will do this chronologically, from the ‘ingest’ of a collection to its actual use. In the second part of the tutorial, we will zoom in on an interactive e-Science system for the training of word and character shapes by the Monk system. The Monk system currently deals with collections ranging from the Dead Sea Scrolls to medieval manuscripts, and from 17th-century captains’ logs to early 20th-century administrative indices. Through continuous interaction between the human user and the learning machine, word and character shape classes are constructed through agglomeration. By using several recognition methods in sequence, the principle of the Fahrkunst elevator can be applied: each stepwise shift from one method to the next pushes the system to a higher level of performance. I will show some recent results in mining characters in the Dead Sea Scrolls. The principle of stepwise ‘uplifting’ also applies at the level of the refinement of transcriptions, which constitutes the latest addition to the Monk tools.
prof. dr. Lambert Schomaker
Artificial Intelligence & Cognitive Engineering (ALICE), University of Groningen, The Netherlands