Agenda:
13.30-13.45         Welcome and introduction to the CH Symposium
13.45-14.30         Tunes and Tales
14.30-15.15         Riddle of Literary Quality
15.15-15.30         Tea
15.30-16.15         CEDAR
16.15-16.45         Elite Network Shifts
16.45-17.30         Panel discussion

Abstracts:

Tunes and Tales

Tunes & Tales: Modeling oral transmission
Oral transmission is a fascinating aspect of the broader phenomenon of cultural transmission. In oral culture, artifacts such as songs and stories are passed on to subsequent generations without written or technical reproduction media, just by voice and ear. Oral transmission implies alteration and variation to a considerable extent. Yet after several generations of oral transmission the artifacts are still ‘the same’ (in oral terms), or at least recognizable variants (from a literate point of view). How can this be? Are there convergent forces? How can we model the process of oral transmission?

We hypothesize that oral transmission of tales and tunes happens through the replication of sequences of motifs. In this view, motifs constitute the primary vehicles of cultural heritage in oral transmission of both artifacts. A prerequisite for building such a motif-based model of oral transmission is to formalize tunes and tales as sequences of motifs. In this presentation we will discuss the first steps taken towards these formalizations.
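
As a rough illustration of what formalizing tales as sequences of motifs could make possible, the sketch below compares variants of a tale by the edit distance between their motif sequences. The motif labels and the choice of edit distance are assumptions made for this example only; the abstract does not say which representation or similarity measure the project uses.

```python
from typing import Sequence


def motif_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two motif sequences.

    Each inserted, deleted, or substituted motif costs 1.
    """
    prev = list(range(len(b) + 1))
    for i, motif_a in enumerate(a, start=1):
        curr = [i]
        for j, motif_b in enumerate(b, start=1):
            cost = 0 if motif_a == motif_b else 1
            curr.append(min(prev[j] + 1,          # drop a motif from a
                            curr[j - 1] + 1,      # add a motif from b
                            prev[j - 1] + cost))  # keep or replace a motif
        prev = curr
    return prev[-1]


# Hypothetical motif labels for three variants of one tale.
variant_a = ["departure", "forest", "wolf", "grandmother", "rescue"]
variant_b = ["departure", "forest", "wolf", "grandmother", "devoured"]
variant_c = ["departure", "wolf", "grandmother", "rescue"]

print(motif_edit_distance(variant_a, variant_b))
print(motif_edit_distance(variant_a, variant_c))
print(motif_edit_distance(variant_b, variant_c))
```

Small distances between variants that are many retellings apart would be one way to express, in numbers, that the artifacts remain recognizably ‘the same’.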

(Presentation slides)

The Riddle of Literary Quality

Beauty or Power? The Riddle of Literary Quality
Most literary scholars now agree that literary quality is bestowed on works by publishers, critics, and other actors in the literary field (‘Power’). The idea that the literary works themselves (‘Beauty’) may also play a role is frowned upon. Still, the Beauty-side of the quarrel will be empirically tested in the project The Riddle of Literary Quality. We assume that formal characteristics of a text may also be of importance in calling a fictional text literary or non-literary, and good or bad – non-literary texts can also be good and literary texts can also be bad. Many formal characteristics can be thought of as having a part in this, e.g. the use of difficult words, the number of adjectives and adverbs, or complex syntactic style. The project explores this assumption, integrating the analysis of low-level lexical-statistical features and high-level syntactic and narrative features.

One of the first objectives is a large survey of readers’ opinions on 400 recent works of fiction published in Dutch. The role of the survey will be explained in the presentation. Next, the first results of high-level pattern recognition will be shown. We present a method to analyze the style of an author through syntactic patterns, and a system that is able to guess the author of a text by comparing recurring grammatical constructions and expressions occurring in the phrase-structures of texts. This use of high-level syntactic patterns is in contrast to the usual methods for stylometry, which focus on more superficial features such as word frequencies. The high-level syntactic patterns that we find provide better opportunities for interpretation from a digital humanities perspective and are the first step towards a syntactical comparison of style.
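
To make the idea of guessing an author from recurring grammatical constructions more concrete, here is a minimal sketch. It is not the system presented in the talk: as a much simpler stand-in for recurring phrase-structure patterns it counts plain production rules (a parent label and its child labels), and the toy trees, label names, and author names are all invented for illustration.

```python
from collections import Counter

# A tree is a tuple (label, child, child, ...); a word is a plain string.


def productions(tree):
    """Yield one 'PARENT -> CHILD ...' rule per phrasal node, ignoring words."""
    label, *children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if kids:
        yield f"{label} -> {' '.join(k[0] for k in kids)}"
        for kid in kids:
            yield from productions(kid)


def profile(trees):
    """Count how often each rule occurs in a collection of parsed sentences."""
    return Counter(rule for tree in trees for rule in productions(tree))


def overlap(p, q):
    """Shared rule mass: the sum of the smaller count for every common rule."""
    return sum((p & q).values())


def guess_author(unknown_trees, known_profiles):
    """Return the author whose profile shares the most rules with the text."""
    target = profile(unknown_trees)
    return max(known_profiles, key=lambda a: overlap(target, known_profiles[a]))


# Invented toy parses standing in for the output of a real parser.
t1 = ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VB", "barks")))
t2 = ("S", ("VP", ("VB", "run")))
t3 = ("S", ("NP", ("DT", "a"), ("NN", "cat")), ("VP", ("VB", "sleeps")))

known = {"author_a": profile([t1]), "author_b": profile([t2])}
print(guess_author([t3], known))
```

Because the words themselves are ignored, the guess rests only on sentence structure, which is the contrast with word-frequency stylometry drawn above.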

(Presentation slides Andreas van Cranenburgh)

(Presentation slides Karina van Dalen)

CEDAR

From the roots of data to the leaves of social history: semantic technology for Historic Dutch Census (CEDAR)
Learning from (socio-economic) history helps to understand the inter-relation between macro-economic change and individual lifestyles, policy regimes, labour markets, communities and national wealth. However, sources of historical information about the lives of individuals, communities, and nations are still scattered.

This project takes Dutch census data as its starting point to build a semantic data-web of historical information. With such a web, it will be possible to answer questions such as: What kind of patterns can be identified and interpreted as expressions of regional identity? How can patterns of changes in skills and labour be related to technological progress and patterns of geographical migration? How can changes of local and national policies in the structure of communities and individual lives be traced?

Census data alone are not sufficient to answer these questions. This project applies a specific web-based data model, exploiting Resource Description Framework (RDF) technology, to make census data inter-linkable with other hubs of historical socio-economic and demographic data and beyond. Pattern recognition appears on two levels: first to enable the integration of hitherto isolated datasets, and second to apply integrated querying and analysis across this new, enriched information space. Data analysis interfaces, visual inventories of historical data and reports on open-linked data strategies for digital collections will be some of the results of this project. The project will also produce generic methods and tools to weave historical and socio-economic datasets into an interlinked semantic data-web.
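
The sketch below illustrates, under stated assumptions, why a triple-based data model makes datasets inter-linkable. It uses plain Python tuples as (subject, predicate, object) triples rather than a real RDF library, and every namespace, property name, place, and figure in it is invented for the example; none of it comes from the actual census data or the project's vocabulary.

```python
CENSUS = "http://example.org/census/"   # hypothetical namespace
GAZ = "http://example.org/gazetteer/"   # hypothetical namespace

# Invented toy census observations as (subject, predicate, object) triples.
census = [
    (CENSUS + "obs1", CENSUS + "place", GAZ + "townA"),
    (CENSUS + "obs1", CENSUS + "occupation", "weaver"),
    (CENSUS + "obs1", CENSUS + "persons", "120"),
    (CENSUS + "obs2", CENSUS + "place", GAZ + "townB"),
    (CENSUS + "obs2", CENSUS + "occupation", "weaver"),
    (CENSUS + "obs2", CENSUS + "persons", "30"),
    (CENSUS + "obs3", CENSUS + "place", GAZ + "townC"),
    (CENSUS + "obs3", CENSUS + "occupation", "weaver"),
    (CENSUS + "obs3", CENSUS + "persons", "45"),
]

# A second toy dataset, maintained separately, that reuses the same place URIs.
gazetteer = [
    (GAZ + "townA", GAZ + "region", "North"),
    (GAZ + "townB", GAZ + "region", "North"),
    (GAZ + "townC", GAZ + "region", "South"),
]


def match(graph, s=None, p=None, o=None):
    """Yield every triple that agrees with the given parts (None = wildcard)."""
    for t in graph:
        if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2]):
            yield t


def value(graph, s, p):
    """Return the object of the first triple with this subject and predicate."""
    return next(match(graph, s=s, p=p))[2]


def persons_by_region(graph, occupation):
    """Total persons per region for one occupation, across both datasets."""
    totals = {}
    for obs, _, _ in match(graph, p=CENSUS + "occupation", o=occupation):
        place = value(graph, obs, CENSUS + "place")
        region = value(graph, place, GAZ + "region")
        persons = int(value(graph, obs, CENSUS + "persons"))
        totals[region] = totals.get(region, 0) + persons
    return totals


# Integration is list concatenation: the shared place URIs do the linking.
linked = census + gazetteer
print(persons_by_region(linked, "weaver"))
```

Neither toy dataset can answer the regional question on its own; the answer only becomes available once both sets of triples sit in one graph, which mirrors the two levels mentioned above: integration first, then querying across the combined space.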

Elite Network Shifts

Apa kabar? Extracting sociological data from masses of Indonesian newspaper clippings.
Indonesia is the fourth most populous nation in the world. It has a chaotic bureaucracy but a vibrant press. What if we could read its dozens of digital or digitised newspapers automatically and extract vital sociological trends? Elite Network Shifts will develop techniques for doing this. It will focus on the problem of elite rotation during regime change.
(Presentation slides)