Identifying Literary Novels with Simple Textual Features
The aim of the Riddle of Literary Quality is to study perceptions of literariness in a corpus of contemporary Dutch novels. Our experiments with machine learning models show that it is possible to automatically distinguish novels that are seen as highly literary from those that are seen as less literary, using surprisingly simple textual features. The most discriminating features of our classification model indicate that genre might be a confounding factor, but a regression model shows that we can also explain variation between highly literary and less literary novels within a genre.
Andreas van Cranenburgh, Corina Koolen, and Kim Jautze are PhD candidates in the KNAW Computational Humanities project the Riddle of Literary Quality.
They combine literary theory and computational methods to analyze textual characteristics of recent Dutch novels.