Netherlands eScience Center

There is currently an important trend in the humanities towards conducting research in a quantitative fashion, using large collections of data from different sources to support claims that would otherwise be difficult and time-consuming to verify.

This data-driven revolution requires humanities scholars to use computational techniques both for integrating diverse kinds of data from multiple sources, for example newspapers, social media, parliamentary transcriptions, or even entire libraries, and for analyzing such large volumes of data.

The Netherlands eScience Center is actively involved in developing expertise in this area.

Our technical expertise includes optimized data handling, big data analytics, and efficient computing.

Contact

Dr. Jisk Attema
eScience Coordinator
Netherlands eScience Center

List of themes

  • Sentiment analysis – literary texts, reviews, Twitter messages and the like convey opinions and emotions. Sentiment analysis is concerned with extracting this information automatically (a minimal sketch follows this list).

  • Topic modeling – When analyzing a text document, it seems natural to say “this document is about X” or to make statements such as “these two documents are about the same topic”. Topic modeling allows such statements to be made automatically (see the LDA sketch below).

  • Word embedding – if words are represented as mathematical vectors, we can use vector arithmetic to analyze and link texts and sentences (see the cosine-similarity sketch below).

  • Named entity extraction – this refers to identifying the names of places, people, events, etc. in a text. This is not a trivial problem, as language can be highly ambiguous. One way of achieving it is to analyze how often a particular combination of words refers to a specific concept (see the sketch below).

  • Diachronic data analysis – concepts are not necessarily fixed in time, so it is important to be able to analyze how they change over time (see the frequency-per-decade sketch below).

  • Visualization – The growth of digital datasets in the humanities poses challenges for visualization, especially where the data lineage includes uncertainty at every step. We can no longer rely on spreadsheets and simple, one-dimensional graphs to capture the full complexity of our subject matter. NLeSC is dedicated to using the latest techniques in visualization and data exploration to tackle this issue.

  • Information retrieval – large data collections come with data curation, handling and storage challenges. Making the most of these datasets requires fast and accurate search engines and new data exploration techniques (see the TF-IDF search sketch below).
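
Illustrative code sketches

Sentiment analysis: a minimal, purely lexicon-based scorer in Python. The word lists and example sentences are invented for illustration; real projects would use richer lexicons or trained classifiers and would handle negation, which this sketch does not.

```python
# Minimal lexicon-based sentiment scoring: count positive and negative
# words and return a score between -1 (negative) and 1 (positive).
# The word lists below are illustrative, not a real sentiment lexicon.
POSITIVE = {"good", "great", "moving", "masterful", "vivid"}
NEGATIVE = {"bad", "dull", "tedious", "weak", "disappointing"}

def sentiment_score(text: str) -> float:
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    hits = pos + neg
    return 0.0 if hits == 0 else (pos - neg) / hits

print(sentiment_score("A moving and masterful debut"))      # 1.0
print(sentiment_score("A tedious and disappointing read"))  # -1.0
```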
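
Topic modeling: one common approach is Latent Dirichlet Allocation (LDA). Below is a minimal sketch using scikit-learn on a toy four-document corpus invented for illustration; with so little data the topics are not meaningful, but the workflow is the same for real collections.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus: two "political" and two "literary" documents.
documents = [
    "the parliament debated the budget and new taxes",
    "ministers voted on the budget proposal in parliament",
    "the novel explores memory, identity and loss",
    "the poem's imagery evokes memory and childhood",
]

# Bag-of-words counts, then a two-topic LDA model.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(documents)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Show the most probable words for each topic.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[::-1][:4]]
    print(f"topic {i}: {', '.join(top)}")
```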
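
Word embedding: once words are vectors, relatedness becomes a geometric question, for instance cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings (word2vec, GloVe, fastText) have hundreds of dimensions and are learned from large corpora.

```python
import numpy as np

# Toy 3-dimensional "embeddings", invented for illustration.
embeddings = {
    "king":  np.array([0.80, 0.55, 0.10]),
    "queen": np.array([0.75, 0.60, 0.15]),
    "bread": np.array([0.05, 0.20, 0.90]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine_similarity(embeddings["king"], embeddings["bread"]))  # much lower
```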
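
Named entity extraction: a sketch using the spaCy library and its small English model, one possible toolkit among several; the example sentence is chosen only to show the kinds of labels produced.

```python
# Assumes spaCy and its small English model are installed:
#   pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Vincent van Gogh left Paris for Arles in February 1888.")

# Each recognized entity carries a text span and a predicted label
# such as PERSON, GPE (place) or DATE.
for ent in doc.ents:
    print(ent.text, ent.label_)
```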
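
Diachronic data analysis: a minimal sketch with pandas, tracking the relative frequency of a term per decade in a hypothetical set of dated documents; all numbers are invented.

```python
import pandas as pd

# Hypothetical per-document counts of one term, with publication years
# and total token counts; all values are invented for illustration.
records = pd.DataFrame({
    "year":   [1903, 1911, 1925, 1938, 1952, 1967, 1989, 2004],
    "hits":   [1, 0, 3, 5, 8, 12, 7, 2],
    "tokens": [2000, 1800, 2500, 2200, 2100, 2400, 2300, 2600],
})

# Aggregate per decade and compute the relative frequency of the term,
# which shows how its prominence shifts over time.
records["decade"] = (records["year"] // 10) * 10
per_decade = (
    records.groupby("decade")[["hits", "tokens"]].sum()
           .assign(frequency=lambda d: d["hits"] / d["tokens"])
)
print(per_decade["frequency"])
```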
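
Information retrieval: a minimal TF-IDF search sketch with scikit-learn, ranking a tiny invented collection against a free-text query. Real search engines add indexing structures, ranking refinements and scaling far beyond this.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A tiny, invented document collection.
documents = [
    "minutes of the parliamentary debate on water management",
    "letters from a nineteenth-century merchant in Amsterdam",
    "newspaper coverage of the 1953 North Sea flood",
]

# Index the collection with TF-IDF weights, then rank documents by
# cosine similarity to a free-text query.
vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(documents)

query = vectorizer.transform(["flood near the North Sea"])
scores = cosine_similarity(query, index).ravel()
for doc, score in sorted(zip(documents, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {doc}")
```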