The Concept Web, Data Publishing, and Automated Knowledge Discovery, New Trends in eHumanities

Erik Schultes, Leiden University Medical Center


May 12, 2011

The Concept Web, Data Publishing, and Automated Knowledge Discovery
High data volume enterprises like genomics and astrophysics challenge traditional modes of scholarship (i.e., reading papers, formulating hypotheses and publishing new data). I will review our recent efforts to develop a variety of computational technologies for managing the deluge of information in the life sciences. In particular, I will describe how we used ‘concept profiles’ to text-mine the biomedical literature and discover previously unknown protein-protein interactions. I will also describe our recently completed pilot study (along with Nature Publishing Group and Thomson Reuters) in data publishing using an RDF framework called ‘nanopublications’. Nanopublications are designed to disseminate, preserve and make interoperable very large-scale datasets.

Dr. Schultes studied evolutionary biology, informatics and biotechnology at the University of California, Los Angeles and the Whitehead Institute for Biomedical Research and has held appointments at the Santa Fe Institute. Between 2007 and 2009, Erik was Adjunct Professor (Computer Science) and a member of the Visualization Technology Group at Duke University. He is presently a Research Scientist at Leiden University Medical Center (Human Genetics) and a member of the Concept Web Alliance, a consortium of public and private organizations developing Web 3.0 technology for high-volume information management. Background article: ‘The value of data’, Nature Genetics, 43, 281-283 (2011), available as an open access article.