One of the grand challenges of NLP, AI, and Cognitive Science is to develop models of what words mean (lexical semantics) in terms of the non-linguistic world. Recently there has been growing interest in using corpus-based and data-based techniques for this task: that is, trying to learn what words mean by analysing a ‘parallel corpus’ of (A) non-linguistic data and (B) linguistic texts that describe, or are otherwise based on, the non-linguistic data. Recent examples of such work include learning verb semantics from visual-image sequences; learning the meaning of time phrases from a collection of weather forecasts based on numerical weather simulations; and learning the meaning of mathematical predicates from human verbalisations of theorem-prover output.
We felt that while the enterprise of learning semantic information from conventional text-only corpora is well established, work on learning word meanings from non-linguistic data was being undertaken by researchers in many diverse fields. We needed a venue for these researchers to meet, exchange ideas, and become familiar with each other’s work.