Learning Word Meaning from Non-Linguistic Data


Held in Edmonton, Canada, in conjunction with HLT/NAACL 2003

Endorsed by SIGLEX.

One of the grand challenges of NLP, AI, and Cognitive Science is to develop models of what words mean (lexical semantics) in terms of the non-linguistic world. Recently there has been growing interest in using corpus- and data-based techniques for this task: that is, in trying to learn what words mean by analysing a ‘parallel corpus’ of (A) non-linguistic data and (B) linguistic texts that describe, or are otherwise based on, the non-linguistic data. Recent examples of such work include learning verb semantics from visual image sequences; learning the meaning of time phrases from a collection of weather forecasts based on numerical weather simulations; and learning the meaning of mathematical predicates from human verbalisations of theorem-prover output.

We felt that, while the enterprise of learning semantic information from conventional text-only corpora is well established, work on learning word meanings from non-linguistic data was being undertaken by researchers in many diverse fields. We needed a venue for these researchers to meet, exchange ideas, and become familiar with each other’s work.