Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

http://multiword.sourceforge.net/lawmwecxg2018

In Santa Fe, USA with COLING 18

Endorsed by SIGLEX.

COLING 2018 (Santa Fe, USA), August 25-26, 2018

Organised, sponsored and endorsed by:

SIGLEX, the Special Interest Group on the Lexicon of the ACL

SIGANN, the Special Interest Group for Annotation

Also endorsed by:

SIGSEM, the Special Interest Group on Computational Semantics

This workshop addresses, within a joint event, three domains - linguistic annotation, multiword expressions and grammatical constructions - with partly overlapping communities and research interests, but relatively divergent practices and terminologies.

Linguistic annotation of natural language corpora is the backbone of supervised methods for statistical natural language processing. It also provides valuable data for evaluation of both rule-based and supervised systems and can help formalize and study linguistic phenomena. Challenges posed by creation/evaluation of annotation schemes, automatic and manual annotation, use and evaluation of annotation software and frameworks, or representation of linguistic data and annotations, have been addressed for the last decade within the Linguistic Annotation Workshop (LAW), organised yearly by SIGANN.

The domain of multiword expressions (MWEs) is orthogonal to linguistic annotation since it addresses one particular linguistic phenomenon across various NLP modelling and processing layers or practices (including annotation). MWEs are word combinations, such as all of a sudden, a hot dog, to pay a visit or to pull one’s leg, which exhibit lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasies. They encompass closely related linguistic objects such as idioms, compounds, light verb constructions, rhetorical figures, institutionalised phrases or collocations. Modelling and computational aspects of MWEs have been covered by the Multiword Expression Workshop, organised over the past years by the MWE section of SIGLEX. Due to their unpredictable behavior, and most prominently their non-compositional semantics, MWEs pose special problems in linguistic modelling (e.g. treebank annotation and grammar engineering), in NLP pipelines (e.g. when their orchestration with parsing is concerned), and in end-use applications (e.g. information extraction or machine translation).

These challenges are magnified when larger classes of idiosyncratic units are considered, namely grammatical constructions, i.e. conventional associations of lexical, syntactic, and pragmatic information, such as the-Adj-more-Adj (the more the merrier, the higher the better, etc.). In the framework of Construction Grammar (CxG), linguistic knowledge is captured in an inventory of form-meaning pairings of varying degrees of internal complexity and lexical fixedness. Thus, MWEs can be seen as special types of constructions: those in which constraints of a lexical nature are particularly strong. The potential new insights to be gained from bringing MWE and construction studies together are mutual. On the one hand, computational approaches to MWEs usually take binary decisions about units of language (MWE vs. non-MWE), neglecting the fact that MWEs occupy a “continuum of compositionality”. Construction-oriented modelling, conversely, paves the way towards a more nuanced representation of MWE idiosyncrasies. On the other hand, most grammatical constructions display considerable flexibility, so their discovery and description is a highly complex and labor-intensive process. This process might be greatly facilitated if recent computational achievements for MWEs could be extended to constructions.

Annotation of grammatical constructions in training data could improve machine translation and information extraction, especially cross-lingually, as meanings that are similar across languages (like comparison) can be expressed in drastically different forms. However, annotation of constructions poses significant challenges: because constructions are form-meaning pairs that can be more or less fluid in form, determining the annotation units for a construction is not straightforward. As a result, strategies for choosing annotation units may vary greatly among annotators and projects depending on a range of factors, from practical concerns (intended use, processing constraints) to concerns imposed by an underlying theory. Annotation of grammatical constructions is therefore an area that offers rich opportunities for identifying principled annotation strategies, accommodating different perspectives on a given phenomenon, and finding ways to allow for harmonization of annotations not only from different sources, but also at different linguistic levels.

For the above reasons, grammatical constructions were chosen as a joint focus of interest by both the MWE and the LAW communities. We call for papers focusing on research related (but not limited) to the following topics.

Joint topics on constructions, annotation, and MWEs:

Annotation-specific topics

MWE-specific topics

SPECIAL TRACK: PARSEME Shared Task on Automatic Verbal MWE Identification

The LAW-MWE-CxG-2018 workshop hosts edition 1.1 of the PARSEME shared task on automatic verbal MWE identification (see below). This initiative is a follow-up of edition 1.0 in 2017, which attracted 7 systems working on 18 languages in total. In 2018, we extend the scope to new languages. A separate session will be allocated for the shared task track within the workshop, featuring presentations of the participating systems.

SUBMISSION MODALITIES

Note that we have extended the page limits for all papers by 1 page, to be consistent with the COLING policy.

Regular research track:

In regular research papers, the reported research should be substantially original. Papers available as preprints can also be submitted provided that they fulfil the conditions defined by the ACL Policies for Submission, Review and Citation.

Shared task track:

Shared task system description papers will go through a separate reviewing process. Submissions will be reviewed by the shared task organizers and participants. Participants of the shared task are not required to submit system description papers, and their acceptance depends on the quality of the paper rather than on the results obtained in the shared task.

Instructions for authors:

For all 3 types of papers, the submission is double-blind as per the COLING guidelines. There is no limit on the number of reference pages. Authors will be granted an extra page for the final version of their papers.

All papers will be presented orally or as posters, as determined by the Program Committee chairs. No distinction between papers presented orally or as posters is made in the workshop proceedings.

For all types of submission, the COLING 2018 LaTeX templates should be used. All papers should be submitted via the START space: https://www.softconf.com/coling2018/ws-LAW-MWE-CxG-2018/ Please choose the appropriate track (research/shared task) and submission modality (long/short).

IMPORTANT DATES

All deadlines are at 23:59 UTC-12 (anywhere in the world).
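To check when a 23:59 UTC-12 deadline falls in UTC, the conversion can be sketched as follows. This is a minimal illustration only: the function name deadline_in_utc is hypothetical, and the date used is a placeholder, not an actual workshop deadline.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere in the world" corresponds to a fixed offset of UTC-12.
UTC_MINUS_12 = timezone(timedelta(hours=-12))


def deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return the UTC instant at which a 23:59 UTC-12 deadline expires.

    Hypothetical helper, written for illustration only.
    """
    local_deadline = datetime(year, month, day, 23, 59, tzinfo=UTC_MINUS_12)
    return local_deadline.astimezone(timezone.utc)


# Placeholder date for illustration; NOT an actual workshop deadline.
example = deadline_in_utc(2000, 1, 1)
print(example.isoformat())  # 2000-01-02T11:59:00+00:00
```

In other words, a deadline on a given day at 23:59 UTC-12 expires at 11:59 UTC on the following day.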

WORKSHOP ORGANIZERS

PROGRAM COMMITTEE CHAIRS

PUBLICATION CHAIRS

PUBLICITY CHAIRS

CONTACT

For any inquiries regarding the workshop, please send an email to lawmwecxg2018@gmail.com

ANTI-HARASSMENT POLICY

The workshop supports the ACL anti-harassment policy.