Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

COLING 2018 (Santa Fe, USA), August 25-26, 2018
http://multiword.sourceforge.net/lawmwecxg2018

Organised, sponsored and endorsed by:

SIGLEX, the Special Interest Group on the Lexicon of the ACL
SIGANN, the Special Interest Group for Annotation

Also endorsed by:

SIGSEM, the Special Interest Group on Computational Semantics

This workshop addresses, within a joint event, three domains - linguistic annotation, multiword expressions and grammatical constructions - with partly overlapping communities and research interests, but relatively divergent practices and terminologies.

Linguistic annotation of natural language corpora is the backbone of supervised methods for statistical natural language processing. It also provides valuable data for evaluation of both rule-based and supervised systems and can help formalize and study linguistic phenomena. Challenges posed by creation/evaluation of annotation schemes, automatic and manual annotation, use and evaluation of annotation software and frameworks, or representation of linguistic data and annotations, have been addressed for the last decade within the Linguistic Annotation Workshop (LAW) organised yearly by the SIGANN.

The domain of multiword expressions (MWEs) is orthogonal to linguistic annotation since it addresses one particular linguistic phenomenon across various NLP modelling and processing layers or practices (including annotation). MWEs are word combinations, such as all of a sudden, a hot dog, to pay a visit or to pull one’s leg, which exhibit lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasies. They encompass closely related linguistic objects such as idioms, compounds, light verb constructions, rhetorical figures, institutionalised phrases or collocations. Modelling and computational aspects of MWEs have been covered by the Multiword Expression Workshop, organised over the past years by the MWE section of SIGLEX. Due to their unpredictable behavior, and most prominently their non-compositional semantics, MWEs pose special problems in linguistic modelling (e.g. treebank annotation and grammar engineering), in NLP pipelines (e.g. when their orchestration with parsing is concerned), and in end-use applications (e.g. information extraction or machine translation).

These challenges are magnified when larger classes of idiosyncratic units are considered, namely grammatical constructions, i.e. conventional associations of lexical, syntactic, and pragmatic information, such as the-Adj-more-Adj (the more the merrier, the higher the better, etc.). In the framework of Construction Grammar (CxG), linguistic knowledge is captured in an inventory of form-meaning pairings of varying degrees of internal complexity and lexical fixedness. Thus, MWEs can be seen as special types of constructions: those in which constraints of a lexical nature are particularly strong. The potential new insights to be gained from bringing MWE and construction studies together are mutual. On the one hand, computational approaches to MWEs usually take binary decisions about units of language (MWE vs. non-MWE), neglecting the fact that MWEs occupy a “continuum of compositionality”. Construction-oriented modelling, conversely, paves the way towards a more nuanced representation of MWE idiosyncrasies. On the other hand, most grammatical constructions display considerable flexibility, so their discovery and description are highly complex and labor-intensive. This process might be greatly facilitated if recent computational achievements for MWEs could be extended to constructions.

Annotation of grammatical constructions in training data could improve machine translation and information extraction, especially cross-lingually, as meanings that are similar across languages (like comparison) can be expressed in drastically different forms. However, annotation of constructions poses significant challenges: because constructions are form-meaning pairs that can be more or less fluid in form, determining the annotation units for a construction is not straightforward. As a result, strategies for choosing annotation units may vary greatly among annotators and projects depending on a range of factors, from practical concerns (intended use, processing constraints) to concerns imposed by an underlying theory. Annotation of grammatical constructions is therefore an area that offers rich opportunities for identifying principled annotation strategies, accommodating different perspectives on a given phenomenon, and finding ways to allow for harmonization of annotations not only from different sources, but also at different linguistic levels.

For the above reasons, grammatical constructions were chosen as a joint focus of interest by both the MWE and the LAW communities. We call for papers focusing on research related (but not limited) to the following topics.

Joint topics on constructions, annotation, and MWEs:

MWE and construction annotation in corpora and treebanks
MWE and construction representation in manually and automatically constructed lexical resources
Extending MWE discovery and identification methods to constructions
MWEs and constructions (and their annotations) in language acquisition and in non-standard language (e.g. tweets, forums, spontaneous speech)
Evaluation of MWE and construction annotation and processing techniques
Computationally-applicable theoretical studies on MWEs and constructions in psycholinguistics, corpus linguistics and grammar formalisms, and/or how such studies can impact annotation of constructions

Annotation-specific topics

Annotation procedures, whether manual or automatic, including machine learning and knowledge-based methods
Maintenance and interactive exploration of annotation structures and annotated data
Qualitative and quantitative annotation evaluation
Linguistic considerations, representation formats and exploration tools for merged annotations of different phenomena
Standards, best practices, documentation, interoperability, and comparison of annotation schemes
Development, evaluation and innovative use of annotation software frameworks

MWE-specific topics

Original MWE discovery and identification methods
MWE processing in syntactic and semantic frameworks (e.g. HPSG, LFG, TAG, universal dependencies, WSD, semantic parsing), and in end-user applications (e.g. summarization, machine translation)

SPECIAL TRACK: PARSEME Shared Task on Automatic Verbal MWE Identification

The LAW-MWE-CxG-2018 workshop hosts edition 1.1 of the PARSEME shared task on automatic verbal MWE identification (see below). This initiative is a follow-up of edition 1.0 in 2017, which attracted 7 systems working on 18 languages in total. In 2018, we extend the scope to new languages. A separate session will be allocated for the shared task track within the workshop, featuring presentations of the participating systems.

SUBMISSION MODALITIES

Note that we have extended the page limits for all papers by 1 page, to be consistent with the COLING policy.

Regular research track:

Long papers (9 content pages + references): They should report on solid and finished research including new experimental results, resources and/or techniques.
Short papers (5 content pages + references): They should report on small experiments, focused contributions, ongoing research, negative results and/or philosophical discussion.

In regular research papers, the reported research should be substantially original. Papers available as preprints can also be submitted provided that they fulfil the conditions defined by the ACL Policies for Submission, Review and Citation.

Shared task track:

System description papers (5 content pages + references): These papers should briefly describe the approach implemented to solve the problem. They may include references and links to more detailed descriptions in other documents.

Shared task system description papers will go through a separate reviewing process. Submissions will be reviewed by the shared task organizers and participants. Participants of the shared task are not required to submit system description papers, and their acceptance depends on the quality of the paper rather than on the results obtained in the shared task.

Instructions for authors:

For all 3 types of papers, the submission is double-blind as per the COLING guidelines. There is no limit on the number of reference pages. Authors will be granted an extra page for the final version of their papers.

All papers will be presented orally or as posters, as determined by the Program Committee chairs. No distinction between papers presented orally or as posters is made in the workshop proceedings.

For all types of submission, the COLING 2018 LaTeX templates should be used. All papers should be submitted via the START space: https://www.softconf.com/coling2018/ws-LAW-MWE-CxG-2018/ Please choose the appropriate track (research/shared task) and submission modality (long/short).

IMPORTANT DATES

All deadlines are at 23:59 UTC-12 (anywhere in the world).

May 27, 2018 Submission deadline (extended from May 25)
June 20, 2018 Notification of acceptance
June 30, 2018 Camera-ready papers due
August 25-26, 2018 LAW-MWE-CxG 2018 Workshop
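
As a convenience for checking a cutoff in another time zone, here is a minimal sketch that converts a 23:59 UTC-12 deadline to UTC and to a fixed local offset. It takes the extended submission deadline of May 27, 2018 as its example. The helper name `deadline_at_offset` and the use of fixed hour offsets are illustrative assumptions, not part of the official call; any daylight saving adjustment must already be included in the offset passed in.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere in the world" is UTC-12, as stated in the call.
AOE = timezone(timedelta(hours=-12), name="UTC-12")

# Extended submission deadline: May 27, 2018, 23:59 UTC-12.
deadline_aoe = datetime(2018, 5, 27, 23, 59, tzinfo=AOE)

# The same instant expressed in UTC.
deadline_utc = deadline_aoe.astimezone(timezone.utc)


def deadline_at_offset(offset_hours):
    """Return the deadline in a zone with a fixed UTC offset (hypothetical helper)."""
    return deadline_aoe.astimezone(timezone(timedelta(hours=offset_hours)))


print(deadline_utc.isoformat())           # 2018-05-28T11:59:00+00:00
print(deadline_at_offset(2).isoformat())  # 2018-05-28T13:59:00+02:00
```

In other words, a deadline of 23:59 UTC-12 on May 27 falls at 11:59 UTC on May 28, or 13:59 on May 28 for a location at UTC+2.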

WORKSHOP ORGANIZERS

Nancy Ide, Vassar College (USA)
Adam Meyers, New York University (USA)
Carlos Ramisch, Aix Marseille University (France)
Agata Savary, Université François Rabelais Tours (France)

PROGRAM COMMITTEE CHAIRS

Jena Hwang, Institute for Human and Machine Cognition (USA)
Miriam R L Petruck, ICSI (USA)
Sameer Pradhan, cemantix.org and Vassar College, New York (USA)
Carlos Ramisch, Aix Marseille University (France)
Agata Savary, Université François Rabelais Tours (France)
Nathan Schneider, Georgetown University (USA)

PUBLICATION CHAIRS

Melanie Andresen, Hamburg University (Germany)
Agata Savary, Université François Rabelais Tours (France)

PUBLICITY CHAIRS

Adam Meyers, New York University (USA)
Agata Savary, Université François Rabelais Tours (France)

CONTACT

For any inquiries regarding the workshop please send an email to lawmwecxg2018@gmail.com

ANTI-HARASSMENT POLICY

The workshop supports the ACL anti-harassment policy.
