17th Workshop on Multiword Expressions (MWE 2021)
Co-located with ACL-IJCNLP 2021 (online; originally planned for Bangkok, Thailand), 6 August 2021
Organised and sponsored by:
Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL)
Multiword expressions (MWEs) are word combinations which exhibit lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog, pay a visit and pull one's leg. The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, rhetorical figures, institutionalised phrases, collocations, etc. The behaviour of MWEs is often unpredictable; in particular, their meanings are not regularly composed of the meanings of their parts. Thus, MWEs are a major challenge in computational linguistics (Constant et al. 2017), including linguistic modelling (e.g. treebanking), computational modelling (e.g. parsing), and end-user NLP applications (e.g. natural language understanding, machine translation, and social media mining).
Modelling and processing MWEs for NLP has been the topic of the MWE workshop organised by the MWE section of SIGLEX in conjunction with major NLP conferences since 2003. Although much progress has been made in the field, MWE processing in end-user NLP tasks is currently under-explored, and most studies still introduce MWEs as future work. Nonetheless, there are recent studies in which MWEs gained particular attention in end-user applications, including machine translation (Zaninello & Birch 2020), text simplification (Kochmar et al. 2020, Liu & Hwa 2016), language learning and assessment (Paquot et al. 2019, Christiansen & Arnon 2017), social media mining (Maisto et al. 2017), and abusive language detection (Zampieri et al. 2020, Caselli et al. 2020).
The special focus for this 17th edition of the workshop is on MWE processing in end-user applications such as those listed above. On the one hand, the PARSEME shared tasks (Ramisch et al. 2020, Ramisch et al. 2018, Savary et al. 2017), among others, fostered significant progress in MWE identification, providing datasets, evaluation measures and tools that now allow fully integrating MWE identification into end-user applications. On the other hand, NLP seems to be shifting towards end-to-end neural models capable of solving complex end-user tasks with little or no intermediary linguistic symbols, questioning the extent to which MWEs should be implicitly or explicitly modelled. Therefore, one goal of this workshop is to bring together and encourage researchers in various NLP subfields to submit MWE-related research, so that approaches that deal with MWEs in various applications could benefit from each other.
Following the success of previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and MWE-LEX 2020, we further extend the scope of the workshop to MWEs in e-lexicons and WordNets, MWE annotation, as well as grammatical constructions.
The 17th Workshop on MWEs invites submissions on (but not limited to) the following topics:
Traditional MWE topics:
- Computationally-applicable theoretical work on MWEs and constructions in psycholinguistics and corpus linguistics
- MWE and construction annotation and representation in resources such as corpora, treebanks, e-lexicons and WordNets
- Processing of MWEs and constructions in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG, LFG, TAG, UD, etc.)
- Discovery and identification methods for MWEs and constructions
- MWEs and constructions in language acquisition, language learning, and non-standard language (e.g. tweets, speech)
- Evaluation of annotation and processing techniques for MWEs and constructions
- Retrospective comparative analyses from the PARSEME shared tasks on automatic identification of MWEs
Topics on MWEs and end-user applications:
- Processing of MWEs and constructions in end-user applications (e.g. MT, NLU, summarisation, social media mining, computer assisted language learning)
- Implicit and explicit representation of MWEs and constructions in end-user applications
- Evaluation of end-user applications concerning MWEs and constructions
- Resources and tools for MWEs and constructions (e.g. lexicons, identifiers) in end-user applications
Joint session with WOAH Workshop
Pursuing the MWE Section’s tradition of synergies with other communities and in accordance with ACL-IJCNLP 2021’s theme track on NLP for social good, we will organise a joint session with the Workshop on Online Abuse and Harm (WOAH). We believe that MWEs are important in online abuse detection, and that the latter can provide an interesting testbed for MWE processing technology. The main goal is to pave the way towards the creation of data for a shared task involving both communities. The format of the session is under discussion, and we welcome suggestions from the community. Submissions describing research on MWEs and abusive language, especially introducing new datasets, are also welcome.
Important dates
All deadlines are at 23:59 UTC-12 (anywhere in the world).
- April 19, 2021: Paper Submission Deadline
- April 26, 2021: EXTENDED Paper Submission Deadline
- May 3, 2021: RE-EXTENDED Paper Submission Deadline
- May 28, 2021: Notification of Acceptance
- June 7, 2021: Camera-ready papers due
- August 6, 2021: Workshop
Organisers
The MWE workshop is organised by the SIGLEX-MWE section.
Program committee
- Margarita Alonso-Ramos, Universidade da Coruña (Spain)
- Tim Baldwin, University of Melbourne (Australia)
- Verginica Barbu Mititelu, Romanian Academy (Romania)
- Fabienne Cap, Uppsala University (Sweden)
- Anastasia Christofidou, Academy of Athens (Greece)
- Ken Church, IBM Research (USA)
- Matthieu Constant, Université de Lorraine (France)
- Monika Czerepowicka, University of Warmia and Mazury (Poland)
- Myriam de Lhonneux, University of Copenhagen (Denmark)
- Gaël Dias, University of Caen Basse-Normandie (France)
- Meghdad Farahmand, University of Geneva (Switzerland)
- Christiane Fellbaum, Princeton University (USA)
- Joaquim Ferreira da Silva, New University of Lisbon (Portugal)
- Karën Fort, Sorbonne Université (France)
- Aggeliki Fotopoulou, ILSP/RC “Athena” (Greece)
- Marcos Garcia, University of Santiago de Compostela (Spain)
- Voula Giouli, Institute for Language and Speech Processing (Greece)
- Stefan Th. Gries, University of California (USA)
- Bruno Guillaume, Université de Lorraine (France)
- Chikara Hashimoto, Yahoo!Japan (Japan)
- Uxoa Iñurrieta, University of the Basque Country (Spain)
- Diptesh Kanojia, IIT Bombay (India)
- Elma Kerz, RWTH Aachen (Germany)
- Ekaterina Kochmar, University of Cambridge (UK)
- Dimitrios Kokkinakis, University of Gothenburg (Sweden)
- Ioannis Korkontzelos, Edge Hill University (UK)
- Cvetana Krstev, University of Belgrade (Serbia)
- Eric Laporte, University Paris-Est Marne-la-Vallee (France)
- Timm Lichte, University of Duesseldorf (Germany)
- Teresa Lynn, ADAPT Centre (Ireland)
- Stella Markantonatou, Institute for Language and Speech Processing (Greece)
- Yuji Matsumoto, Nara Institute of Science and Technology (Japan)
- Nurit Melnik, The Open University of Israel (Israel)
- Laura A. Michaelis, University of Colorado Boulder (USA)
- Johanna Monti, “L’Orientale” University of Naples (Italy)
- Preslav Nakov, Qatar Computing Research Institute, HBKU (Qatar)
- Malvina Nissim, University of Groningen (Netherlands)
- Diarmuid Ó Séaghdha, University of Cambridge (UK)
- Jan Odijk, University of Utrecht (Netherlands)
- Haris Papageorgiou, Institute for Language and Speech Processing (Greece)
- Marie-Sophie Pausé, independent researcher (France)
- Pavel Pecina, Charles University (Czech Republic)
- Ted Pedersen, University of Minnesota (USA)
- Scott Piao, Lancaster University (UK)
- Maciej Piasecki, Wroclaw University of Technology (Poland)
- Alain Polguère, Université de Lorraine (France)
- Matīss Rikters, University of Tokyo (Japan)
- Fatiha Sadat, Université du Québec à Montréal (Canada)
- Manfred Sailer, Goethe-Universität Frankfurt am Main (Germany)
- Magali Sanches Duran, University of São Paulo (Brazil)
- Branislava Šandrih, University of Belgrade (Serbia)
- Agata Savary, Université François Rabelais Tours (France)
- Sabine Schulte im Walde, University of Stuttgart (Germany)
- Matthew Shardlow, Manchester Metropolitan University (UK)
- Vered Shwartz, Allen AI (USA)
- Gyri Smørdal Losnegaard, University of Bergen (Norway)
- Ranka Stanković, University of Belgrade (Serbia)
- Ivelina Stoyanova, Bulgarian Academy of Sciences (Bulgaria)
- Stan Szpakowicz, University of Ottawa (Canada)
- Carole Tiberius, Dutch Language Institute (Netherlands)
- Beata Trawinski, Institut für Deutsche Sprache Mannheim (Germany)
- Ruben Urizar, University of the Basque Country (Spain)
- Aline Villavicencio, Federal University of Rio Grande do Sul (Brazil)
- Veronika Vincze, Hungarian Academy of Sciences (Hungary)
- Martin Volk, University of Zürich (Switzerland)
- Zeerak Waseem, University of Sheffield (UK)
- Eric Wehrli, University of Geneva (Switzerland)
- Seid Muhie Yimam, Universität Hamburg (Germany)
For any inquiries regarding the workshop, please send an email to mweworkshop2021@gmail.com.
To join our mailing list, please register with SIGLEX and check the “MWE Section” box.
The workshop supports the ACL anti-harassment policy.