ESC: Redesigning WSD with Extractive Sense Comprehension

In ESC (Barba et al., 2021) we redesigned Word Sense Disambiguation (Navigli et al., 2009) as an Extractive Reading Comprehension task and achieved state-of-the-art results on a number of different benchmarks and settings. This repo provides the code to reproduce the results of the paper, along with the checkpoints of the best models.
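
As a rough illustration of the span-extraction framing, the sketch below concatenates a context sentence with its candidate sense definitions and computes the character span of the correct definition, which is what an extractive model would be asked to predict. The function name build_esc_input, the <t> ... </t> target markers, the single-space separator and the toy definitions are all assumptions made for this example; the actual input encoding is the one implemented in the esc package.

```python
from typing import List, Tuple


def build_esc_input(
    context: str, definitions: List[str], gold_index: int
) -> Tuple[str, int, int]:
    """Join the context and all candidate definitions into one string and
    return it with the [start, end) character span of the gold definition."""
    parts = [context]
    offset = len(context)
    start = end = -1
    for i, definition in enumerate(definitions):
        offset += 1  # account for the single-space separator added by join
        if i == gold_index:
            start, end = offset, offset + len(definition)
        parts.append(definition)
        offset += len(definition)
    return " ".join(parts), start, end


# Hypothetical example: the target word is wrapped in <t> ... </t> markers.
context = "He sat on the <t> bank </t> of the river ."
definitions = [
    "a financial institution that accepts deposits",
    "sloping land beside a body of water",
]
text, start, end = build_esc_input(context, definitions, gold_index=1)
print(text[start:end])
```

Because every candidate definition sits in the same input as the sentence, a model trained this way sees the context and all the candidates at once, which is the property the paper's abstract highlights.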

How to Cite

@inproceedings{barba-etal-2021-esc,
    title = "{ESC}: Redesigning {WSD} with {E}xtractive {S}ense {C}omprehension",
    author = "Barba, Edoardo  and
      Pasini, Tommaso  and
      Navigli, Roberto",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-main.371",
    pages = "4661--4672",
    abstract = "Word Sense Disambiguation (WSD) is a historical NLP task aimed at linking words in contexts to discrete sense inventories and it is usually cast as a multi-label classification task. Recently, several neural approaches have employed sense definitions to better represent word meanings. Yet, these approaches do not observe the input sentence and the sense definition candidates all at once, thus potentially reducing the model performance and generalization power. We cope with this issue by reframing WSD as a span extraction problem {---} which we called Extractive Sense Comprehension (ESC) {---} and propose ESCHER, a transformer-based neural architecture for this new formulation. By means of an extensive array of experiments, we show that ESC unleashes the full potential of our model, leading it to outdo all of its competitors and to set a new state of the art on the English WSD task. In the few-shot scenario, ESCHER proves to exploit training data efficiently, attaining the same performance as its closest competitor while relying on almost three times fewer annotations. Furthermore, ESCHER can nimbly combine data annotated with senses from different lexical resources, achieving performances that were previously out of everyone{'}s reach. The model along with data is available at https://github.com/SapienzaNLP/esc.",
}

Environment Setup

To set up the Python environment for this project, we strongly suggest using the bash script setup.sh, which you can find at the top level of this repo. The script creates a new conda environment and takes care of all the requirements and data needed for the project. Simply run on the command line:

bash ./setup.sh

and follow the instructions.

Checkpoints

These are the checkpoints of ESCHER trained on:

SemCor (SE07: 76.3 | ALL: 80.7)
SemCor & Oxford (Available upon request, SE07: 77.8 | ALL: 81.5)

Prediction and Evaluation

You can disambiguate a corpus using the script esc/predict.py:

PYTHONPATH=$(pwd) python esc/predict.py --ckpt <escher_checkpoint.ckpt> --dataset-paths data/WSD_Evaluation_Framework/Evaluation_Datasets/semeval2007/semeval2007.data.xml --prediction-types probabilistic

The files passed to --dataset-paths must follow the format introduced by Raganato et al. (2017). For reference, all the datasets in the directory data/WSD_Evaluation_Framework follow this format. The predictions are saved in the folder predictions as <dataset_name>_predictions.txt.
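
For readers unfamiliar with that format, here is a minimal sketch of how such a file can be read. It assumes the layout used by the Raganato et al. (2017) framework as we understand it: sentence elements containing wf (plain token) and instance (token to disambiguate) children, each with lemma and pos attributes. The toy document and the helper read_instances are made up for illustration; check the element names against the files in data/WSD_Evaluation_Framework.

```python
import xml.etree.ElementTree as ET

# Toy document in the assumed layout (not taken from a real dataset).
SAMPLE = """<corpus lang="en" source="toy">
  <text id="d000">
    <sentence id="d000.s000">
      <wf lemma="the" pos="DET">The</wf>
      <instance id="d000.s000.t000" lemma="bank" pos="NOUN">bank</instance>
      <wf lemma="be" pos="VERB">was</wf>
      <instance id="d000.s000.t001" lemma="close" pos="VERB">closed</instance>
    </sentence>
  </text>
</corpus>"""


def read_instances(xml_text):
    """Return one (instance_id, lemma, pos, token_position, tokens) tuple
    for every <instance> element found in the document."""
    root = ET.fromstring(xml_text)
    rows = []
    for sentence in root.iter("sentence"):
        tokens = [tok.text for tok in sentence]
        for position, tok in enumerate(sentence):
            if tok.tag == "instance":
                rows.append(
                    (tok.get("id"), tok.get("lemma"), tok.get("pos"), position, tokens)
                )
    return rows


for instance_id, lemma, pos, position, tokens in read_instances(SAMPLE):
    print(instance_id, lemma, pos, tokens[position])
```

To read a file from disk instead of a string, ET.parse(path).getroot() can replace ET.fromstring(xml_text).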

To evaluate the model on a dataset, add the --evaluate flag to the previous command.
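
If you want to sanity-check scores independently, the sketch below shows one common way to score WSD predictions against a gold key file. It is not the code run by --evaluate. It assumes that both files contain one instance per line, with the instance id followed by one or more whitespace-separated sense keys, and that a prediction counts as correct when it shares at least one key with the gold set. The function names and the sense keys in the toy data are illustrative only.

```python
def parse_key_lines(lines):
    """Map each instance id to the set of sense keys listed after it."""
    table = {}
    for line in lines:
        fields = line.split()
        if fields:
            table[fields[0]] = set(fields[1:])
    return table


def score(gold_lines, pred_lines):
    """Return (precision, recall, f1) for predictions against gold keys."""
    gold = parse_key_lines(gold_lines)
    pred = parse_key_lines(pred_lines)
    # A prediction is correct if it overlaps with the gold keys of its instance.
    correct = sum(
        1 for iid, keys in pred.items() if iid in gold and keys & gold[iid]
    )
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data with illustrative sense keys.
gold_lines = [
    "d000.s000.t000 bank%1:17:01::",
    "d000.s000.t001 close%2:35:00:: close%2:41:00::",
    "d000.s001.t000 river%1:17:00::",
]
pred_lines = [
    "d000.s000.t000 bank%1:17:01::",
    "d000.s000.t001 close%2:41:00::",
    "d000.s001.t000 river%1:15:00::",
]
precision, recall, f1 = score(gold_lines, pred_lines)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

When the system answers every gold instance exactly once, precision, recall and F1 coincide.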

Training

To train your own ESCHER model, run the following command:

PYTHONPATH=$(pwd) python esc/train.py --run_name fresh_escher_model --add_glosses_noise --train_path data/WSD_Evaluation_Framework/Training_Corpora/SemCor/semcor.data.xml

All hyperparameters default to the values used in the paper. To list them all, run:

PYTHONPATH=$(pwd) python esc/train.py -h

Hyperparameters are parsed with argparse, so they can be overridden from the command line. For example, to set the learning rate to 0.0005 and the gradient accumulation steps to 10, run:

PYTHONPATH=$(pwd) python esc/train.py --learning_rate 0.0005 --gradient_acc_steps 10 --run_name fresh_escher_model --add_glosses_noise --train_path data/WSD_Evaluation_Framework/Training_Corpora/SemCor/semcor.data.xml
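
To make the override mechanism concrete, here is a minimal, self-contained sketch of an argparse parser that accepts the flags used above. It is a stand-in, not the parser defined in esc/train.py: the flag names come from the commands in this README, while the types, the default values (marked as placeholders) and the build_parser helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Toy parser mirroring the flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Toy stand-in for a training CLI")
    parser.add_argument("--run_name", type=str, required=True)
    parser.add_argument("--train_path", type=str, required=True)
    # Placeholder defaults: NOT the values used in the paper.
    parser.add_argument("--learning_rate", type=float, default=1e-5)
    parser.add_argument("--gradient_acc_steps", type=int, default=1)
    # Boolean switch: False unless the flag is present on the command line.
    parser.add_argument("--add_glosses_noise", action="store_true")
    return parser


# Passing an explicit list lets the example run without a real command line.
args = build_parser().parse_args(
    [
        "--learning_rate", "0.0005",
        "--gradient_acc_steps", "10",
        "--run_name", "fresh_escher_model",
        "--add_glosses_noise",
        "--train_path",
        "data/WSD_Evaluation_Framework/Training_Corpora/SemCor/semcor.data.xml",
    ]
)
print(args.learning_rate, args.gradient_acc_steps, args.add_glosses_noise)
```

Any flag left out of the command keeps its default, which is why the shorter training command above needs only --run_name, --add_glosses_noise and --train_path.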

License

This project is released under the CC-BY-NC 4.0 license (see license.txt). If you use ESC, please link to this repo and cite the paper: ESC: Redesigning WSD with Extractive Sense Comprehension.

Acknowledgements

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union's Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under the grant "Dipartimenti di eccellenza 2018-2022" of the Department of Computer Science of the Sapienza University of Rome.
