
Universal Probing Framework

This framework provides a full pipeline for probing experiments, i.e., experiments for interpreting large language models. In a nutshell, the framework supports:

  • automatic generation of probing tasks based on Universal Dependencies annotation;
  • generation of probing tasks from manual queries over data in the CoNLL-U format (see the sketch after this list);
  • basic probing experiments with several classifiers, such as Logistic Regression and Multilayer Perceptron;
  • other probing methods, such as Minimum Description Length (MDL);
  • baselines for probing experiments, such as label shuffling;
  • different metrics, including standard ones (such as F1-score and accuracy) and selectivity (the difference between the probing task and a control task);
  • visualisation and aggregation tools for further analysis of experiments.
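To make the first two items concrete, here is a minimal, hypothetical sketch of task generation from CoNLL-U data. It is not the framework's own generator; it uses the third-party conllu package and a toy two-sentence treebank to label each sentence by a morphosyntactic feature (here, Number) of its root token:

# Hypothetical sketch: building a probing task from CoNLL-U annotation.
# Requires the third-party `conllu` package (pip install conllu); the
# framework's generator works on full UD treebanks, not this toy string.
from conllu import parse

CONLLU_SAMPLE = """\
# text = Cats sleep.
1\tCats\tcat\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tsleep\tsleep\tVERB\t_\tNumber=Plur\t0\troot\t_\t_

# text = The cat sleeps.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tcat\tcat\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\t_
3\tsleeps\tsleep\tVERB\t_\tNumber=Sing\t0\troot\t_\t_
"""

def task_from_feature(conllu_text, feature="Number"):
    """Label each sentence by the feature value of its root token."""
    examples = []
    for sentence in parse(conllu_text):
        root = next(token for token in sentence if token["head"] == 0)
        feats = root["feats"] or {}
        if feature in feats:
            examples.append((sentence.metadata["text"], feats[feature]))
    return examples

print(task_from_feature(CONLLU_SAMPLE))
# [('Cats sleep.', 'Plur'), ('The cat sleeps.', 'Sing')]

The UD-based generator applies the same idea at scale, covering the morphosyntactic features present in the Universal Dependencies data.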

Getting started


  1. Clone the repository:
     git clone https://github.com/AIRI-Institute/Probing_framework
     cd Probing_framework/
  2. Install the requirements and the appropriate torch version:
     bash cuda_install_requirements.sh
  3. Install all the other necessary packages:
     pip install -r requirements.txt
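After installation, an experiment follows the classifier-probing recipe listed above. The sketch below is not the framework's entry point (see the "How to use" section for that); it is a self-contained, assumed example built on transformers and scikit-learn that probes frozen mBERT representations with Logistic Regression and computes selectivity as the accuracy gap to a shuffled-label control:

# Minimal probing sketch (not the framework's API): a logistic-regression
# probe on frozen mBERT representations, with selectivity computed as
# probe accuracy minus the accuracy of a probe trained on shuffled labels.
# Assumes `transformers`, `torch`, `scikit-learn`, and `numpy` are installed.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentences, layer=-1):
    """Mean-pooled hidden states from one layer of the frozen model."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc, output_hidden_states=True).hidden_states[layer]
    mask = enc["attention_mask"].unsqueeze(-1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy probing task; in practice the examples come from the UD-based generator.
sentences = ["Cats sleep.", "The cat sleeps.", "Dogs bark.", "The dog barks."] * 8
labels = ["Plur", "Sing", "Plur", "Sing"] * 8

X_train, X_test, y_train, y_test = train_test_split(
    embed(sentences), np.array(labels), test_size=0.5, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
task_acc = probe.score(X_test, y_test)

# Control task: the same probe trained on randomly shuffled labels.
rng = np.random.default_rng(0)
control = LogisticRegression(max_iter=1000).fit(X_train, rng.permutation(y_train))
selectivity = task_acc - control.score(X_test, y_test)
print(f"accuracy={task_acc:.2f} selectivity={selectivity:.2f}")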

More details and examples

Section          Description
---------------  -----------------------------------------
About probing    General information about probing
About framework  General information about this framework
Web interface    Information about the visualization part
How to use       Usage information and examples

How to cite

@inproceedings{serikov-etal-2022-universal,
    title = "Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation",
    author = "Serikov, Oleg  and
      Protasov, Vitaly  and
      Voloshina, Ekaterina  and
      Knyazkova, Viktoria  and
      Shavrina, Tatiana",
    booktitle = "Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.blackboxnlp-1.37",
    pages = "441--456",
    abstract = "Linguistic analysis of language models is one of the ways to explain and describe their reasoning, weaknesses, and limitations. In the probing part of the model interpretability research, studies concern individual languages as well as individual linguistic structures. The question arises: are the detected regularities linguistically coherent, or on the contrary, do they dissonate at the typological scale? Moreover, the majority of studies address the inherent set of languages and linguistic structures, leaving the actual typological diversity knowledge out of scope. In this paper, we present and apply the GUI-assisted framework allowing us to easily probe massive amounts of languages for all the morphosyntactic features present in the Universal Dependencies data. We show that reflecting the anglo-centric trend in NLP over the past years, most of the regularities revealed in the mBERT model are typical for the western-European languages. Our framework can be integrated with the existing probing toolboxes, model cards, and leaderboards, allowing practitioners to use and share their familiar probing methods to interpret multilingual models. Thus we propose a toolkit to systematize the multilingual flaws in multilingual models, providing a reproducible experimental setup for 104 languages and 80 morphosyntactic features.",
}
