Supporting Online Toxicity Detection with Knowledge Graphs

Reference to the repository: DOI

Authors: Paula Reyero Lobo ([email protected]), Enrico Daga ([email protected]), Harith Alani ([email protected])

This repository supports the paper "Supporting Online Toxicity Detection with Knowledge Graphs" (link to paper), presented at ICWSM 2022. In this work, we address the problem of annotating toxic speech corpora, using semantic knowledge about gender and sexual orientation to identify missing target information about these groups. The workflow for this experiment is shown below:

[Workflow diagram: dependency graph]

The output of this code corresponds to the directory tree below. We release these files in the following open repository:

icwsm22-supporting-toxicity-with-KG
│   readme.md
└───data
│   │   all_data_splits.csv
│   │   identity_data_splits.csv
│   │   readme.md
│   │
│   └───gsso_annotations
│   │       file11.csv
│   └───gsso_annotations_inferred
│           file21.csv
└───results
│   └───1_freq_tables
│   └───2_freq_plots
│   └───3_freq_plots_category
│   └───4_candidate_scores
│   └───saved_dict
└───scripts
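Once the released files are downloaded into `data/`, they can be inspected with standard tooling. The snippet below is a minimal sketch using only the standard library; the column names (`text`, `split`) are illustrative assumptions, not taken from the actual files:

```python
import csv
import io

# Toy stand-in for data/all_data_splits.csv; the real column names may differ.
toy_csv = io.StringIO(
    "text,split\n"
    "example comment one,train\n"
    "example comment two,test\n"
)

# For the real file: open("data/all_data_splits.csv") instead of the toy buffer.
rows = list(csv.DictReader(toy_csv))
print(len(rows))                           # number of examples
print(sorted({r["split"] for r in rows}))  # dataset splits present
```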

To set up the project using a virtual environment:

    $ python -m venv <env_name>
    $ source <env_name>/bin/activate
    (<env_name>) $ python -m pip install -r requirements.txt

Example usage:

From the project folder, run the following to detect gender and sexual orientation entities in the text:

    (<env_name>) $ python scripts/gsso_annotate.py
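As a rough illustration of what this annotation step produces (not the actual implementation, which matches text against the GSSO ontology), a simplified lexicon-based annotator might look like the sketch below. The miniature lexicon and category labels are hypothetical:

```python
import re

# Hypothetical miniature lexicon; the real script draws its terms and
# categories from the Gender, Sex, and Sexual Orientation (GSSO) ontology.
LEXICON = {
    "lesbian": "sexual_orientation",
    "gay": "sexual_orientation",
    "transgender": "gender_identity",
    "woman": "gender",
}

def annotate(text):
    """Return (term, category, start, end) spans found in `text`."""
    spans = []
    for term, category in LEXICON.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text.lower()):
            spans.append((term, category, m.start(), m.end()))
    return sorted(spans, key=lambda s: s[2])

print(annotate("A transgender woman spoke at the event."))
```

Each detected span records the matched term, its category, and its character offsets, which is the kind of target-group information the paper uses to enrich toxicity annotations.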