nicolay-r/THOR-ECAC

THOR: Three-hop Reasoning for Emotion Cause Analysis in Context


A reforged PyTorch fork of the official THoR framework, enhanced and adapted for the SemEval-2024 paper nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States

Update 06 March 2024: 🔓 attrdict was the main limitation preventing the code from launching under Python 3.10 and has therefore been switched to addict (see Issue #2).

Update 05 March 2024: The quick arXiv paper breakdowns 🔨 are available in the Twitter/X post.

Update 17 February 2024: We support the --bf16 mode for launching Flan-T5 with the torch.bfloat16 type; this feature allows training an xl-sized model on just a single NVidia A100 (40GB).

NOTE: Since the existing fork is aimed at application in a variety of non-commercial projects, this repository represents a copy of the originally published code with the following 🔧 enhancements and changes.

NOTE: List of the changes from the original THoR


Overview

  • Input: a conversation containing the speaker and the text of each utterance.
  • Output: all emotion-cause pairs, where each pair contains an emotion utterance along with its emotion category and the textual cause span in a specific cause utterance (see the sketch below), e.g.:
    • (U3_Joy, U2_"You made up!")
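
For illustration only, here is a minimal sketch of how such a conversation and its emotion-cause pairs could be represented; the field names and the surrounding utterances are assumptions of this sketch, not the dataset's actual schema.

# Illustrative only: a toy representation of the task input and output.
# Field names and the surrounding utterances are assumptions of this sketch,
# not the actual schema used by the SemEval-2024 Task 3 data or this repository.
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    index: int      # U1, U2, U3, ...
    speaker: str
    text: str


@dataclass
class EmotionCausePair:
    emotion_utterance: int  # index of the utterance expressing the emotion
    emotion: str            # its emotion category, e.g. "Joy"
    cause_utterance: int    # index of the utterance containing the cause
    cause_span: str         # textual cause span inside that utterance


# Input: a conversation (speaker + text per utterance).
conversation: List[Utterance] = [
    Utterance(1, "A", "Hey, you two are talking again?"),
    Utterance(2, "B", "You made up!"),
    Utterance(3, "C", "Yes, we did."),
]

# Output: all emotion-cause pairs, here matching the example (U3_Joy, U2_"You made up!").
pairs: List[EmotionCausePair] = [EmotionCausePair(3, "Joy", 2, "You made up!")]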

The complete description of the task is available here.

Framework illustration.

Quickstart


We provide a Google Colab notebook for downloading all the necessary data and launching experiments on an NVidia V100 or NVidia A100.

To form the Codalab Submission, please follow this section.

Usage

Requirements

This project has been tested under Python 3.8 and adapted for Python 3.10.

Using pip, you can install the necessary dependencies as follows:

pip install -r requirements.txt

Datasets

👉 Compile datasets manually 👈

Serialize datasets: we provide the download_data.py script for downloading and serializing the manually compiled datasets (D_state and D_cause).

python download_data.py \
  --cause-test "https://www.dropbox.com/scl/fi/4b2ouqdhgifqy3pmopq08/cause-mult-test.csv?rlkey=tkw0p1e01vezrjbou6v7qh36a&dl=1" \
  --cause-train "https://www.dropbox.com/scl/fi/0tlkwbe5awcss2qmihglf/cause-mult-train.csv?rlkey=x9on1ogzn5kigx7c32waudi21&dl=1" \
  --cause-valid "https://www.dropbox.com/scl/fi/8zjng2uyghbkpbfcogj6o/cause-mult-valid.csv?rlkey=91dgg4ly7p23e3id2230lqsoi&dl=1" \
  --state-train "https://www.dropbox.com/scl/fi/0lokgaeo973wo82ig01hy/state-mult-train.csv?rlkey=tkt1oyo8kwgqs6gp79jn5vbh8&dl=1" \
  --state-valid "https://www.dropbox.com/scl/fi/eu4yuk8n61izygnfncnbo/state-mult-valid.csv?rlkey=tlg8rac4ofkbl9o4ipq6dtyos&dl=1"

For reproduction purposes, you may refer to the code of this supplementary repository.
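
As an optional sanity check, the sketch below peeks at one of the splits directly with pandas; it only assumes that the Dropbox link above serves a plain CSV file and is independent of the serialization performed by download_data.py.

# Optional sanity check: peek at one manually compiled split with pandas.
# Assumes only that the Dropbox link serves a plain CSV (dl=1 forces a direct
# download); this is independent of download_data.py's own serialization.
import pandas as pd

CAUSE_VALID_URL = (
    "https://www.dropbox.com/scl/fi/8zjng2uyghbkpbfcogj6o/cause-mult-valid.csv"
    "?rlkey=91dgg4ly7p23e3id2230lqsoi&dl=1"
)

df = pd.read_csv(CAUSE_VALID_URL)
print(df.shape)
print(df.columns.tolist())
print(df.head())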

LLMs


Use Flan-T5 as the backbone LLM reasoner.

NOTE: We set up the base reasoner in config.yaml. However, it is highly recommended to choose the largest reasoning model you can afford (xl or larger) for fine-tuning.
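
For reference, below is a minimal sketch of loading a Flan-T5 backbone in torch.bfloat16 with the transformers library, in line with the --bf16 note above; the repository's actual model construction is driven by config.yaml and main.py rather than by this snippet.

# Minimal sketch: loading a Flan-T5 backbone in bfloat16 with transformers.
# This mirrors the --bf16 option mentioned above; it is not the repository's
# actual model-construction code, which is configured through config.yaml.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-xl"  # xl or larger is recommended for fine-tuning
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Quick smoke test of the reasoner.
inputs = tokenizer("What emotion does the speaker of U3 express?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))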

Prompts and Engines

We provide separate engines, each with its own source of prompts.
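
To give a flavour of the multi-step engines, here is a schematic sketch of a three-hop prompt chain for emotion-cause reasoning; the wording of the hops is purely illustrative and does not reproduce the actual prompt templates shipped with the engines.

# Schematic sketch of a three-hop chain-of-thought for emotion-cause analysis.
# The hop wording is purely illustrative and does not reproduce the actual
# prompt templates shipped with the engines of this repository.
from typing import List


def three_hop_prompts(conversation: str, target: str) -> List[str]:
    context = f"Given the conversation: {conversation}"
    hop1 = f"{context} Which clause of {target} reflects the speaker's emotional state?"
    hop2 = f"{context} Based on that clause, which emotion does {target} express?"
    hop3 = (f"{context} Based on that emotion, which utterance and which text span "
            f"cause the emotion expressed in {target}?")
    return [hop1, hop2, hop3]


# In a THoR-style engine, the answer produced at each hop is appended to the
# context before the next hop is asked, so the LLM reasons step by step
# (state -> emotion -> cause) instead of answering in a single prompt.
for prompt in three_hop_prompts("U1: ... U2: You made up! U3: ...", "U3"):
    print(prompt)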

Training and Evaluating with Flan-T5

Use the main.py script with command-line arguments to run the Flan-T5-based THOR system.

python main.py -c <CUDA_INDEX> \
    -r [prompt|thor_state|thor_cause|thor_cause_rr] \
    -d [state_se24|cause_se24] \
    -lf "optional/path/to/the/pretrained/state" \
    -es <EPOCH_SIZE> \
    -bs <BATCH_SIZE> \
    -f <YAML_CONFIG> 

Parameters list

  • -c, --cuda_index: Index of the GPU to use for computation (default: 0).
  • -d, --data_name: Name of the dataset. Choices are state_se24 or cause_se24.
  • -r, --reasoning: Specifies the reasoning mode: one-step prompt or a multi-step thor mode.
  • -li, --load_iter: load a state at a specific iteration index from the same data_name resource (default: -1, i.e. not applicable).
  • -lp, --load_path: load a state from a specific path.
  • -p, --instruct: instructive prompt for the prompt training engine that involves the target parameter only.
  • -es, --epoch_size: number of training epochs (default: 1).
  • -bs, --batch_size: size of the batch (default: None).
  • -lr, --bert_lr: learning rate (default: 2e-4).
  • -t, --temperature: temperature (default: gen_config.temperature).
  • -v, --validate: run in zero-shot mode on the validation set.
  • -i, --infer_iter: run inference on the test dataset to form answers.
  • -f, --config: Specifies the location of config.yaml file.

Configure more parameters in the config.yaml file.
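
As a minimal sketch of how such a configuration could be consumed, the snippet below loads a YAML file into an addict.Dict, in line with the attrdict-to-addict switch noted in the updates above; the field name used here is illustrative, not the repository's actual schema.

# Minimal sketch: loading a YAML configuration into an addict.Dict, reflecting
# the attrdict -> addict switch mentioned in the updates above.
# The field name "model_path" is illustrative, not the repository's actual schema.
import yaml
from addict import Dict

with open("config.yaml") as f:
    config = Dict(yaml.safe_load(f))

# addict provides attribute-style access on top of the plain dictionary.
print(config.model_path)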

Submitting Results on Codalab

All the Codalab-related service code that is not part of this repository is available in a separate repository (link below 👇).

Once the results have been inferred (see the THOR-cause-rr results example), you may refer to the following code to form a submission:

👉 Codalab Service Repository 👈

References

The original THoR project:

@inproceedings{FeiAcl23THOR,
  title     = {Reasoning Implicit Sentiment with Chain-of-Thought Prompting},
  author    = {Fei, Hao and Li, Bobo and Liu, Qian and Bing, Lidong and Li, Fei and Chua, Tat-Seng},
  booktitle = {Proceedings of the Annual Meeting of the Association for Computational Linguistics},
  pages     = {1171--1182},
  year      = {2023}
}

You can cite this work as follows:

@inproceedings{rusnachenko2024nicolayr,
  title     = {nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States},
  author    = {Rusnachenko, Nicolay and Liang, Huizhi},
  booktitle = {Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2024},
  month     = jun,
  address   = {Mexico City, Mexico},
  publisher = {Association for Computational Linguistics}
}

Acknowledgement

This code builds on the following projects: CoT, Flan-T5, and Transformers.

License

The code is released under Apache License 2.0 for Noncommercial use only.