Information on models fine-tuning used in OmniEvent.infer #50

Open
archatelain opened this issue Jan 3, 2024 · 3 comments

@archatelain

Hello,

Thank you for this great package!

I would like to know on which datasets, and how, the two models used when running OmniEvent.infer were fine-tuned. That is, the two models whose download links are accessible in the utils module.

In particular, I noticed that there is a "schema" option in OmniEvent.infer. I took it as suggesting that the models were fine-tuned on all of the available schemas. Yet, when digging a bit further, I noticed that none of these schemas have been passed as special_tokens to the tokenizer. So I'm wondering how the model knows that we are referring to a specific task, i.e. the fine-tuning on a specific dataset, when each text is prepended with f"<txt_schema>". To be sure: when given "<maven>The king married the queen", how does the model understand that I want it to focus on what it learned when being fine-tuned on the MAVEN dataset?

I ran a test with just the EDProcessor class, using the schema "maven", and indeed the prefix was treated like any other token.
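
To make the point concrete, here is a minimal sketch of what I mean (using a generic "t5-base" tokenizer purely for illustration, not necessarily the exact checkpoint that OmniEvent.infer downloads):

```python
# Minimal sketch: a T5-style SentencePiece tokenizer that does not know
# "<maven>" as a special token just splits the prefix into ordinary
# subword pieces. "t5-base" is only an illustrative stand-in for the
# released checkpoints.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")

text = "<maven>The king married the queen"
print(tokenizer.tokenize(text))
# The prefix comes out as several regular subword pieces, not a single
# "<maven>" marker, so nothing visibly distinguishes it from normal text.
```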

Thank you

@h-peng17
Member

Hello, that's a good question. The models we release are trained on multiple EE datasets. When training on different datasets, we add a prefix to represent the schema of the data. For example, we use "<maven>" to represent the schema of the MAVEN dataset. However, due to limitations in data volume and model capacity, the models we release sometimes struggle to follow human instructions (i.e., the schema prefix).
We are currently researching how to align the model better for IE tasks to make it more adept at following human instructions.

@archatelain
Author

Thank you for your answer.

I'm still unclear on these prefixes, though. It seems that you did not add them as special tokens to the tokenizer. Did you consider that treating them like any other words was not a problem, or am I missing something?

For instance, in the following Google Colab notebook, the author does add their prefix "<idf.lang>" as a special token to the tokenizer: https://colab.research.google.com/github/KrishnanJothi/MT5_Language_identification_NLP/blob/main/MT5_fine-tuning.ipynb
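
For reference, the pattern in that notebook boils down to something like the following sketch (again using "t5-base" and "<maven>" purely as illustrative stand-ins):

```python
# Sketch of the "special token" route: register the prefix with the tokenizer
# and resize the model's embedding matrix so the new id gets an embedding.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

tokenizer.add_special_tokens({"additional_special_tokens": ["<maven>"]})
model.resize_token_embeddings(len(tokenizer))

# The prefix now maps to a single (freshly initialized) token id.
print(tokenizer.tokenize("<maven>The king married the queen")[0])  # '<maven>'
```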

@h-peng17
Member

Thanks for the question. We trained two versions of the model: one with the prefixes added as special tokens and one without. There is no significant difference between the results of the two. Previous work has also reported a similar phenomenon (https://aclanthology.org/2022.aacl-short.21.pdf).
