
Paraphrase prompts' format not compatible with the sample from ROME paper #11

HichemAK opened this issue Sep 13, 2023 · 3 comments


@HichemAK

When reading the ROME paper, I found this sample from CounterFact:

```json
{
  "case_id": 1067,
  "pararel_idx": 21136,
  "requested_rewrite": {
    "prompt": "The headquarters of {} is in",
    "entity": "Gazi University",
    "relation_id": "P159",
    "target_new": {"str": "Glasgow", "id": "Q4093"},
    "target_true": {"str": "Ankara", "id": "Q3640"}
  },
  "paraphrase_prompts": [
    "The headquarter of Gazi University is located in",
    "Gazi University is headquartered in"
  ],
  "neighborhood_prompts": [
    "The headquarter of TRT Haber is located in",
    "Agricultural Bank is headquartered in",
    "TRT Avaz is based in",
    "AnadoluJet's headquarters are in",
    "The headquarters of National Intelligence Organization is in",
    "The headquarter of MKE Ankaragücü is in",
    "The headquarters of Agricultural Bank is in",
    "The headquarter of Turkish Red Crescent is located in",
    "Turkish Historical Society is headquartered in",
    "Gençlerbirliği S.K. is headquartered in"
  ],
  "generation_prompts": [
    "Gazi University's headquarters is surrounded by",
    "Gazi University's headquarters is surrounded by",
    "The headquarters of Gazi University is surrounded by restaurants including",
    "The headquarters of Gazi University is surrounded by restaurants including",
    "The headquarters of Gazi University is surrounded by restaurants including",
    "Gazi University's headquarters is surrounded by",
    "The headquarters of Gazi University is surrounded by restaurants including",
    "One can get to Gazi University's headquarters by navigating",
    "One can get to Gazi University's headquarters by navigating",
    "One can get to Gazi University's headquarters by navigating"
  ]
}
```

But the actual dataset, which can be downloaded here (https://memit.baulab.info/data/dsets/counterfact.json), has a different format for the paraphrase prompts. Here is an example (showing only the paraphrase prompts):
```json
{
  ...
  "paraphrase_prompts": [
    "Shayna does this and Yossel goes still and dies. Danielle Darrieux, a native",
    "An album was recorded for Capitol Nashville but never released. Danielle Darrieux spoke the language"
  ],
  ...
}
```
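For anyone who wants to reproduce the observation, here is a minimal sketch that downloads the file from the URL above and prints the paraphrase prompts of the first few records:

```python
import json
import urllib.request

URL = "https://memit.baulab.info/data/dsets/counterfact.json"
with urllib.request.urlopen(URL) as f:
    data = json.load(f)  # a list of CounterFact records

for record in data[:3]:
    print(record["requested_rewrite"]["prompt"])
    print(record["paraphrase_prompts"])
```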

Note the apparently random sentences at the start of each paraphrase prompt. The code does not seem to filter out these prefixes.
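For concreteness, my understanding is that the evaluation scores the full paraphrase prompt, prefix included, by comparing the probabilities of the true and new targets. A minimal sketch of that scoring (the function and variable names are mine, not the repo's API, and the tokenization handling is approximate):

```python
import torch

def paraphrase_success(model, tok, paraphrase_prompt, target_true, target_new):
    """Return True if the edited model prefers target_new over target_true
    when conditioned on the full paraphrase prompt, random prefix included."""

    def target_nll(target):
        text = f"{paraphrase_prompt} {target}"
        ids = tok(text, return_tensors="pt").input_ids
        prompt_len = len(tok(paraphrase_prompt).input_ids)
        with torch.no_grad():
            log_probs = torch.log_softmax(model(ids).logits, dim=-1)
        # Sum negative log-probs of the target tokens only,
        # conditioned on everything that precedes them.
        return -sum(
            log_probs[0, i - 1, ids[0, i]].item()
            for i in range(prompt_len, ids.shape[1])
        )

    return target_nll(target_new) < target_nll(target_true)
```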

If this is not an error, why is there this difference? And what is its impact on the evaluation procedure?

@dtamayo-nlp

I do not know the exact details of this, but I can give you some insight. As you may know, when inserting new knowledge, both methods, ROME and MEMIT, add some "noise" by prepending sampled tokens to the subject. They claim that:

> Because the state will vary depending on tokens that precede s in text, we set $k^*$ to an average value over a small set of texts ending with the subject s.
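If I read the paper correctly, this average is written as $k_* = \frac{1}{N}\sum_{j=1}^{N} k(x_j + s)$, where the $x_j$ are the short sampled texts that get prepended to the subject $s$.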

The way these tokens are added is simple, but it changed between ROME and MEMIT. In ROME they sample 20 texts to build the prefixes: ten of length 5 and ten of length 10, while in the MEMIT code there are only 5 samples of length 10. I have not found this documented anywhere, but it looks like they experimented with different kinds of noise tokens to see which worked better.
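Roughly, the prefixes are generated by sampling short continuations from the model itself and prepending them to the prompt template. A sketch of that idea (not the repo's exact code; I load the small "gpt2" checkpoint here just for illustration, ROME itself uses gpt2-xl):

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tok = GPT2TokenizerFast.from_pretrained("gpt2")

def sample_prefixes(n_samples, length):
    """Sample `n_samples` random texts of `length` tokens from the model."""
    inputs = tok(["<|endoftext|>"] * n_samples, return_tensors="pt")
    out = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=length,
        top_k=5,
        pad_token_id=tok.eos_token_id,
    )
    # Drop the <|endoftext|> token that served as the generation prompt.
    return [tok.decode(o[1:], skip_special_tokens=True) for o in out]

# ROME-style: the bare prompt plus ten 5-token and ten 10-token prefixes;
# MEMIT-style would instead use five 10-token prefixes.
context_templates = ["{}"] + [
    p + ". {}" for p in sample_prefixes(10, 5) + sample_prefixes(10, 10)
]
```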

I do not think it was an error. The point of the paraphrased sentences is to test the model's ability to handle the same information presented differently. If they deliberately add these variations to the paraphrase prompts and still obtain good results in their evaluations, it suggests that the approach maintains its performance even when such variations are introduced, i.e. that it is robust.

I hope this helps a little. If someone knows the exact impact on the evaluation procedure, I would be glad to hear about it too.

@HichemAK

Thank you for your answer! Yes, you may be right: they probably added these random sentences to account for the variable contexts in which these prompts can appear. But then why not add them to the neighborhood prompts as well? The same reasoning applies to them too.

I hope we will hear more about the impact of these random sentences on the evaluation.

@dtamayo-nlp

dtamayo-nlp commented Oct 13, 2023

> I hope we will hear more about the impact of these random sentences on the evaluation.

Me too.
