When plotting the shap text it is showing an extra letter(Ġ) before every word. #3660

Open
3 of 4 tasks
shafikrony opened this issue May 15, 2024 · 1 comment
Labels
awaiting feedback: Indicates that further information is required from the issue creator
bug: Indicates an unexpected problem or unintended behaviour
visualization: Relating to plotting

Comments

@shafikrony

Issue Description

Since May 14th, shap.plots.text(shap_values) shows an extra character (Ġ) before every word in a sentence.

Code snippet:

pred = transformers.pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,
    return_all_scores=True,
)
explainer = shap.Explainer(pred)
shap_values = explainer(df["text"][33:43])
shap.plots.text(shap_values)

For example, the original text:
I think this broke OAuth2, diaspora-client seems to dislike oauth2 0.5"
Displayed text:
I Ġthink Ġthis Ġbroke ĠO Auth 2 , Ġdi as pora - client Ġseems Ġto Ġdislike Ġo auth 2 Ġ0 . 5 "
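
The displayed tokens look like the output of a byte-level BPE tokenizer (GPT-2 or RoBERTa style), which writes a leading space as the character Ġ (U+0120). The issue does not say which tokenizer is in use, so this is an assumption. The sketch below uses plain Python only, with no shap or transformers calls, to show that joining the displayed tokens and mapping Ġ back to a space recovers the original sentence. The names SPACE_MARKER and join_tokens are hypothetical and are not part of shap.

```python
# Assumption: the tokens come from a byte-level BPE tokenizer, where a
# leading space is written as U+0120 ("Ġ"). The issue does not confirm this.
SPACE_MARKER = "\u0120"  # the character "Ġ"


def join_tokens(tokens):
    """Concatenate raw tokens and turn the space marker back into a space."""
    return "".join(tokens).replace(SPACE_MARKER, " ")


# Tokens exactly as displayed in the plot, split on whitespace.
shown_tokens = (
    'I Ġthink Ġthis Ġbroke ĠO Auth 2 , Ġdi as pora - client '
    'Ġseems Ġto Ġdislike Ġo auth 2 Ġ0 . 5 "'
).split()

original = 'I think this broke OAuth2, diaspora-client seems to dislike oauth2 0.5"'

recovered = join_tokens(shown_tokens)
print(recovered)
print(recovered == original)
```

If the comparison holds, the plot is probably showing raw tokenizer tokens rather than decoded text. That would narrow the problem to how the tokens are decoded for display, not to the SHAP values themselves.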

Screenshot 2024-05-15 at 1 14 47 PM

Minimal Reproducible Example

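import shap
import transformers

# NOTE: model, tokenizer, and df are not defined in this snippet.
pred = transformers.pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,  # first GPU
    return_all_scores=True,  # return scores for every class
)
explainer = shap.Explainer(pred)
shap_values = explainer(df["text"][33:43])  # explain ten rows of text
shap.plots.text(shap_values)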

Traceback

No response

Expected Behavior

The plot should not show the extra character (Ġ) before each word.
Screenshot 2024-05-15 at 1 21 15 PM

Bug report checklist

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest release of shap.
  • I have confirmed this bug exists on the master branch of shap.
  • I'd be interested in making a PR to fix this bug

Installed Versions

0.45.2.dev2

shafikrony added the bug label May 15, 2024
@CloseChoice
Collaborator

Thanks for reporting. Unfortunately, your example is not reproducible. Please provide a reproducible example; otherwise, I am afraid we won't have the capacity to figure out a model and a dataset that reproduce the issue.

Concretely, we need a single script that reproduces your error. That means the model definition, training steps, etc. and the data definition are all contained in that script, with no dependencies on any internal code or data of yours.

CloseChoice added the awaiting feedback and visualization labels May 16, 2024