Git diff file path escaping #1371

Open
kemchenj opened this issue Jun 3, 2022 · 2 comments
@kemchenj

kemchenj commented Jun 3, 2022

First, I want to thank you for making this fantastic tool. It works very well and saves our team a lot of time.

We recently ran into some issues when working with non-ASCII file paths. I did some investigation and decided to file an issue to report them.

Here is the thing: Git commands that output paths (e.g. ls-files, diff) wrap the path in double quotes and escape "unusual" characters in it with backslashes, in the same way C escapes control characters.
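
To make the escaping concrete, here is a minimal Ruby sketch of reversing it. The helper name unquote_git_path and the C_ESCAPES table are hypothetical and are not part of Danger or ruby-git. The sketch assumes the path bytes are UTF-8 and that the only escapes are three-digit octal bytes plus the usual single-character C escapes.

```ruby
# Hypothetical helper, not part of Danger or ruby-git.
# Single-character C escapes mapped to their byte values.
C_ESCAPES = {
  "a" => 7, "b" => 8, "t" => 9, "n" => 10, "v" => 11, "f" => 12, "r" => 13,
  '"' => 34, "\\" => 92
}.freeze

def unquote_git_path(path)
  # Paths that are not wrapped in double quotes carry no escapes.
  return path unless path.length >= 2 && path.start_with?('"') && path.end_with?('"')

  inner = path[1..-2].b # drop the quotes and work on raw bytes
  bytes = inner.gsub(/\\([0-7]{3}|[abtnvfr"\\])/) do
    token = Regexp.last_match(1)
    # Three octal digits encode one byte; anything else is a C escape.
    (token.length == 3 ? token.to_i(8) : C_ESCAPES.fetch(token)).chr
  end
  # Assumption: the decoded bytes form a UTF-8 path.
  bytes.force_encoding(Encoding::UTF_8)
end

puts unquote_git_path('"a/\346\226\207\344\273\266.md"')
```

Decoding in a single pass matters here: an escaped backslash followed by digits must not be re-read as an octal escape.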

Currently, Danger handles this properly in APIs like git.added_files because it uses ruby-git, which unescapes the paths internally.

However, Danger has a separate implementation that extracts information from diff files in /lib/danger/request_sources/github/github.rb#L37, and it does not handle escaped paths correctly. GitHub inline comments are affected by this.

I think we could reuse some code from ruby-git and parse the diff file into a more structured Ruby class before using it.
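
As a rough illustration of what "more structured" could look like, the sketch below turns the diff --git header lines into small value objects. Everything in it (DiffFile, HEADER_TOKEN, parse_diff_headers, and the decoder parameter) is hypothetical and is not taken from Danger or ruby-git. Path unescaping is left to a callable supplied by the caller, and the default leaves tokens untouched.

```ruby
# Hypothetical sketch, not Danger's or ruby-git's actual code.
DiffFile = Struct.new(:old_path, :new_path)

# A header token is either a double-quoted string (which may contain spaces
# and backslash escapes) or a run of non-whitespace characters.
HEADER_TOKEN = /"(?:[^"\\]|\\.)*"|\S+/

def parse_diff_headers(diff_text, decoder: ->(token) { token })
  diff_text.each_line.filter_map do |line|
    next unless line.start_with?("diff --git ")

    tokens = line.delete_prefix("diff --git ").scan(HEADER_TOKEN)
    # Limitation: unquoted paths containing spaces produce more than two
    # tokens; they are skipped here rather than guessed at.
    next unless tokens.length == 2

    # Decode first, so the a/ and b/ prefixes are visible once quotes are gone.
    old_path, new_path = tokens.map { |token| decoder.call(token) }
    DiffFile.new(old_path.delete_prefix("a/"), new_path.delete_prefix("b/"))
  end
end

diff = "diff --git a/lib/x.rb b/lib/x.rb\nindex 0000000..e69de29\n"
p parse_diff_headers(diff).map(&:to_a)
```

Tokenizing with a quote-aware pattern is the important part: a quoted path may contain spaces, so splitting the header on whitespace would break it.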

kemchenj changed the title from "Git diff file path handling" to "Git diff file path escaping" on Jun 3, 2022
@manicmaniac
Member

@kemchenj

I think we could reuse some code from ruby-git and parse the diff file into a more structured Ruby class before using it.

I agree with you.

By the way, does the problem still reproduce if you set the LANG environment variable to match your file system's encoding, e.g. LANG=en_US.UTF-8 bundle exec danger?

@kemchenj
Author

kemchenj commented Nov 13, 2022

@manicmaniac

The "weird encoding" diff file is actually fetched from GitHub. Here is the link to an example pull request, and its diff file:

diff --git "a/\346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md" "b/\346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\346\226\207\344\273\266.md" "b/\346\226\207\344\273\266.md"
new file mode 100644
index 0000000..e69de29

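The \346\226\207\344\273\266.md above is actually 文件.md encoded in the "Git way": each byte of the UTF-8 path is written as a backslash followed by its three-digit octal value. Likewise, \346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md is 文 件 夹/文件2.md.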

I have tried adding headers such as Accept: application/vnd.github.v3.diff;charset=utf-8 or Accept-Charset: utf-8 to the request, but the response stays the same.
