Fix filename encoding for attachments #11

Open
ivy opened this issue Dec 7, 2017 · 0 comments
ivy commented Dec 7, 2017

Short summary: Non-ASCII filenames should be encoded following RFC 2047 in the name parameter of the Content-Type header and following RFC 2231 in the filename parameter of the Content-Disposition header. For example, an attachment named 今日は世.txt might include the following headers:

Content-Type: text/plain; charset=UTF-8; name="=?UTF-8?Q?=E4=BB=8A=E6=97=A5=E3=81=AF=E4=B8=96.txt?="
Content-Disposition: attachment; filename*=UTF-8''%E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96.txt

Any proposed fix should first be tested against a few major email clients.
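
As a starting point for such a fix, here is a minimal Go sketch of how the two header values could be built with the standard library. The helper names `rfc2231Value`, `isAttrChar` and `attachmentHeaders` are hypothetical and not part of gomail. The sketch assumes UTF-8 input and short filenames, so RFC 2231 continuations are not handled, and it has not been tested against real email clients.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// isAttrChar reports whether c can appear unescaped in an extended
// parameter value. This is a conservative set: ASCII letters, digits,
// and a handful of punctuation characters.
func isAttrChar(c byte) bool {
	switch {
	case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z', '0' <= c && c <= '9':
		return true
	}
	return strings.IndexByte("!#$&+-.^_`|~", c) >= 0
}

// rfc2231Value (hypothetical helper) percent-encodes s byte by byte and
// prefixes it with the charset and an empty language tag, giving the
// charset'language'value form used by RFC 2231.
func rfc2231Value(s string) string {
	var b strings.Builder
	b.WriteString("UTF-8''")
	for i := 0; i < len(s); i++ {
		if c := s[i]; isAttrChar(c) {
			b.WriteByte(c)
		} else {
			fmt.Fprintf(&b, "%%%02X", c)
		}
	}
	return b.String()
}

// attachmentHeaders (hypothetical helper) returns the Content-Type and
// Content-Disposition values for an attachment. The name parameter is an
// RFC 2047 encoded-word; the filename* parameter uses RFC 2231.
// Assumption: name contains no double quotes or backslashes.
func attachmentHeaders(mediaType, name string) (contentType, disposition string) {
	contentType = fmt.Sprintf(`%s; name="%s"`, mediaType, mime.QEncoding.Encode("UTF-8", name))
	disposition = "attachment; filename*=" + rfc2231Value(name)
	return contentType, disposition
}

func main() {
	ct, cd := attachmentHeaders("text/plain; charset=UTF-8", "今日は世.txt")
	fmt.Println("Content-Type:", ct)
	fmt.Println("Content-Disposition:", cd)
}
```

Note that an encoded-word starts with `=?` and the RFC 2231 value starts with `UTF-8''`. Go's `mime.QEncoding` writes the encoding letter as a lowercase `q`, which RFC 2047 treats the same as `Q`.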


Non-ASCII filenames are garbled when attaching files to messages, as originally reported in go-gomail#66. Some of the proposed solutions (go-gomail#83) address this by adding charset=UTF-8 to the Content-Type header. This may work for some clients, but that is almost certainly a coincidence, and I expect it to cause problems later for certain clients and attachments.

Digging into the IETF standards, the RFCs that define these headers do not seem to specify how to encode non-ASCII characters in filenames. Take RFC 2183, section 2.3, for example:

Current [RFC 2045] grammar restricts parameter values (and hence
Content-Disposition filenames) to US-ASCII. We recognize the great
desirability of allowing arbitrary character sets in filenames, but
it is beyond the scope of this document to define the necessary
mechanisms.

This thread on the ietf-smtp mailing list seems to have the answer:

Finally, if you have to include filename information, either put it in a
filename= parameter or both a filename= and name= parameter. Never ever use
just a name= parameter because that opens you up to gratuitous interpretation
of the part using some disposition value you didn't intend. (I note in passing
that this is what Thunderbird now does, with the added nuance of using
nonstandard RFC 2047 encoding for the name= parameter and standard RFC 2231
encoding for the filename= parameter.)
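
One way to sanity-check headers written in this Thunderbird style, before trying real clients, is to parse them back with Go's standard `mime` package. The sketch below is only an illustration: `decodeAttachmentName` is a hypothetical helper, and it assumes the filename parameter should win over name when both are present.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeAttachmentName (hypothetical helper) recovers the filename from a
// pair of header values. It prefers the Content-Disposition filename
// parameter, which mime.ParseMediaType decodes from its RFC 2231 form,
// and falls back to the RFC 2047 encoded name parameter of Content-Type.
func decodeAttachmentName(contentType, disposition string) (string, error) {
	if _, params, err := mime.ParseMediaType(disposition); err == nil {
		if name := params["filename"]; name != "" {
			return name, nil
		}
	}
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	// DecodeHeader leaves plain ASCII untouched and decodes encoded-words.
	return new(mime.WordDecoder).DecodeHeader(params["name"])
}

func main() {
	name, err := decodeAttachmentName(
		`text/plain; charset=UTF-8; name="=?UTF-8?Q?=E4=BB=8A=E6=97=A5=E3=81=AF=E4=B8=96.txt?="`,
		`attachment; filename*=UTF-8''%E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96.txt`,
	)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(name)
}
```

A round trip like this only shows that the headers are self-consistent; it says nothing about how individual mail clients interpret them.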

ivy added the bug label Dec 7, 2017