Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File server missing Vary on 206, 304 #5849

Closed
dkarlovi opened this issue Oct 2, 2023 · 38 comments · May be fixed by #6255
Closed

File server missing Vary on 206, 304 #5849

dkarlovi opened this issue Oct 2, 2023 · 38 comments · May be fixed by #6255
Labels
bug 🐞 Something isn't working
Milestone

Comments

@dkarlovi
Copy link

dkarlovi commented Oct 2, 2023

I'm checking my caching headers for assets served by file server. It all works as expected, but RedBot claims it should be adding Vary when serving 206, 304.

I've checked Caddy site itself that it's not some misconfiguration on my end, but it doesn't seem to be, the case there seems even worse:
https://redbot.org/?uri=https://caddyserver.com/resources/css/docs.css?v=dfd5278

REDbot-https-caddyserver-com-resources-css-docs-css-v-dfd5278-

It seems the Vary needs to be sent in both cases.

@mholt mholt added bug 🐞 Something isn't working help wanted 🆘 Extra attention is needed labels Oct 3, 2023
@mholt
Copy link
Member

mholt commented Oct 3, 2023

Hm, interesting. I didn't know that (guess I didn't read the spec.)

I'm currently attending to a hospital stay and will be delayed with my work a bit. Is there anyone who would like to pick this up?

@dkarlovi
Copy link
Author

dkarlovi commented Oct 3, 2023

@mholt I can try, but I'm a Golang newbie and will probably need a bit of guidance, it that's not too annoying?

@mholt
Copy link
Member

mholt commented Oct 4, 2023

@dkarlovi No that should be fine! I am a bit swamped lately (as you can see I'm behind on issues and PRs) -- with a hospital stay I'm attending to, but I will do my best to offer feedback.

If it ends up being too much, it's OK to let someone else take it on. But you're certainly welcome!

@dkarlovi
Copy link
Author

dkarlovi commented Oct 4, 2023

@mholt thanks, I'll try to take a stab at it, it's no rush since nobody complained until now. 😆

Hope the hospital sees the back of you ASAP, all the best with that, take care.

@francislavoie
Copy link
Member

I think this might be pretty tricky to solve, because Caddy's file_server and encode are quire detached from eachother. Plus, some of the headers are written by the Go stdlib itself, and aren't controlled by Caddy directly. I don't think it'll be impossible though, just quite tricky to get right.

@fortum-vaanavil
Copy link

I could also take a stab at this since I've lately been digging into file_server quite a bit. @dkarlovi , what's your situation here?

@dkarlovi
Copy link
Author

@fortum-vaanavil TBH since I've read @francislavoie comment about it being non-trivial, I got discouraged about being able to do it so please feel free to take the lead here, thank you!

@francislavoie
Copy link
Member

Sorry 😓 didn't mean to be discouraging, just wanted to manage expectations.

PRs definitely welcome. But make sure to take care to not break abstractions (encode and file_server must not depend on eachother).

@dkarlovi
Copy link
Author

No problems, I didn't interpret it as hostile, on the contrary: you've warned this might not be the best place for me to try getting into Go (and Caddy's codebase) so I'll find some other place to contribute, don't worry. 👍 Thanks!

@mholt mholt closed this as completed in 3067074 Apr 18, 2024
@mholt
Copy link
Member

mholt commented Apr 18, 2024

@dkarlovi Can you provide a little more information about the kinds of requests and responses that have these validation errors?

Preferably a minimal config and some curl -v commands that can reproduce the scenarios tested by your validation tool.

That said, I believe I have a commit that should fix your initial report if I guessed right. I did my best to reproduce a likely scenario locally and verified that my browser was still able to use its cached values with the patched logic. Could you please try the latest on master to confirm?

Thanks!

@mholt mholt removed the help wanted 🆘 Extra attention is needed label Apr 18, 2024
@mholt mholt added this to the v2.8.0 milestone Apr 18, 2024
@JeDaYoshi
Copy link

JeDaYoshi commented Apr 18, 2024

@mholt Most warnings that REDbot reported as shown in here have been fixed (checked just now), except for the "Vary" one.
"Vary" header is still not being sent for requests that can be encoded (aka. those served by file_server), when the client didn't send "Accept-Encoding".

@JeDaYoshi
Copy link

Actually, nevermind - it still gives a warn when "If-None-Match" is used:

@mholt
Copy link
Member

mholt commented Apr 18, 2024

Thanks for the update, but can you provide more details? Otherwise I'm just guessing. What is a request and response that have the headers wrong? How can I reproduce these results?

@dkarlovi
Copy link
Author

dkarlovi commented Apr 18, 2024

@mholt RedBot is OSS https://github.com/mnot/redbot you should be able to use it to lint against your local Caddy instance easily, I'm not sure if it provides a way to prepare copy/pastable curl requests.

If not, maybe a feature suggestion on the repo?

About your question, IMO the fix should be that 304 keeps the same Vary 200 would produce. The ETag one I'm not exactly sure.

@JeDaYoshi
Copy link

JeDaYoshi commented Apr 18, 2024

About your question, IMO the fix should be that 304 keeps the same Vary 200 would produce. The ETag one I'm not exactly sure.

I'm not sure about the ETag one myself either, will check in a bit.
However, the Vary error is not that. It's that compressed responses send Vary: Accept-Encoding, but uncompressed ones don't send any Vary at all, when it should send the same header in all requests that can be encoded, whether they are or not.

I'll provide proper examples in a bit.

@mholt
Copy link
Member

mholt commented Apr 18, 2024

Ahh thanks, didn't realize I could run it locally. I'll try it soon. Let me know if you have more info in the meantime.

@mholt
Copy link
Member

mholt commented Apr 18, 2024

I'm having trouble using Redbot locally (getting "Access Error" when I try on localhost -- I guess it doesn't allow localhost access?? I'm not sure how to override that), so yeah, until I can figure that out, any more info you have would be great.

@dkarlovi
Copy link
Author

@mholt are you running it from Docker? RedBot running in Docker wouldn't have access to Caddy running directly on your host. 🤔 I don't remember having any issues running it locally before, but maybe something changed in the meantime.

I don't see why it would though because RedBot is itself just doing basically specific Curl requests and asserting against expectations, there isn't really anything inherently wrong with localhost for it to not work.

@mholt
Copy link
Member

mholt commented Apr 18, 2024

No, I used pipx install redbot. I'm hitting this error: mnot/redbot#339

@francislavoie
Copy link
Member

@mholt the Makefile has a cli command which uses the config.txt as an argument

@mholt
Copy link
Member

mholt commented Apr 18, 2024

Oh -- the config gets compiled in 😮

Hm, that only seems to be for the server. I just use the CLI...

@mholt
Copy link
Member

mholt commented Apr 18, 2024

@dkarlovi To clarify, the only remaining issue this one? i.e. this one is fixed?

@dkarlovi
Copy link
Author

I'll need to check, will get back to you.

@JeDaYoshi
Copy link

JeDaYoshi commented Apr 18, 2024

@mholt Both issues are present. My screenshots are on the latest commit @ master.
The issue with Vary I already mentioned it in this comment. It is different to the one initially reported on this issue, which was fixed.

@mholt
Copy link
Member

mholt commented Apr 18, 2024

Oh sorry, I pinged the wrong person. Thanks @JeDaYoshi 😅

How are you testing locally with REDbot?

@JeDaYoshi
Copy link

I'm not testing locally. I am using the hosted instance against a Caddy instance on a VPS.

@francislavoie
Copy link
Member

francislavoie commented Apr 18, 2024

Oh -- the config gets compiled in 😮

No it doesn't, the Makefile is used as a "tooling shortcut" (a Makefile is just a way to declare a bunch of bash scripts, in a really terrible awful garbage archaic format 😂 ). The python script takes config.txt as a CLI argument at startup.

@mholt
Copy link
Member

mholt commented Apr 18, 2024

I see; I think I am using the CLI though, not the server daemon.

@mholt
Copy link
Member

mholt commented Apr 19, 2024

@JeDaYoshi @dkarlovi Ok, I got the daemon running locally, security risks and all 😁

I think I was able to fix at least some of the Vary notices, please check with d00824f.

@JeDaYoshi I'm still trying to find a time when the "response status is different when content negotiation happens" notice appears, but now that I've fixed the Vary it might actually fix that too. Can you please verify "Vary-fi"?

@JeDaYoshi
Copy link

JeDaYoshi commented Apr 19, 2024

Seems like adding a header Vary "XXX" will add two separate Vary headers instead of merging them together, whether it's a header Vary "Accept-Encoding" or any other value. However, the Vary error reported by REDbot is now fixed.


Actually, nevermind - it still gives a warn when "If-None-Match" is used:

This is still present, too, but I'm still not sure on what triggers it. It's only present when the only extra header provided is "If-None-Match".

Additionally adding "Accept-Encoding" alongside it triggers this:

Keep in mind this is using precompressed files.

@mholt
Copy link
Member

mholt commented Apr 19, 2024

Seems like adding a header Vary "XXX" will add two separate Vary headers instead of merging them together, whether it's a header Vary "Accept-Encoding" or any other value. However, the Vary error reported by REDbot is now fixed.

I'm pretty sure that's allowed. It simplifies processing at least :)

Keep in mind this is using precompressed files.

Ah, let me try that next. That's new information.

@JeDaYoshi
Copy link

JeDaYoshi commented Apr 19, 2024

I'm pretty sure that's allowed. It simplifies processing at least :)

Yeah, but shouldn't hasVaryValue avoid adding an extra header if it already exists? Because right now both cases can happen if header Vary is defined, as examples:

Vary: Origin
Vary: Accept-Encoding

or..

Vary: Accept-Encoding
Vary: Accept-Encoding

(admittedly, I'm not sure why would you define Accept-Encoding in a header yourself. I'm just asking because the function exists.)

Ah, let me try that next. That's new information.

My apologies - thought I mentioned it, and just noticed a bit ago I didn't!

@mholt
Copy link
Member

mholt commented Apr 19, 2024

Yeah, but shouldn't hasVaryValue avoid adding an extra header if it already exists?

No, that function specifically prevents repeating a header name in the value of the Vary header. i.e.:

Vary: Accept, Accept-Encoding
Vary: Accept-Encoding

is prevented. Semantically, it doesn't matter how many times a header field like this appears.

@mholt
Copy link
Member

mholt commented Apr 19, 2024

@JeDaYoshi I'll need more information as I can't reproduce it with REDbot.

Please provide your full (but minimalistic) config as well as the request(s) and/or REDbot config that yields those results.

@JeDaYoshi
Copy link

JeDaYoshi commented Apr 27, 2024

My apologies for not getting back to you about this, @mholt. Got busy for a while.

After thinking about it for a bit, I realised now what is the issue: the ETag header is different for precompressed files, depending on the encoding that's asked for (or if no encoding is asked for at all).

This is not a situation you run into if you're compressing the requests real-time with encode, since the ETag header is all the same.

But let's do an example:

http://:80 {
  root * /var/www/html
  file_server {
    precompressed br zstd gzip
  }
}

And in /var/www/html, you've got multiple files, serving for the non-compressed and compressed versions:

-rw-r--r-- 1 caddy caddy     484 2024-03-26 08:13:21.133755411 +0000 index.html
-rw-r--r-- 1 caddy caddy     201 2024-03-26 08:13:21.000000000 +0000 index.html.br
-rw-r--r-- 1 caddy caddy     334 2024-03-26 08:13:21.133755411 +0000 index.html.gz
-rw-r--r-- 1 caddy caddy     329 2024-03-26 08:13:21.133755411 +0000 index.html.zst

This is where the issue runs into. Those files are not using an ETag file, but instead uses calculateEtag, which uses the modification time and file size:

func calculateEtag(d os.FileInfo) string {
mtime := d.ModTime()
if mtimeUnix := mtime.Unix(); mtimeUnix == 0 || mtimeUnix == 1 {
return "" // not useful anyway; see issue #5548
}
var sb strings.Builder
sb.WriteRune('"')
sb.WriteString(strconv.FormatInt(mtime.UnixNano(), 36))
sb.WriteString(strconv.FormatInt(d.Size(), 36))
sb.WriteRune('"')
return sb.String()
}

The ETag headers don't match between requests:

$ curl http://localhost -si
[...]
etag: "d03j4q2v0tg3dg"

$ curl http://localhost -si --header "Accept-Encoding: br"
[...]
content-encoding: br
etag: "d03j4q0ndz405l"

$ curl http://localhost -si --header "Accept-Encoding: gzip"
[...]
content-encoding: gzip
etag: "d03j4q2v0tg39a"

$ curl http://localhost -si --header "Accept-Encoding: zstd"
[...]
content-encoding: zstd
etag: "d03j4q2v0tg395"

And that's what REDbot complains about when sending a If-None-Match header, as it expects the header to be the same no matter the encoding you're asking for. It's inconsistent behaviour, too.

@JeDaYoshi
Copy link

Sorry for the double-ping @mholt, just wanting to check if my response wasn't dismissed accidentally :)

@mholt
Copy link
Member

mholt commented May 1, 2024

Yeah I saw it, sorry, just been busy working on the 2.8 release

@mholt
Copy link
Member

mholt commented May 9, 2024

@JeDaYoshi Finally having a moment to investigate what you wrote...

Are you sure that's correct? I am probably misunderstanding.

I set up a test file with 3 precompressed variants like what you have, and made them all have the same Etag for every Accept-Encoding. Then redbot tells me:

❌ The ETag doesn't change between negotiated representations.

HTTP requires that the ETags for two different responses associated with the same URI be different as well, to help caches and other receivers disambiguate them.

This resource, however, sent the same strong ETag for both its compressed and uncompressed versions (negotiated by Accept-Encoding). This can cause interoperability problems, especially with caches.

Note that some versions of the Apache HTTP Server sometimes send the same ETag for both compressed and uncompressed versions of a resource. This is a known bug.

When I change it back to how it is now (different Etags for each representation), the error goes away and I get all green/black statuses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants