High disk I/O prevents successful DNS resolution #5635
Comments
Pi-hole database is already kept in memory and new queries are written to disk from time to time. |
Interesting, then my guess is incorrect, but the issue is still present. How can I debug it to provide more information? |
Please generate and upload a Debug Log. Using the command line:
Using the web interface:
|
I'm not 100% sure the issue is related to high I/O load causing a Pi-hole failure. Usually, when it fails to write to the database you see messages like these:
And when the queries are finally written, you see this:
But your log doesn't show evidence of this kind of issue (at least not in the available lines - the debug log shows only a few lines of |
Is there a more detailed log level that I can enable or something else of this kind? |
This is an interesting case. Naively, I would guess that any added debug logging would worsen your situation somewhat, as all of Pi-hole's debug options are rather verbose.
If this were not working, something in the kernel would be broken. We are not
Storing to the database is furthermore done in a different thread altogether. It should not cause slowdowns to DNS resolution.
---snip---
Well, having had to take a break from typing here and returning to the computer later actually made me realize there is one possible issue I could imagine here: when writing to the database, we temporarily lock the data structure holding DNS queries to ensure consistent writes. However, we are needlessly locking it for longer than we need to. So far this has never been an issue, but maybe it is in your case. Fortunately, it's sufficiently easy to find out if my theory is right or not. Please run
and either find that the situation (greatly) improves or that nothing changes. |
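For readers following along, the locking theory above can be illustrated with a minimal Python sketch. This is not Pi-hole's actual code (FTL is written in C), and every name in it (`queries_lock`, `pending_queries`, `record_query`, `flush_to_database`, the `queries` table) is a hypothetical stand-in. The idea it shows is holding the lock only while taking a copy of the in-memory queries, then doing the slow disk write with the lock released:

```python
import sqlite3
import threading

# Hypothetical shared state: in-memory queries awaiting storage.
queries_lock = threading.Lock()
pending_queries = []  # list of (timestamp, domain) tuples


def record_query(timestamp, domain):
    """DNS path: only a short in-memory append happens under the lock."""
    with queries_lock:
        pending_queries.append((timestamp, domain))


def flush_to_database(conn):
    """Database thread: copy under the lock, write to disk without it."""
    # Hold the lock just long enough to take ownership of the batch.
    with queries_lock:
        batch = pending_queries[:]
        pending_queries.clear()

    # The potentially slow disk I/O runs while the lock is released,
    # so record_query() is not blocked by a stalled write.
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO queries (timestamp, domain) VALUES (?, ?)", batch
        )
    return len(batch)
```

With this shape, a disk stall delays only the database thread. The trade-off is the extra copy of the batch in memory while the write is in flight.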
Running it already, will report back after some time with it |
Observations so far: I do occasionally see page load delays of a few seconds (which may or may not be related to DNS, I haven't confirmed yet), but there wasn't a single time that a page didn't load at all. Before switching to that branch, DNS resolution was quickly failing outright, so there is definitely a visible positive change. |
The thing that broke though is deleting diagnostic messages. I'm getting:
|
This is great, I will open a PR to merge this into the currently running Pi-hole v6.0 beta. There won't be another v5 release (this may or may not change, but it is rather unlikely).
This is expected and cannot be avoided unless messages would be stored in a different database. This would increase complexity quite a bit and the solution should much more likely be running Pi-hole on a system with less load. It doesn't appear to be a good idea overall to run your Pi-hole at all on such a special system. Why are you running
Sorry, I have missed this before. We are not talking about |
This is expected and cannot be avoided unless messages would be stored in a different database. This would increase complexity quite a bit and the solution should much more likely be running Pi-hole on a system with less load.
That was just one of the ways to reproduce the issue. I had other workloads running there that caused high disk I/O as well. I'm already planning to upgrade the drive today or tomorrow, but since I saw this odd behavior I thought I'd better report it upstream, and maybe something can be improved as a result.
I see. Is some in-memory buffer a viable option though? I don't think it needs to be synchronously written to disk when a DNS request comes in. Writing records in bulk might also speed things up a bit. |
👍 definitely and thank you for this! I checked the code of the currently running Pi-hole v6.0 beta just now and see that we already made a similar optimization with the same effect as what you tried now.
Yes, yes and yes.
|
Makes sense, I just thought about some in-app buffer rather than OS caching/buffering, to have better control over it and make sure blocking does not happen in Pi-hole no matter what. Looking at it from the perspective that it is better to lose logs than to impact DNS resolution performance. Either way, glad to see improvements and looking forward to 6.0! |
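A rough Python sketch of the in-app buffer idea described in this comment, under the stated priority that losing log entries is preferable to blocking DNS resolution. It is purely illustrative and not how Pi-hole implements anything. `DroppingQueryLog`, `write_batch`, the `max_pending` parameter and the `queries` table are all hypothetical names:

```python
import queue
import sqlite3


class DroppingQueryLog:
    """Bounded in-memory buffer that drops entries instead of blocking."""

    def __init__(self, max_pending=10000):
        self._queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # approximate counter; not synchronized across threads

    def log(self, timestamp, domain):
        """DNS path: never waits. Returns False if the entry was dropped."""
        try:
            self._queue.put_nowait((timestamp, domain))
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Writer side: take everything currently buffered as one batch."""
        batch = []
        while True:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                return batch


def write_batch(conn, batch):
    """Write a drained batch in bulk, in a single transaction."""
    with conn:
        conn.executemany(
            "INSERT INTO queries (timestamp, domain) VALUES (?, ?)", batch
        )
```

A writer thread would call `drain()` and `write_batch()` periodically. If the disk stalls, the buffer fills and new entries are counted in `dropped` rather than delaying the caller of `log()`.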
Versions
Platform
Expected behavior
DNS resolution should work all the time
Actual behavior / bug
Under very high disk I/O, DNS resolution simply stops working. For example, this happens when running sync on an NVMe SSD after a large disk operation, which on modern Linux with the noop I/O scheduler basically blocks a lot of things for a significant amount of time.
When Pi-hole recovers I see this in the logs:
Steps to reproduce
Have Pi-hole running on a disk with very high I/O load and observe DNS resolution failing.
Debug Token
https://tricorder.pi-hole.net/sgxHNmzd/
Additional context
My educated guess is that on each DNS request Pi-hole tries to read something from disk, and under high I/O load the read times out.
Considering how small the gravity database is (8.7M in my case), it would make a lot of sense to keep a copy of it in memory. Something under 50M will not cause issues for most users, while performance will improve massively and the issue described here will be avoided completely.