Discussion: Identifying slow queries at build time #7
benjie changed the title from "Brainstorm: Identifying slow queries at build time" to "Discussion: Identifying slow queries at build time" on Apr 18, 2021
I'm not sure yet how much of this involves writing a tool vs. just documenting a pattern or what, and I'm not sure to what extent it relates to this repo vs graphile-engine etc.
It would be awesome to write tests that, for each persisted operation, execute the query with EXPLAIN against the server while it points to a DB that represents production (perhaps actually being production) with representative variables.
Something maybe very loosely like this:
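The inline example here didn't survive extraction, so as a purely hypothetical sketch of the idea above (every name below is invented for illustration and is not a real Graphile API; `Query` stands in for a Postgres client's query method, e.g. node-postgres, that returns rows with JSON columns already parsed):

```typescript
// Hypothetical sketch: for each persisted operation, run EXPLAIN
// against a production-like database with representative variables,
// and fail if the estimated cost exceeds the operation's budget.
type Query = (text: string, values: unknown[]) => Promise<{ rows: any[] }>;

interface PersistedOperation {
  sha: string;          // sha of the GraphQL document
  name: string;         // operation name, e.g. "myQuery"
  sql: string;          // SQL compiled from the operation
  variables: unknown[]; // representative variable values
  maxCost: number;      // budget, e.g. from a hypothetical @maxcost annotation
}

// EXPLAIN only plans the query; it is never executed, so this is safe
// to point at a database that represents (or is) production.
async function checkOperation(
  query: Query,
  op: PersistedOperation
): Promise<number> {
  const { rows } = await query(`EXPLAIN (FORMAT JSON) ${op.sql}`, op.variables);
  const cost: number = rows[0]["QUERY PLAN"][0]["Plan"]["Total Cost"];
  if (cost > op.maxCost) {
    throw new Error(
      `${op.name} (${op.sha}): estimated cost ${cost} exceeds maxcost ${op.maxCost}`
    );
  }
  return cost;
}
```

A test runner would call `checkOperation` once per persisted operation and let any thrown error fail the build.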
This would generate several files:

- .persisted-operations/client.json
- .persisted-operations/xyzsha.graphql
- .persisted-operations/xyzsha-myQuery-foo.sql
- .persisted-operations/xyzsha-myQuery-bar.sql

This way, developers can easily see the SQL generated from the queries they write, as well as the query plan (imagine actually seeing where a Seq Scan is used!). Reviewers can audit it. And the build can fail if EXPLAIN thinks the cost of a query will be higher than the specified `maxcost`. Maybe this could even be part of Graphile Pro.
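If those generated `.sql` files embed the plan in EXPLAIN's human-readable text format (so a reviewer can spot a Seq Scan at a glance), the build step would need to recover the estimate from the first plan line, which Postgres prints as e.g. `(cost=0.00..155.00 rows=10000 width=4)` — the number after `..` is the estimated total cost. A small, hypothetical parser:

```typescript
// Extract the planner's estimated total cost from EXPLAIN's text
// output. The first line of a plan looks like:
//   Seq Scan on users  (cost=0.00..155.00 rows=10000 width=4)
function parseTotalCost(explainText: string): number {
  const match = /cost=\d+(?:\.\d+)?\.\.(\d+(?:\.\d+)?)/.exec(explainText);
  if (!match) {
    throw new Error("no cost estimate found in EXPLAIN output");
  }
  return parseFloat(match[1]);
}
```

The build could then compare this number against the operation's `maxcost` and fail accordingly.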
Problems / open questions:

1. Should we require an `@name` for each persisted operation instead of using a sha? Maybe what I have here would be fine enough, and maybe even just using shas would be fine.
2. Failing the build on a high EXPLAIN cost estimate could cause the build to sporadically fail, for example when the statistics change.
   a. Silly me, this would only be the case if you re-ran the sql queries in CI, which isn't a good idea for both this reason and performance.
   b. This might not be nearly as bad as it sounds, though, because you'd only re-run it when the sha of a gql query changes, and you'll probably only be doing one or two at a time.
   c. Of course, you can also just run this against a staging or secondary db with similar data to production, or use a tool to regularly sync representative/censored prod data to dev dbs.
3. If a query's plan gets worse after its sql file was generated, the `maxcost` annotation would be silently exceeded for a time.
   a. Probably a command to regenerate all sql files, manually run periodically, would be fine for this.
   b. Substantial changes to query plans that affect performance should be caught by production observability tooling anyway. The point here is to not actively write and ship queries that are slow from the beginning.
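The regenerate-all command in 3a could be sketched roughly like this; `Files`, `compile`, and `explain` are all invented stand-ins for the real tool's filesystem access, GraphQL-to-SQL compiler, and database round-trip:

```typescript
// Hypothetical sketch of a "regenerate all sql files" command.
// File access is injected so the same logic can run against the real
// filesystem or an in-memory map in tests.
interface Files {
  list(): string[];                      // names in .persisted-operations/
  read(name: string): string;
  write(name: string, contents: string): void;
}
type CompileToSql = (graphql: string) => { name: string; sql: string }[];
type Explain = (sql: string) => Promise<string>; // EXPLAIN text from the db

// For every <sha>.graphql document, recompile its operations and
// rewrite the corresponding <sha>-<name>.sql plan files.
async function regenerateAll(
  files: Files,
  compile: CompileToSql,
  explain: Explain
): Promise<void> {
  for (const file of files.list().filter((f) => f.endsWith(".graphql"))) {
    const sha = file.replace(/\.graphql$/, "");
    for (const { name, sql } of compile(files.read(file))) {
      const plan = await explain(sql);
      files.write(`${sha}-${name}.sql`, `${sql}\n\n-- plan:\n-- ${plan}\n`);
    }
  }
}
```

Running it periodically (or after a schema migration) would keep the committed plans honest without re-planning anything in CI.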