Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(weave): Move call queries to the backend #1656

Closed
wants to merge 105 commits into from

Conversation

tssweeney
Copy link
Contributor

@tssweeney tssweeney commented May 17, 2024

Meta Comments:
This PR pairs with https://github.com/wandb/core/pull/21622. To reviewers: I am going to split this PR into 2 parts after approval:

  1. Backend Changes: chore(weave): Call Table Migration - Backend #1691
  2. Frontend Changes: chore(weave): Call Table Migration - Frontend #1692

The rollout will be:

  1. Merge Backend Changes
  2. Merge https://github.com/wandb/core/pull/21622
  3. Deploy Trace Server
  4. Merge Frontend Changes
  5. Deploy Frontend

PR Overview:
This PR seeks to improve the performance of the CallsTable. It does so by moving the query layer (sorting, filtering, & paging) out of the MUIDataGrid and into the backend. Furthermore, it refactors ref expansion out of the component layer completely, and puts it in the network query layer (sets us up to move this to the backend next).

The benefits of this PR are:

  • Improved, non-degrading performance characteristics for large projects' CallsTables
  • Faster call counting queries around the application
  • Technical foundation for query operations on expanded fields
  • Technical foundation for "export to python" exactly what you see in the app
  • Various UX improvements for ref expansion and handling of heterogeneous column types.
  • Type-aware support for numerous filter types - including date filter on call creation.

User-facing changes:

  1. Call attributes, inputs, output, and summary are all handled exactly the same in the CallsTable now.
  2. RefExpansion feels exactly like: "Transform the ref into the data". Specifically:
    • Expanding Refs pointing to primitives are now supported
    • Expanding output is now supported
    • When you expand a ref that points to an object - the result is that it feels exactly like a dictionary is the
  3. Type-aware support for numerous filter types - including date filter on call creation.
  4. CallsTable is now paginated (100 per page)

Features Removed:
Note: I can add these back, but they have their own tradeoffs.

  1. We no longer auto-hide sparse columns
  2. We no longer have a "common columns" section
  3. "Export as CSV" limited to immediate page

Backend Changes:

  • New "mongo-style" filter can be passed to calls/query
  • New CallsQueryStats endpoint that currently exposes stats

API Design Choices to Discuss

  • Should change 'filterBy' to be part of the 'filter' clause? (probably?)
  • Should change the total count to be part of the standard query? (probably not?)

Future Features:

  • Support for Filter/Sort on expanded fields (in the backend)
  • Sorting / Filtering by username should convert the id to internal id
  • Would be nice to support sorting by common derived values like latency or version index
  • Export as CSV supports the entire resultset
  • When expanding a ref, the ref link goes away (for primitives) - would be nice to fix this

@circle-job-mirror
Copy link

circle-job-mirror bot commented May 17, 2024

@tssweeney tssweeney marked this pull request as ready for review May 29, 2024 08:04
@tssweeney tssweeney requested a review from a team as a code owner May 29, 2024 08:04
@@ -736,3 +826,71 @@ def get_base_object_class(val: Any) -> Optional[str]:
elif "_class_name" in val:
return val["_class_name"]
return None


def _quote_json_path(path: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, end users will absolutely do things like have "file.ext" or a string "123" as keys in an object. A dotted notation is very convenient for the common case but we may want to consider also supporting a more verbose traversal format like object[1]["file.ext"]["42"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a summary of limitations here: 6f0e498 ... need to define this more crisply before opening up to public

weave/trace_server/interface/query.py Outdated Show resolved Hide resolved
weave/tests/test_client_trace.py Outdated Show resolved Hide resolved
weave/tests/test_client_trace.py Outdated Show resolved Hide resolved
}, [currentDynamicColumnNames, shouldIgnoreColumn]);

useEffect(() => {
if (resetDep) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the same thing in either case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - i just want to trigger on changes to resetDep. just putting resetDep in the dep array is susceptible to linter removing. I commented. Perhaps there is a better way?

weave/trace_server/clickhouse_trace_server_batched.py Outdated Show resolved Hide resolved
weave/trace_server/clickhouse_trace_server_batched.py Outdated Show resolved Hide resolved
weave/trace_server/clickhouse_trace_server_batched.py Outdated Show resolved Hide resolved

conditions.append(filter_cond)

return conditions, raw_fields_used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is getting really big - would be nice to split it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - i actually did refactor this, then reverted: b6668ac#diff-121d8320484a693356428be5657e6b61d53f293f5c505f49e67ccc91ac09d5e3 ... i was just touching so many things that i thought it would be better to come back and split this file up after

@tssweeney
Copy link
Contributor Author

Closing - all of this is now landed using smaller separate PRs!

@tssweeney tssweeney closed this May 29, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants