
feat: support asof join #15411

Open
wants to merge 125 commits into
base: main

Contributor

@zenus zenus commented May 6, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  1. Thanks to @xudong963 for the nice work on range join.
  2. Thanks to DuckDB for the nice idea of rewriting asof join in terms of window functions.

Currently, because of the way range join is implemented, the order of the results returned by asof join is nondeterministic. I may be able to get help from @xudong963 on this. That is why I have not added any test cases yet.
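
For readers new to the operator, here is a minimal sketch of what an asof join computes. It is not the implementation in this PR, which builds on range join and window functions. The function name `asof_join`, the `<=` match condition, the pre-sorted build side, and the choice to keep unmatched probe rows as `None` are all assumptions made for illustration.

```rust
/// Illustrative only: for each probe timestamp, pick the value of the latest
/// build row whose timestamp is less than or equal to it.
/// Assumes `build` is sorted ascending by timestamp.
fn asof_join(probe: &[i64], build: &[(i64, f64)]) -> Vec<(i64, Option<f64>)> {
    probe
        .iter()
        .map(|&ts| {
            // Number of leading build rows with timestamp <= ts.
            let idx = build.partition_point(|&(build_ts, _)| build_ts <= ts);
            // idx == 0 means no build row is early enough to match.
            let matched = if idx == 0 { None } else { Some(build[idx - 1].1) };
            (ts, matched)
        })
        .collect()
}
```

This sketch happens to emit rows in probe order. Since the PR does not guarantee any output order, a test comparing results would likely need to sort both sides first.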

image

Benchmark

image

Build side: 5w (50,000); probe side: 5w (50,000)

image

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label May 6, 2024
Member

@xudong963 xudong963 left a comment

Thanks @zenus

I don't think we need to guarantee the order of the results. It makes sense for different databases to return results in different orders.

I've skimmed through the code and there are a few points to note:

  1. todo!() needs to be replaced with code that makes sense or that returns a specific error message.
  2. The build_asof_join method can be split further into smaller functions.
  3. Regarding tests, if you are going to borrow the test set from DuckDB, you can put the tests in the directory https://github.com/datafuselabs/databend/tree/main/tests/sqllogictests/suites/duckdb/asof_join
  4. AsOf join can be used as an alternative to window functions for expressing temporal relationships. If so, can you provide a performance comparison of asof join vs. window functions?
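
To illustrate point 1, here is a rough sketch of swapping a panicking todo!() arm for a descriptive error. `PlanError`, `CmpOp`, and `check_asof_operator` are hypothetical names invented for this example; they are not Databend's types, and the set of supported operators is only an assumption.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; the real code would use the
/// project's own error type instead.
#[derive(Debug, PartialEq)]
enum PlanError {
    Unsupported(String),
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::Unsupported(msg) => write!(f, "unsupported: {}", msg),
        }
    }
}

impl std::error::Error for PlanError {}

/// Hypothetical comparison operators that could appear in a join condition.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum CmpOp {
    Gt,
    Gte,
    Lt,
    Lte,
    Eq,
}

/// Before: `_ => todo!()` would panic when an unexpected operator shows up.
/// After: the caller gets an error that names the offending operator.
fn check_asof_operator(op: CmpOp) -> Result<(), PlanError> {
    match op {
        // Assumption: only inequality operators are valid here.
        CmpOp::Gt | CmpOp::Gte | CmpOp::Lt | CmpOp::Lte => Ok(()),
        other => Err(PlanError::Unsupported(format!(
            "asof join does not support comparison operator {:?}",
            other
        ))),
    }
}
```

Returning a `Result` lets the query fail with a readable message instead of aborting the process with a panic.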

In addition, I've opened a tracking issue for asof join. If you have time, you can continue with some of its sub-issues.

@xudong963 xudong963 marked this pull request as draft May 7, 2024 11:12
Contributor Author

zenus commented May 7, 2024

Nice advice, my pleasure.

@zenus zenus requested a review from Dousir9 June 1, 2024 10:00
@xudong963
Member

It seems the PR is ready for review. Please resolve the conflicts.

3 participants