key16 aggregation method should be two level #63666

Open
alexey-milovidov opened this issue May 12, 2024 · 0 comments · May be fixed by #63667
Labels: performance, warmup task (The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.)

Comments

@alexey-milovidov
Member

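Currently, aggregate states built with the key16 method cannot be merged in parallel, because that method is not two-level. Compare the two runs below: they differ only in the `0 +` added to the key expression, yet the first takes 62.8 sec. and the second takes 8.5 sec.: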

milovidov-pc :) SELECT number % 10000 AS k, uniq(number) AS u FROM numbers_mt(1e9) GROUP BY k ORDER BY u DESC LIMIT 10

SELECT
    number % 10000 AS k,
    uniq(number) AS u
FROM numbers_mt(1000000000.)
GROUP BY k
ORDER BY u DESC
LIMIT 10

Query id: d3e22e17-1a32-4615-bf1a-a2da6e0510eb

    ┌────k─┬──────u─┐
 1. │ 4759 │ 101196 │
 2. │ 4587 │ 101079 │
 3. │ 6178 │ 101034 │
 4. │ 6567 │ 101032 │
 5. │ 9463 │ 101013 │
 6. │  298 │ 101009 │
 7. │ 2049 │ 100993 │
 8. │ 8167 │ 100989 │
 9. │ 5530 │ 100973 │
10. │ 1968 │ 100973 │
    └──────┴────────┘

10 rows in set. Elapsed: 62.793 sec. Processed 1.00 billion rows, 8.00 GB (15.93 million rows/s., 127.40 MB/s.)
Peak memory usage: 11.30 GiB.

milovidov-pc :) SELECT 0 + number % 10000 AS k, uniq(number) AS u FROM numbers_mt(1e9) GROUP BY k ORDER BY u DESC LIMIT 10

SELECT
    0 + (number % 10000) AS k,
    uniq(number) AS u
FROM numbers_mt(1000000000.)
GROUP BY k
ORDER BY u DESC
LIMIT 10

Query id: e6a24292-54cf-47cb-8e39-81584736d41a

    ┌────k─┬──────u─┐
 1. │ 4759 │ 101196 │
 2. │ 4587 │ 101079 │
 3. │ 6178 │ 101034 │
 4. │ 6567 │ 101032 │
 5. │ 9463 │ 101013 │
 6. │  298 │ 101009 │
 7. │ 2049 │ 100993 │
 8. │ 8167 │ 100989 │
 9. │ 5530 │ 100973 │
10. │ 1968 │ 100973 │
    └──────┴────────┘

10 rows in set. Elapsed: 8.547 sec. Processed 1.00 billion rows, 8.00 GB (116.99 million rows/s., 935.95 MB/s.)
Peak memory usage: 10.09 GiB.
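
To illustrate what "two level" would buy here, below is a minimal sketch of a two-level merge. It is not ClickHouse's Aggregator code: all names (`TwoLevelTable`, `bucketOf`, `aggregate`, `mergeParallel`), the bucket count of 256, and the use of a plain counter as the aggregate state are assumptions made for the example. The idea it shows is that when every per-thread state splits its keys into the same fixed set of buckets, bucket `b` of the result depends only on bucket `b` of each input, so different buckets can be merged by different threads without locking. The second query above presumably benefits from this because `0 +` changes the key type and a different aggregation method is selected.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; this is an illustration, not ClickHouse code.
constexpr std::size_t NUM_BUCKETS = 256;

using Key = std::uint16_t;
using State = std::uint64_t;  // stand-in for an aggregate state (here: a count)
using Bucket = std::unordered_map<Key, State>;
using TwoLevelTable = std::array<Bucket, NUM_BUCKETS>;

// A key maps to the same bucket in every per-thread table.
inline std::size_t bucketOf(Key key) { return key % NUM_BUCKETS; }

// Per-thread aggregation phase: count occurrences of each key.
TwoLevelTable aggregate(const std::vector<Key> & keys)
{
    TwoLevelTable table;
    for (Key key : keys)
        ++table[bucketOf(key)][key];
    return table;
}

// Merge phase: each worker owns a disjoint set of buckets (b = t, t + n, ...),
// so no two workers ever write to the same hash table.
// num_threads is assumed to be at least 1.
TwoLevelTable mergeParallel(const std::vector<TwoLevelTable> & parts, std::size_t num_threads)
{
    TwoLevelTable result;
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&, t]
        {
            for (std::size_t b = t; b < NUM_BUCKETS; b += num_threads)
                for (const TwoLevelTable & part : parts)
                    for (const auto & [key, state] : part[b])
                        result[b][key] += state;
        });
    }
    for (std::thread & worker : workers)
        worker.join();
    return result;
}
```

With a single-level table there is only one destination hash table, so the equivalent merge loop has to run on one thread; that is the bottleneck the first query appears to hit.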
alexey-milovidov added the feature, warmup task, and performance labels and removed the feature label on May 12, 2024
ucasfl self-assigned this on May 12, 2024
ucasfl linked a pull request on May 12, 2024 that will close this issue
2 participants