Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: support -Merge suffix on AggregateFunction types #258

Open
HacKanCuBa opened this issue Aug 23, 2023 · 1 comment
Open

Comments

@HacKanCuBa
Copy link

First of all, thanks for all the work done here!

To the point, currently there's no way to deal w/ AggregateFunction types, so we need a way to issue, say, sumMerge instead of sum, uniqMerge instead of uniq, etc.

See this example of an Aggregated Mat View:

CREATE TABLE test.visits
 (
    StartDate DateTime64 NOT NULL,
    CounterID UInt64,
    Sign Nullable(Int32),
    UserID Nullable(Int32)
) ENGINE = MergeTree ORDER BY (StartDate, CounterID);
CREATE MATERIALIZED VIEW test.mv_visits
(
    StartDate DateTime64 NOT NULL,
    CounterID UInt64,
    Visits AggregateFunction(sum, Nullable(Int32)),
    Users AggregateFunction(uniq, Nullable(Int32))
)
ENGINE = AggregatingMergeTree() ORDER BY (StartDate, CounterID)
AS SELECT
    StartDate,
    CounterID,
    sumState(Sign) AS Visits,
    uniqState(UserID) AS Users
FROM test.visits
GROUP BY StartDate, CounterID;

Selecting data:

SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
    uniqMerge(Users) AS Users
FROM test.mv_visits
GROUP BY StartDate
ORDER BY StartDate;

So, doing session.query(MatViewModel.StartDate, func.sum(MatViewModel.Visits), func.uniq(MatViewModel.Users)).group_by(MatViewModel.StartDate).order_by(MatViewModel.StartDate) should produce the correct SQL syntax.

Yes, I know we can do func.sumMerge, but that is not the solution if I'm trying to transparently change the underlying DB w/o changing my query.
On this line, would you know of a workaround for this? I'm currently exploring on letting SQLAlchemy know how to compile sum for CH.

@HacKanCuBa
Copy link
Author

HacKanCuBa commented Aug 23, 2023

Current workaround:

from sqlalchemy import Numeric, Column


class AggregateFunctionType:
    pass


class AggregateFunctionNumericType(AggregateFunctionType, Numeric):
    pass


class MvVisits(AggregatingMergeTree):
    Visits = Column(AggregateFunctionNumericType(26, 12))
    # ...
from sqlalchemy.sql.functions import sum as sql_sum

@compiles(sql_sum, "clickhouse")
def clickhouse_merge_sum(
    element: sql_sum,
    compiler: SQLCompiler,
    *_: Any,
    **__: Any,
) -> str:
    if isinstance(element.type, AggregateFunctionType):
        element.name = "sumMerge"

    return compiler.visit_function(element)

So doing session.query(func.sum(MvVisits.Visits)) works as expected, issuing sumMerge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant