Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UniqueConstraint is not applied on SQLLens #798

Open
arigaut opened this issue Mar 15, 2024 · 1 comment
Open

UniqueConstraint is not applied on SQLLens #798

arigaut opened this issue Mar 15, 2024 · 1 comment

Comments

@arigaut
Copy link

arigaut commented Mar 15, 2024

Description

uniqueConstraints are not applied to a SQLLens. I am using Snowflake as the target warehouse.

Steps to Reproduce

Lense

{
    "relations": [
{
            "name": ["lenses", "transactions"],
            "query": "SELECT tran_guid, COUNT(1) AS line_item_ct, SUM(value_usd) AS VALUE_USD FROM \"SF_DB\".\"SF_SCHEMA\".\"SALES\" GROUP BY tran_guid",
            "type": "SQLLens",
            "uniqueConstraints": {
                "added": [
                    {
                        "name": "uc",
                        "determinants": ["tran_guid"]
                    }
                ]
            },
            "nonNullConstraints": {
                "added": [
                    "tran_guid",
                    "line_item_ct",
                    "value_usd"
                ]
            }
        }
]}

Mapping

mappingId	transactions
target          :transaction/{tran_guid} a :Transaction; :line_item_ct {line_item_ct}^^xsd:int ; :revenue {value_usd}^^xsd:float .
source          SELECT tran_guid, line_item_ct, value_usd FROM lenses.transactions

SPARQL query

SELECT (SUM(?line_item_ct) AS ?totalLineItems) (SUM(?revenue) AS ?totalSales)
WHERE {
  ?transaction a :Transaction ;
               :revenue ?revenue ;
               :line_item_ct ?line_item_ct  .

Expected behavior:

An efficient SQL query of the type below:

SELECT SUM(V1.line_item_ct), SUM(V1.value_USD)
FROM (SELECT tran_guid, COUNT(1) AS line_item_ct, SUM(value_usd) AS VALUE_USD FROM "SF_DB"."SF_SCHEMA"."SALES" GROUP BY tran_guid) V1

Actual behavior:

Inefficient query that uses costly DISTINCT operations and self-joins.

SELECT SUM(CAST(V4."v1" AS DOUBLE)) AS "sum0", SUM(CAST(V4."v3" AS NUMBER(19))) AS "sum1"
FROM (SELECT DISTINCT CAST(V1.TRAN_GUID AS VARCHAR) AS "v0", CAST(V1.VALUE_USD AS VARCHAR) AS "v1", CAST(V2.LINE_ITEM_CT AS VARCHAR) AS "v3"
FROM (SELECT tran_guid, COUNT(1) AS line_item_ct, SUM(value_usd) AS VALUE_USD FROM "SF_DB"."SF_SCHEMA"."SALES" GROUP BY tran_guid) V1, (SELECT tran_guid, COUNT(1) AS line_item_ct, SUM(value_usd) AS VALUE_USD FROM "SF_DB"."SF_SCHEMA"."SALES" GROUP BY tran_guid) V2
WHERE CAST(V1.TRAN_GUID AS VARCHAR) = CAST(V2.TRAN_GUID AS VARCHAR)
) V4

Versions

Ontop 5.1.1 as a deployed SPARQL endpoint

@bcogrel
Copy link
Member

bcogrel commented Apr 14, 2024

Thanks @arigaut for reporting the issue and sorry for the late answer.

I managed to reproduce it. The unique constraint is correctly added to the lens, but is not used because the datatypes of the lens columns are not known.

By turning the following option on

ontop.allowRetrievingBlackBoxViewMetadataFromDB = true

These datatypes are getting extracted, even if the SQL query in the lens is treated as a "black-box view" as it uses algebra not supported by the current SQL parser.

We will discuss internally if it makes sense to turn that option on by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants