[Bug] Macro adapter.get_columns_in_relation() doesn't return any columns during unit testing #10139

DmytroSly · 2024-05-14T15:06:20Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

The model I configured a unit test for, relies on the macro adapter.get_columns_in_relation() that I use for one of the models, referenced with the ref() macro. However, the macro adapter.get_columns_in_relation() returns an empty list for a relation during unit testing. This causes the SQL code to compile improperly and fail.

Expected Behavior

The macro adapter.get_columns_in_relation() should return a list of columns for a referenced relation during unit testing.

Steps To Reproduce

Create a SQL model that relies on the list of columns returned by the macro adapter.get_columns_in_relation(), then configure and run a unit test for this model

Relevant log output

No response

Environment

- OS: Windows 10 Enterprise
- Python: 3.11.4
- dbt: 1.8.0

Which database adapter are you using with dbt?

snowflake

Additional Context

I hope it is possible to fix this, since dbt knows for sure the list of columns of all the referenced models. Apart from the input columns provided in the unit test config, dbt injects all other columns with NULL values into the resulting SQL for such a unit test. The input tables for a model, that I am trying to unit test, are really wide - 70-80 columns.

DmytroSly · 2024-05-15T10:03:22Z

FYI
I tried using the override functionality to set the list of columns to be returned by the macro adapter.get_columns_in_relation(), but the problem with the override in my case is that this macro is used two times, and each time for a different model. A different list of columns should be returned for every model, but the override makes it return the same list of columns for different models.

dbeatty10 · 2024-05-16T13:16:12Z

I'm not sure if this is the same underlying issue or not, but I just opened this issue that looks similar:

[Bug] Unit test raising error '<macro name>' is undefined. This can happen when calling a macro that does not exist. #10157

dbeatty10 · 2024-05-22T13:40:59Z

Thanks for reaching out about this @DmytroSly !

Could you provide a simplified version of the models you are having trouble with?

e.g., something like this:

models/model_a.sql

select 1 as id

models/model_b.sql

{%- set columns = adapter.get_columns_in_relation(ref('model_a')) -%}

{% for column in columns %}
  {{ log("Column: " ~ column, info=true) }}
{% endfor %}

select 2 as guid

Elisabeth-Kury · 2024-05-29T09:21:38Z

I ran into the same problem when writing a unit test using the dbt.utils get_columns_in_relation function, hence why I'm adding to this issue, hope thats fine :)
I then wrote a really simplified test model that looks as follows, basically in this example I want to concat all column values except the values for the columns createdat and updatedat.

The model input seeds/test.csv is just a simple test file with one row:

id,name,mail,date,updatedat,createdat
1,John,john.example,2024-05-08,2024-05-01,2024-05-01

models/stg_test.sql

{%- set columns = get_columns_in_relation(ref('test')) -%}
{% set column_names = columns|map(attribute='name')|reject("in", ['createdat','updatedat'])|list %}

select
    *, {{ concat( column_names ) }} as row_hash_value
from {{ ref('test') }}

The dbt run is successful and the compiled code looks fine:

select
     *, id || name || mail || date as row_hash_value
from "my_dwh"."seeds"."stg_test"
    )

However when I run a dbt unit test with the same input as a csv fixture, the test fails with the following error message:

09:04:39    Runtime Error in unit_test logic_unit_test_test (models\b_staging\stg_monta\_stg_monta__unit_tests.yml)
  An error occurred during execution of unit test 'logic_unit_test_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    syntax error at or near "as"
    LINE 42:             *,  as row_hash_value
                             ^

Looking at the compiled code it seems that the dbt unit test is not correctly passing the get_columns_in_relation function or the function is returning nothing:

with __dbt__cte__test as (

-- Fixture for test
select 
    
    cast('1' as integer)
 as id, 
    
    cast('John' as text)
 as name, 
    
    cast('john.example' as text)
 as mail, 
    
    cast('2024-05-08' as date)
 as date, 
    
    cast('2024-05-01' as date)
 as updatedat, 
    
    cast('2024-05-01' as date)
 as createdat
) select
    *,  as row_hash_value
from __dbt__cte__test

Hope this helps, would be great if this was resolved or anyone has an idea how to fix this behaviour.

dbeatty10 · 2024-05-29T14:47:58Z

That is very helpful @Elisabeth-Kury 🤩

Full details for a reproducible example, including unit test definition below.

Reprex

seeds/test.csv

id,name,mail,date,updatedat,createdat
1,John,john.example,2024-05-08,2024-05-01,2024-05-01

models/stg_test.sql

{%- set columns = get_columns_in_relation(ref('test')) -%}
{% set column_names = columns|map(attribute='name')|reject("in", ['createdat','updatedat'])|list %}

select
    *, {{ concat( column_names ) }} as row_hash_value
from {{ ref('test') }}

models/_unit_tests.yml

unit_tests:

  - name: test__stg_test
    model: stg_test
    given:
      - input: ref('test')
        rows:
          - {id: 2, name: Ari, mail: [email protected], date: 2024-01-01, updatedat: 2024-01-01, createdat: 2024-01-01}
    expect:
        rows:
          - {id: 2, name: Ari, mail: [email protected], date: 2024-01-01, updatedat: 2024-01-01, createdat: 2024-01-01, row_hash_value: [email protected]}

dbt seed -s test --full-refresh
dbt run -s stg_test
dbt build  -s stg_test

jtcohen6 · 2024-06-03T08:38:43Z

I think this is another case where actually materializing the fixtures as database objects would work, because then each call to get_columns_in_relation would just go the columns in each relation, without any need to mock different args / return values:

[CT-3033] [Spike] Explore support multiple unit test materialization strategies: CTE vs 'seed'-based #8499

I expect that will be significantly slower in the general case, but it would allow the framework to support more-complex models — so while it shouldn't replace the default, it should be something that users can opt into.

DmytroSly added bug Something isn't working triage labels May 14, 2024

dbeatty10 added the unit tests Issues related to built-in dbt unit testing functionality label May 14, 2024

jlucas91 mentioned this issue May 15, 2024

[Bug] --empty flag generates invalid SQL when running adapter.get_columns_in_relation dbt-labs/dbt-adapters#213

Open

2 tasks

dbeatty10 mentioned this issue May 16, 2024

[Bug] Unit test raising error '<macro name>' is undefined. This can happen when calling a macro that does not exist. #10157

Open

2 tasks

dbeatty10 added awaiting_response and removed triage labels May 22, 2024

github-actions bot added triage and removed awaiting_response labels May 29, 2024

dbeatty10 removed the triage label May 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] Macro adapter.get_columns_in_relation() doesn't return any columns during unit testing #10139

[Bug] Macro adapter.get_columns_in_relation() doesn't return any columns during unit testing #10139

DmytroSly commented May 14, 2024 •

edited

DmytroSly commented May 15, 2024

dbeatty10 commented May 16, 2024

dbeatty10 commented May 22, 2024

Elisabeth-Kury commented May 29, 2024

dbeatty10 commented May 29, 2024

Reprex

jtcohen6 commented Jun 3, 2024

[Bug] Macro adapter.get_columns_in_relation() doesn't return any columns during unit testing #10139

[Bug] Macro adapter.get_columns_in_relation() doesn't return any columns during unit testing #10139

Comments

DmytroSly commented May 14, 2024 • edited

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output

Environment

Which database adapter are you using with dbt?

Additional Context

DmytroSly commented May 15, 2024

dbeatty10 commented May 16, 2024

dbeatty10 commented May 22, 2024

Elisabeth-Kury commented May 29, 2024

dbeatty10 commented May 29, 2024

Reprex

jtcohen6 commented Jun 3, 2024

DmytroSly commented May 14, 2024 •

edited