Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rule: Unused table in join #5251

Open
3 tasks done
olagjo opened this issue Sep 28, 2023 · 2 comments · May be fixed by #5266
Open
3 tasks done

Rule: Unused table in join #5251

olagjo opened this issue Sep 28, 2023 · 2 comments · May be fixed by #5266
Labels
enhancement New feature or request

Comments

@olagjo
Copy link
Contributor

olagjo commented Sep 28, 2023

Search before asking

  • I searched the issues and found no similar issues.

Description

It would be nice with a rule that warns about unused tables in joins.
I realize that it might need to be opt-in, since there are legitimate reasons to do this, but it is often a bug.

There are already rules for unused aliases and unused CTEs, but first aliasing a CTE, and then joining it in without referring to any of its columns is an edge-case that falls between these rules. I really couldn't find any other issues that matched exactly this use case, but if I overlooked something, I'm sorry!

Use case

If, during the development or the lifetime of a view, you have reduced which columns are selected to the point that a joined-in table is no longer referenced, it would be nice to be warned.

For example if you have had a view

with 
  widget as (
    select * from some_widget_dbt_table
  ),

  inventor as (
    select * from some_inventor_dbt_table
  )

select
  widget.id,
  widget.name,
  inventor.name
from
  widget
  left join inventor
    on widget.inventor_id = inventor.id

And then you decided at some point that the inventor name doesn't need to be in the output any more, and you get

select
  widget.id,
  widget.name,
from
  widget
  left join inventor
    on widget.inventor_id = inventor.id

Here, the join with inventor is no longer necessary, but in a big select-statement, you might easily overlook this.

Dialect

I don't think this is specific to any dialects?
But it is particularly helpful in dbt, where convention often leads you to define your "inputs" as CTEs with aliases, and then use those aliases in the join, immediately making them "used".

Are you willing to work on and submit a PR to address the issue?

  • Yes I am willing to submit a PR!

Code of Conduct

@olagjo olagjo added the enhancement New feature or request label Sep 28, 2023
@fmms
Copy link
Contributor

fmms commented Sep 29, 2023

Good idea, though at least in tsql the qualified column reference with the name of alias is optional. Thus this would only work if all columns are referenced with their qualified names…

@danparizher danparizher linked a pull request Oct 2, 2023 that will close this issue
1 task
@rachel-1
Copy link

Any updates on this? The linked PR from @danparizher looks super close :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants