Finding source table for columns #522

Closed Answered by barakalon
slp-oozzyy asked this question in Q&A
Scope is indeed the way to do this:

import sqlglot
from sqlglot import expressions as exp
from sqlglot.optimizer.scope import traverse_scope


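ast = sqlglot.parse_one("SELECT ...")

# Keep only columns whose source in the enclosing scope is a physical table
# (exp.Table) rather than a derived table, subquery, or CTE.
physical_columns = [
    c
    for scope in traverse_scope(ast)
    for c in scope.columns
    if isinstance(scope.sources.get(c.table), exp.Table)
]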

Note that this only works if the columns are qualified, meaning each column is prefixed with its table name or alias. For example, sqlglot can't tell which table `a` and `b` come from in this query without knowing the schema:

unqualified = """
SELECT
  a,
  b
FROM physical_table
JOIN (
  SELECT *
  FROM physical_table2
) AS derived_table
"""

If you're dealing with columns that might …

Answer selected by tobymao