SCD Type 2 from periodic snapshots #2009
Labels
Improvement
Improves existing functionality
Comments
Adding a +1. It would be ideal to have a third model kind that combines the two SCD Type 2 model kinds, where I can specify both the date column (in this case a snapshot date or loaded-at timestamp) and the columns to check for changes. When a change is detected, the date column is used as the change time; if nothing changed, the row is ignored.
PR #1997 adds a new way of maintaining an SCD Type 2 model by detecting changes to the source table's columns.
This issue tracks the idea of extending that behaviour to build an SCD Type 2 model from a table that contains periodic snapshots of the source data.
Imagine a source table that looks like this:
And some periodic process that takes snapshots of this data and makes those snapshots available in another table, e.g.:
These snapshots allow tracking the changes that were made to individual rows (by comparing the values between consecutive snapshots), and they also contain a timestamp that can be used to determine when those changes occurred. As such, an SCD Type 2 dimension can be built from this data, which might look like this:
Ideally, the new `SCD_TYPE_2_BY_COLUMN` model kind would allow specifying a column (`snapshot_date` in this case) as the timestamp to use for determining when a row has changed, instead of using `execution_time`.
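To make the requested behaviour concrete, here is a minimal Python sketch of the logic, not the project's implementation. It walks the snapshots per key in date order, compares the tracked columns, and uses the snapshot's own date (rather than the execution time) as the change timestamp. The function name `build_scd2`, the sample rows, and the `id`, `name`, `price`, `valid_from` and `valid_to` column names are all assumptions made for illustration; only `snapshot_date` comes from the issue text.

```python
from datetime import date
from typing import Any, Dict, Iterable, List, Sequence


def build_scd2(
    snapshots: Iterable[Dict[str, Any]],
    key: str,
    tracked_columns: Sequence[str],
    snapshot_column: str = "snapshot_date",
) -> List[Dict[str, Any]]:
    """Collapse periodic snapshot rows into SCD Type 2 rows (hypothetical helper)."""
    result: List[Dict[str, Any]] = []
    open_rows: Dict[Any, Dict[str, Any]] = {}  # key value -> currently open SCD row

    # Process each key's snapshots in chronological order.
    ordered = sorted(snapshots, key=lambda r: (r[key], r[snapshot_column]))
    for snap in ordered:
        values = {col: snap[col] for col in tracked_columns}
        current = open_rows.get(snap[key])

        # No change in the tracked columns: the snapshot row is ignored.
        if current is not None and all(current[c] == values[c] for c in tracked_columns):
            continue

        # A change was detected: close the open row at the snapshot date...
        if current is not None:
            current["valid_to"] = snap[snapshot_column]

        # ...and open a new row that starts at the same snapshot date.
        new_row = {
            key: snap[key],
            **values,
            "valid_from": snap[snapshot_column],
            "valid_to": None,
        }
        result.append(new_row)
        open_rows[snap[key]] = new_row

    return result


# Made-up snapshot rows, purely for illustration.
snapshots = [
    {"id": 1, "name": "apple", "price": 1.0, "snapshot_date": date(2024, 1, 1)},
    {"id": 1, "name": "apple", "price": 1.0, "snapshot_date": date(2024, 1, 2)},
    {"id": 1, "name": "apple", "price": 1.5, "snapshot_date": date(2024, 1, 3)},
    {"id": 2, "name": "pear", "price": 2.0, "snapshot_date": date(2024, 1, 1)},
]

rows = build_scd2(snapshots, key="id", tracked_columns=["name", "price"])
for row in rows:
    print(row)
```

With this sample data, the unchanged snapshot from 2 January is skipped, and the price change first seen on 3 January closes the earlier row and opens a new one. The sketch deliberately does not handle deletions (a key that disappears from later snapshots); how those should be treated would be a separate design decision.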