Spike: Great expectations or dbt expectations #328

Open
aneiderhiser opened this issue Jan 4, 2024 · 0 comments
Comments

@aneiderhiser
Contributor

Problem: Sometimes we merge PRs that have unintended downstream consequences. For example, an upstream join can unexpectedly change the value of a metric or prevent records from being added to a table at all. Currently we have no way to catch these problems until someone finds them through manual data inspection.

This issue is about researching and prototyping a possible solution, probably using either Great Expectations or dbt-expectations. With a tool like these, we should be able to define expectations against a static dataset like the Tuva synthetic data. For example:

  • The readmission rate should be 10.7%
  • Core.condition should have 1537 total records

I made both of these statistics up, but you get the idea. These checks would be built into our pipeline so they run whenever we go to merge a PR. This speeds up our dev cycle time while lowering the probability of errors.
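As a rough illustration of the dbt-expectations route, here's a minimal sketch of schema tests pinned to the static Tuva synthetic input. The model and column names (`core__condition`, a hypothetical one-row `readmissions_summary` model with a `readmission_rate` column) are placeholders, and the numbers are the made-up figures from above:

```yaml
# models/schema.yml -- a sketch assuming the dbt-expectations package is
# installed; model and column names are hypothetical placeholders.
version: 2

models:
  - name: core__condition
    tests:
      # Fail the build if the row count drifts from the known value
      # for the static synthetic input (figure made up, as above).
      - dbt_expectations.expect_table_row_count_to_equal:
          value: 1537

  - name: readmissions_summary   # hypothetical one-row metrics model
    columns:
      - name: readmission_rate
        tests:
          # Allow a small tolerance band around the expected 10.7%.
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0.106
              max_value: 0.108
```

Something like this would run as part of `dbt test` in CI on every PR; because the inputs are static, any change in these values points at the changed SQL rather than new data.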
