Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(SIP-95): permissions for catalogs #28317

Merged
merged 3 commits into from May 6, 2024
Merged

Conversation

betodealmeida
Copy link
Member

@betodealmeida betodealmeida commented May 2, 2024

SUMMARY

Continuing the work on #22862. This PR:

  • Adds catalog support to the Postgres DB engine spec (in Postgres, a catalog is called a "database").
  • Adds a new endpoint /catalogs/ under the DB API, to list catalogs.
  • Adds an optional parameter catalog to the /schemas/ DB API, to specify a non-default catalog.
  • Updates the security manager and model to manage catalog-level permissions (automatically creating, renaming, and deleting them, similar to how we do for for schemas).

These are all backend changes that shouldn't affect the current behavior of Superset, other than introducing the new permissions. Note that as of today users can query different catalogs in SQL Lab, and even create (virtual) datasets in different catalogs, and our SQL parser is able to handle catalogs correctly; because of that, introducing these new permissions can be considered a security improvement.

In the next PR I'll start working on the UI to expose the catalog functionality, and enabling support for multi-catalog in a given database will be optional and default to false.

One thing that I'm not sure about is when we should rename existing schema permissions from [db].[schema] to [db].[default-catalog].[schema]. In this PR I'm doing it when the database gets updated. I don't think doing it in a migration is a good solution, because we'd have to add a new migration every time a DB engine spec is updated to support catalogs. Another option would be to do it during superset init@dpgaspar do you have any thoughts here?

Edit: added a migration for the schema permissions, done by a helper function, and a comment in the base DB engine spec informing that when supports_catalog is set to True a migration is necessary.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Screen Shot 2024-05-01 at 1 34 33 PM

TESTING INSTRUCTIONS

I've updated existing tests, and I'm adding new tests for all the changes. The PR is draft so it's easier to track which changes have been covered with tests.

ADDITIONAL INFORMATION

  • Has associated issue: [SIP-95] Proposal for Catalog Support in SQL Lab #22862
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@github-actions github-actions bot added risk:db-migration PRs that require a DB migration api Related to the REST API labels May 2, 2024
@@ -127,7 +127,7 @@ class BigQueryEngineSpec(BaseEngineSpec): # pylint: disable=too-many-public-met

allows_hidden_cc_in_orderby = True

supports_catalog = True
supports_catalog = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to false here and in other DB engine specs, since the specs should only have it set to true if they implement their own get_default_catalog. I'll update the DB engine specs in the near future.

Comment on lines +122 to +124
should_cache = kwargs.pop("cache", True)
force = kwargs.pop("force", False)
cache_timeout = kwargs.pop("cache_timeout", 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current way memoized_func works is that the decorated function needs to take these parameters, even though they are only used by the decorator. This makes the function tightly coupled with the decorator, which is bad design.

I've fixed the decorator to pop these attributes, and removed them from the decorated functions.

@betodealmeida betodealmeida force-pushed the postgres-catalog branch 2 times, most recently from 48df814 to 02ad2c0 Compare May 2, 2024 12:33
@betodealmeida betodealmeida force-pushed the postgres-catalog branch 12 times, most recently from 9578fdc to 2e3b467 Compare May 3, 2024 20:14
@betodealmeida betodealmeida force-pushed the postgres-catalog branch 2 times, most recently from d56dafd to f4c3734 Compare May 3, 2024 21:16
@betodealmeida betodealmeida marked this pull request as ready for review May 3, 2024 21:42
@betodealmeida betodealmeida requested a review from a team as a code owner May 3, 2024 21:42
@betodealmeida betodealmeida force-pushed the postgres-catalog branch 3 times, most recently from 07d3c3e to 89834d0 Compare May 5, 2024 02:03
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I vote for db migrations instead of superset init

superset/databases/api.py Outdated Show resolved Hide resolved
@betodealmeida betodealmeida force-pushed the postgres-catalog branch 2 times, most recently from 634a213 to 79fde57 Compare May 6, 2024 15:05
@betodealmeida betodealmeida merged commit e90246f into master May 6, 2024
30 checks passed
@rusackas rusackas deleted the postgres-catalog branch May 6, 2024 15:45
"""
Get all schemas from database

:param inspector: SqlAlchemy inspector
:return: All schemas in the database
"""
return sorted(inspector.get_schema_names())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betodealmeida @dpgaspar et al. the reason I believe this was a sorted list (as opposed to a set) was to ensure the results were always consistently rendered in SQL Lab et al. I don't disagree that a set is a more appropriate container, but it's contents aren't ordered. Maybe we should being using the ordered-set package for improved transparency.

CC @michael-s-molina

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks! I'll update the logic in SQL Lab to show it sorted there then.

imancrsrk pushed a commit to imancrsrk/superset that referenced this pull request May 10, 2024
jzhao62 pushed a commit to jzhao62/superset that referenced this pull request May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Related to the REST API preset-io risk:db-migration PRs that require a DB migration size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants