Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AAD provider performance by directly invoking Graph APIs for Cmdlets with slow load times #1094

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

tkol2022
Copy link
Collaborator

@tkol2022 tkol2022 commented May 10, 2024

🗣 Description

This code enhancement of the AAD provider improves the performance of calls to Powershell cmdlets that are part of the Microsoft.Graph.Beta.Identity.Governance module. The enhancement replaces calls to those cmdlets with direct calls to the respective MS Graph REST APIs. The reason this works to cut down the execution time is that we found that the loading of the Identity.Governance module into memory when ScubaGear first runs in a fresh Powershell instance, is commonly a slow activity, in some cases taking over a minute just to load. By calling MS Graph APIs directly we bypass the penalty associated with loading the module into memory. Based on limited testing running in a virtual machine this enhancement resulted in a 42% reduction in execution time against our E5 tenant and a 35% reduction against our G5 tenant.

closes #816

💭 Motivation and context

The AAD provider takes considerably longer to run compared to the other providers and in general runs slowly sometimes taking several minutes to complete. This creates delays for our development team who frequently run the code and have to perform testing and it slows down the automated processes that execute the code as well. Although we haven't received requests from end users to improve the performance, a positive side benefit is that their experience will improve when using the tool as well.

🧪 Testing

These code changes were tested against the E5, G5, G3 and GCC High tenants.

✅ Pre-approval checklist

  • This PR has an informative and human-readable title.
  • PR targets the correct parent branch (e.g., main or release-name) for merge.
  • Changes are limited to a single goal - eschew scope creep!
  • Changes are sized such that they do not touch excessive number of files.
  • All future TODOs are captured in issues, which are referenced in code comments.
  • These code changes follow the ScubaGear content style guide.
  • Related issues these changes resolve are linked preferably via closing keywords.
  • All relevant type-of-change labels added.
  • All relevant project fields are set.
  • All relevant repo and/or project documentation updated to reflect these changes.
  • Unit tests added/updated to cover PowerShell and Rego changes.
  • Functional tests added/updated to cover PowerShell and Rego changes.
  • All relevant functional tests passed.
  • All automated checks (e.g., linting, static analysis, unit/smoke tests) passed.

✅ Pre-merge checklist

  • PR passed smoke test check.

  • Feature branch has been rebased against changes from parent branch, as needed

    Use Rebase branch button below or use this reference to rebase from the command line.

  • Resolved all merge conflicts on branch

  • Notified merge coordinator that PR is ready for merge via comment mention

✅ Post-merge checklist

  • Feature branch deleted after merge to clean up repository.
  • Verified that all checks pass on parent branch (e.g., main or release-name) after merge.

@tkol2022 tkol2022 added the enhancement This issue or pull request will add new or improve existing functionality label May 10, 2024
@tkol2022
Copy link
Collaborator Author

tkol2022 commented May 10, 2024

Items Left to Complete

  • Remove dependency on Microsoft.Graph.Beta.Identity.Governance from RequiredVersions.ps1 & UninstallModules.ps1

  • Modify the functional tests in AAD file aad.g5.testplan.yaml to create JSON compatible with the new Rego fields for policies 7.4 & 7.5. In the Rego, the field EndDateTime was changed to endDateTime and StartDateTime was changed to startDateTime since the field that come back directly from the REST APIs start with lower case letters.

  • Update the unit tests for 7.4 & 7.5 due to the renaming of the fields as described in the prior action item

  • Run the tool with the current ScubaGear code from main branch and capture HTML output to compare the output against the 816 enhancement branch. Verify that the enhanced code produces the same policy output results for all section 7 policies in all the test tenants.

  • Run the main branch code against AAD and then run the 816 branch and log the execution times the spreadsheet for the Tenant 2 (G3) and Tenant 6 (GCC high)
    AAD before after replacement Identity.Governance.xlsx

@tkol2022 tkol2022 marked this pull request as ready for review May 29, 2024 14:20
@twneale twneale force-pushed the 816-performance-aad-provider branch from 14da017 to d18303b Compare May 29, 2024 14:44
@tkol2022 tkol2022 requested review from gdasher and adhilto May 29, 2024 15:15
@tkol2022 tkol2022 changed the title 816 performance aad provider Improve AAD provider performance by directly invoking Graph APIs for Cmdlets with slow load times May 29, 2024
@tkol2022 tkol2022 requested review from mitchelbaker-cisa and removed request for adhilto May 30, 2024 18:14
@twneale twneale force-pushed the 816-performance-aad-provider branch from bee8d47 to 2c19060 Compare May 31, 2024 18:13
@twneale twneale requested a review from gdasher May 31, 2024 19:30
@tkol2022 tkol2022 requested a review from adhilto June 3, 2024 15:53
@tkol2022
Copy link
Collaborator Author

tkol2022 commented Jun 3, 2024

For reviewers: How to test the performance improvement

The performance improvement when running AAD is mostly noticeable when the tool executes for the first time in a newly spawned Powershell window. This is because the slowness that we observed in the Microsoft.Graph.Beta.Identity.Governance dll was during its load into memory, which occurs only the first time you call one of its Cmdlets in a new Powershell window.
The steps below describe how you can test to ensure that you also see an improvement.

Use this code to take a time measurement (tailor the parameter values to your environment):

Import-Module -Name .\PowerShell\ScubaGear
$ExecutionTime = Measure-Command { Invoke-Scuba -ProductNames aad -CertificateThumbPrint "33dd41a761f434f12ee8ac3ff771347597435585" -AppID "ac691fde-1ff0-493f-be3d-03d9ba6cb891" -Organization "TEST_TENANT.onmicrosoft.com" }
$ExecutionTime
  1. Download a copy of the main branch and run the measurement code above 5-10 times. Remember to take each measurement in a new Powershell window. Compute an average time.
  2. Download a copy of the branch associated with this PR and repeat step 1.
  3. Log the average time of the main branch and this branch in this PR.

Copy link
Collaborator

@gdasher gdasher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. One remaining area around buildilng the Uri and dispatching to the different environments.

@schrolla schrolla added this to the Iceberg milestone Jun 3, 2024
Copy link
Collaborator

@adhilto adhilto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My two cents: this is faster. I ran AAD once on this branch and once on main: 108 seconds on this branch and 178 on main, so over a minute faster. And manual inspection of the two reports showed no difference in the output, as expected.

One minor piece of feedback beyond what has already been commented on: a lot of the new code does not use PascalCase, per our style guide. Not a big deal but would be nice to have for consistency.

@twneale twneale requested review from adhilto and gdasher June 7, 2024 14:19
@twneale
Copy link
Collaborator

twneale commented Jun 7, 2024

Here are the run durations I recorded:

durationtimes.xlsx

Of special note: many times when I ran the tests on either branch, invoke-scuba impossibly exited fine within 1-5 seconds. I need to disclose that I thoroughly perjured the scientific integrity of these test by completely ignoring those runs. I don't know how they went so fast.

The conclusion: for some reason, running in Tenant 2 conferred a minor performance advantage. In gcchigh it was about 60% faster on average.

@tkol2022
Copy link
Collaborator Author

tkol2022 commented Jun 7, 2024

Here are the run durations I recorded:

durationtimes.xlsx

Of special note: many times when I ran the tests on either branch, invoke-scuba impossibly exited fine within 1-5 seconds. I need to disclose that I thoroughly perjured the scientific integrity of these test by completely ignoring those runs. I don't know how they went so fast.

The conclusion: for some reason, running in Tenant 2 conferred a minor performance advantage. In gcchigh it was about 60% faster on average.

Make sure that when you run the timing, each timing has to be done in a fresh PS window because it is the loading of the dependency DLL which accounts for the time we think we are saving with our new code. The DLL loading only occurs when you first Invoke-Scuba the first time in a new window.

@twneale twneale force-pushed the 816-performance-aad-provider branch from 8cf39d6 to e178ce6 Compare June 7, 2024 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This issue or pull request will add new or improve existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

prototype a performance redesign of the AAD provider
5 participants