Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Erroneous behaviour when cancelling a program #134

Open
jaxxstorm opened this issue Apr 7, 2023 · 2 comments
Open

Erroneous behaviour when cancelling a program #134

jaxxstorm opened this issue Apr 7, 2023 · 2 comments
Labels
customer/feedback Feedback from customers kind/enhancement Improvements or new features

Comments

@jaxxstorm
Copy link
Contributor

What happened?

When running a pulumi program, especially with automation API, sending a single Ctrl+C to the execution hangs the program indefinitely

Expected Behavior

When sending a Ctrl+C to a Pulumi invocation, I'd expect the program to try and gracefully shut down and terminate

Steps to reproduce

Customer sent the following program that exhibits this behaviour

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Threading;
using System.Threading.Tasks;
using McMaster.Extensions.CommandLineUtils;
using Pulumi;
using Pulumi.Automation;
using Pulumi.Aws;
using Pulumi.Aws.S3;

namespace Snap;

[Command("pulumi", Description = "Deploy Pulumi stack")]
public class DeployPulumiCommand
{
    [Required]
    [Option("--project-name",
        "Pulumi project name",
        CommandOptionType.SingleValue,
        ShowInHelpText = true)]
    private string ProjectName { get; set; }

    public async Task<int> OnExecuteAsync(CancellationToken cancellationToken)
    {
        PulumiFn program = PulumiFn.Create(async () =>
        {
            var region = "us-west-2";
            var provider = new Provider(region, new()
            {
                Region = region,
            });

            var bucket = new Bucket("some-bucket", BucketArgs.Empty, new CustomResourceOptions { Provider = provider });
        });

        return await UpdatePulumiAsync(program, ProjectName, "test-stack", cancellationToken);
    }

    private static async Task<int> UpdatePulumiAsync(PulumiFn program, string projectName, string stackName, CancellationToken cancellationToken)
    {
        var stack = await SetupAsync(projectName, stackName, program, cancellationToken);

        Console.WriteLine("updating stack...");
        var result = await stack.UpAsync(new UpOptions
        {
            OnStandardOutput = Console.WriteLine,
            Parallel = Int32.MaxValue,
            Color = "always",
        }, cancellationToken);

        if (result.Summary.ResourceChanges != null)
        {
            Console.WriteLine("update summary:");
            foreach (var change in result.Summary.ResourceChanges)
                Console.WriteLine($"    {change.Key}: {change.Value}");
        }

        return 0;
    }

    private static async Task<WorkspaceStack> SetupAsync(string projectName, string stackName, PulumiFn program, CancellationToken cancellationToken)
    {
        var stackArgs = new InlineProgramArgs(projectName, stackName, program);
        var plugins = new Dictionary<string, string>
        {
            { "aws", "v4.24.1" },
        };
        var config = new Dictionary<string, ConfigValue>
        {
            { "aws:region", new ConfigValue("us-west-2") },
            { "aws-native:region", new ConfigValue("us-west-2") },
        };

        var stack = await LocalWorkspace.CreateOrSelectStackAsync(stackArgs, cancellationToken);
        Console.WriteLine("successfully initialized stack");

        Console.WriteLine("installing plugins...");
        foreach (var kvp in plugins)
        {
            await stack.Workspace.InstallPluginAsync(kvp.Key, kvp.Value, cancellationToken: cancellationToken);
        }
        Console.WriteLine("plugins installed");

        Console.WriteLine("setting up config...");
        foreach (var kvp in config)
        {
            await stack.SetConfigAsync(kvp.Key, kvp.Value, cancellationToken: cancellationToken);
        }
        Console.WriteLine("config set");

        Console.WriteLine("refreshing stack...");
        await stack.RefreshAsync(new RefreshOptions { OnStandardOutput = Console.WriteLine }, cancellationToken);
        Console.WriteLine("refresh complete");

        return stack;
    }
}

Output of pulumi about

N/A

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@jaxxstorm jaxxstorm added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Apr 7, 2023
@justinvp justinvp transferred this issue from pulumi/pulumi Apr 10, 2023
@justinvp justinvp removed the needs-triage Needs attention from the triage team label Apr 10, 2023
@justinvp
Copy link
Member

Potentially related to #124

@justinvp
Copy link
Member

Need to actually try it out, but just from looking, it looks like McMaster.Extensions.CommandLineUtils is handling CTRL+C:

https://github.com/natemcmaster/CommandLineUtils/blob/81199c7ec68367ea9612be55719dc1fe08f658da/src/Hosting.CommandLine/Internal/CommandLineLifetime.cs#L83-L88

Is the parent process handling and cancelling the SIGINT, but child processes still being killed?

@justinvp justinvp added the customer/feedback Feedback from customers label Dec 11, 2023
tgummerer added a commit to pulumi/pulumi that referenced this issue Dec 13, 2023
These tests are pretty close to hitting the timeout during regular
runs (e.g. one successful run I picked randomly took ~194000ms).  The
GitHub action runners don't have super reliable performance, so we can
easily get pushed over this limit.

To make matters worse here, if we hit this timeout just at the right
time, the test doesn't exit cleanly (potentially related to
pulumi/pulumi-dotnet#134), as I've seen a
`pulumi preview` process stick around in that case, preventing the
tests from shutting down).  This means we have to wait until the CI
job times out after an hour until the failure is reported.

I'm not sure if we only got close to this timeout recently, or if this
is a longer standing issue, but it reproduces well for me locally.
Ideally of course the tests would be faster, but this is still "only"
~5min which is much faster than other tests, and should hopefully
reduce the amount of times we need to go through the merge queue,
saving a lot of time there.
github-merge-queue bot pushed a commit to pulumi/pulumi that referenced this issue Dec 13, 2023
These tests are pretty close to hitting the timeout during regular runs
(e.g. one successful run I picked randomly took ~194000ms). The GitHub
action runners don't have super reliable performance, so we can easily
get pushed over this limit.

To make matters worse here, if we hit this timeout just at the right
time, the test doesn't exit cleanly (potentially related to
pulumi/pulumi-dotnet#134), as I've seen a
`pulumi preview` process stick around in that case, preventing the tests
from shutting down). This means we have to wait until the CI job times
out after an hour until the failure is reported.

I'm not sure if we only got close to this timeout recently, or if this
is a longer standing issue, but it reproduces well for me locally.
Ideally of course the tests would be faster, but this is still "only"
~5min which is much faster than other tests, and should hopefully reduce
the amount of times we need to go through the merge queue, saving a lot
of time there.

Fixes #14842

## Checklist

- [ ] I have run `make tidy` to update any new dependencies
- [ ] I have run `make lint` to verify my code passes the lint check
  - [ ] I have formatted my code using `gofumpt`

<!--- Please provide details if the checkbox below is to be left
unchecked. -->
- [ ] I have added tests that prove my fix is effective or that my
feature works
<!--- 
User-facing changes require a CHANGELOG entry.
-->
- [ ] I have run `make changelog` and committed the
`changelog/pending/<file>` documenting my change
<!--
If the change(s) in this PR is a modification of an existing call to the
Pulumi Cloud,
then the service should honor older versions of the CLI where this
change would not exist.
You must then bump the API version in
/pkg/backend/httpstate/client/api.go, as well as add
it to the service.
-->
- [ ] Yes, there are changes in this PR that warrants bumping the Pulumi
Cloud API version
<!-- @pulumi employees: If yes, you must submit corresponding changes in
the service repo. -->
github-merge-queue bot pushed a commit to pulumi/pulumi that referenced this issue Dec 13, 2023
These tests are pretty close to hitting the timeout during regular runs
(e.g. one successful run I picked randomly took ~194000ms). The GitHub
action runners don't have super reliable performance, so we can easily
get pushed over this limit.

To make matters worse here, if we hit this timeout just at the right
time, the test doesn't exit cleanly (potentially related to
pulumi/pulumi-dotnet#134), as I've seen a
`pulumi preview` process stick around in that case, preventing the tests
from shutting down). This means we have to wait until the CI job times
out after an hour until the failure is reported.

I'm not sure if we only got close to this timeout recently, or if this
is a longer standing issue, but it reproduces well for me locally.
Ideally of course the tests would be faster, but this is still "only"
~5min which is much faster than other tests, and should hopefully reduce
the amount of times we need to go through the merge queue, saving a lot
of time there.

Fixes #14842

## Checklist

- [ ] I have run `make tidy` to update any new dependencies
- [ ] I have run `make lint` to verify my code passes the lint check
  - [ ] I have formatted my code using `gofumpt`

<!--- Please provide details if the checkbox below is to be left
unchecked. -->
- [ ] I have added tests that prove my fix is effective or that my
feature works
<!--- 
User-facing changes require a CHANGELOG entry.
-->
- [ ] I have run `make changelog` and committed the
`changelog/pending/<file>` documenting my change
<!--
If the change(s) in this PR is a modification of an existing call to the
Pulumi Cloud,
then the service should honor older versions of the CLI where this
change would not exist.
You must then bump the API version in
/pkg/backend/httpstate/client/api.go, as well as add
it to the service.
-->
- [ ] Yes, there are changes in this PR that warrants bumping the Pulumi
Cloud API version
<!-- @pulumi employees: If yes, you must submit corresponding changes in
the service repo. -->
github-merge-queue bot pushed a commit to pulumi/pulumi that referenced this issue Dec 13, 2023
These tests are pretty close to hitting the timeout during regular runs
(e.g. one successful run I picked randomly took ~194000ms). The GitHub
action runners don't have super reliable performance, so we can easily
get pushed over this limit.

To make matters worse here, if we hit this timeout just at the right
time, the test doesn't exit cleanly (potentially related to
pulumi/pulumi-dotnet#134), as I've seen a
`pulumi preview` process stick around in that case, preventing the tests
from shutting down). This means we have to wait until the CI job times
out after an hour until the failure is reported.

I'm not sure if we only got close to this timeout recently, or if this
is a longer standing issue, but it reproduces well for me locally.
Ideally of course the tests would be faster, but this is still "only"
~5min which is much faster than other tests, and should hopefully reduce
the amount of times we need to go through the merge queue, saving a lot
of time there.

Fixes #14842

## Checklist

- [ ] I have run `make tidy` to update any new dependencies
- [ ] I have run `make lint` to verify my code passes the lint check
  - [ ] I have formatted my code using `gofumpt`

<!--- Please provide details if the checkbox below is to be left
unchecked. -->
- [ ] I have added tests that prove my fix is effective or that my
feature works
<!--- 
User-facing changes require a CHANGELOG entry.
-->
- [ ] I have run `make changelog` and committed the
`changelog/pending/<file>` documenting my change
<!--
If the change(s) in this PR is a modification of an existing call to the
Pulumi Cloud,
then the service should honor older versions of the CLI where this
change would not exist.
You must then bump the API version in
/pkg/backend/httpstate/client/api.go, as well as add
it to the service.
-->
- [ ] Yes, there are changes in this PR that warrants bumping the Pulumi
Cloud API version
<!-- @pulumi employees: If yes, you must submit corresponding changes in
the service repo. -->
@lukehoban lukehoban added kind/enhancement Improvements or new features and removed kind/bug Some behavior is incorrect or out of spec labels Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
customer/feedback Feedback from customers kind/enhancement Improvements or new features
Projects
None yet
Development

No branches or pull requests

3 participants