Destruction of Algorithm that is presently running violates Vulkan spec - (Presently theoretical) #214

Open
20kdc opened this issue May 3, 2021 · 2 comments

20kdc commented May 3, 2021

Algorithm's destroy function destroys:

  • A Pipeline
  • A DescriptorPool
  • A PipelineLayout

without waiting for completion.

The same function destroys other objects as well, and those may trigger similar errors.

vkDestroyPipeline: VUID-vkDestroyPipeline-pipeline-00765: All submitted commands that refer to pipeline must have completed execution

vkDestroyDescriptorPool: VUID-vkDestroyDescriptorPool-descriptorPool-00303: All submitted commands that refer to descriptorPool (via any allocated descriptor sets) must have completed execution

There is also the theoretical potential for: vkDestroyPipelineLayout: VUID-vkDestroyPipelineLayout-pipelineLayout-02004: pipelineLayout must not have been passed to any vkCmd* command for any command buffers that are still in the recording state when vkDestroyPipelineLayout is called
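
To make the three rules above concrete, here is a minimal sketch that models them without touching the Vulkan API. The state names follow the Vulkan spec's command buffer lifecycle (the Invalid state is left out for brevity). `CmdBufferModel`, `destroyViolations` and `checkedDestroy` are hypothetical names invented for illustration; they are not part of Kompute or Vulkan.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Subset of the command buffer lifecycle states described in the Vulkan spec.
enum class CmdState { Initial, Recording, Executable, Pending };

// Hypothetical stand-in for a command buffer that refers to the Algorithm's
// pipeline, pipeline layout and descriptor sets.
struct CmdBufferModel {
    CmdState state = CmdState::Initial;
};

// Returns the VUIDs quoted in this issue that destroying the Algorithm's
// objects would violate, given the command buffers that refer to them.
std::vector<std::string> destroyViolations(const std::vector<CmdBufferModel>& users) {
    bool anyPending = false;
    bool anyRecording = false;
    for (const auto& cb : users) {
        anyPending = anyPending || cb.state == CmdState::Pending;
        anyRecording = anyRecording || cb.state == CmdState::Recording;
    }

    std::vector<std::string> violations;
    if (anyPending) {
        // Submitted commands that refer to these objects have not completed.
        violations.push_back("VUID-vkDestroyPipeline-pipeline-00765");
        violations.push_back("VUID-vkDestroyDescriptorPool-descriptorPool-00303");
    }
    if (anyRecording) {
        // The layout was passed to a command buffer that is still recording.
        violations.push_back("VUID-vkDestroyPipelineLayout-pipelineLayout-02004");
    }
    return violations;
}

// Debug-build guard a destroy path could call before releasing anything.
inline void checkedDestroy(const std::vector<CmdBufferModel>& users) {
    assert(destroyViolations(users).empty());
}
```

In this model, destruction is only clean when no command buffer that refers to the objects is in the pending or recording state, which is the condition the destroy function currently does not check or wait for.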

@axsaucedo axsaucedo added this to To do in 0.8.0 via automation May 3, 2021
@axsaucedo axsaucedo added the bug Something isn't working label May 3, 2021
@axsaucedo
Member

OK, it seems that this test:

TEST(TestAsyncOperations, TestManagerAsyncExecutionDestroyDescriptors)
{
    {
        uint32_t size = 10;

        std::string shader(R"(
            #version 450

            layout (local_size_x = 1) in;

            layout(set = 0, binding = 0) buffer b { float pb[]; };

            shared uint sharedTotal[1];

            void main() {
                uint index = gl_GlobalInvocationID.x;

                sharedTotal[0] = 0;

                for (int i = 0; i < 100000000; i++)
                {
                    atomicAdd(sharedTotal[0], 1);
                }

                pb[index] = sharedTotal[0];
            }
        )");

        std::vector<uint32_t> spirv = kp::Shader::compileSource(shader);

        std::vector<float> data(size, 0.0);
        std::vector<float> resultAsync(size, 100000000);

        kp::Manager mgr;

        std::shared_ptr<kp::TensorT<float>> tensorA = mgr.tensor(data);
        std::shared_ptr<kp::TensorT<float>> tensorB = mgr.tensor(data);

        std::shared_ptr<kp::Sequence> sq1 = mgr.sequence();
        std::shared_ptr<kp::Sequence> sq2 = mgr.sequence();

        sq1->eval<kp::OpTensorSyncLocal>({ tensorA, tensorB });

        std::shared_ptr<kp::Algorithm> algo1 = mgr.algorithm({ tensorA }, spirv);
        std::shared_ptr<kp::Algorithm> algo2 = mgr.algorithm({ tensorB }, spirv);

        // AMD Drivers in Windows may see an error in this line due to timeout.
        // In order to fix this, it requires a change on Windows registries.
        // More details on this can be found here: https://docs.substance3d.com/spdoc/gpu-drivers-crash-with-long-computations-128745489.html
        // Context on solution discussed in github: https://github.com/EthicalML/vulkan-kompute/issues/196#issuecomment-808866505
        sq1->evalAsync<kp::OpAlgoDispatch>(algo1);
        sq2->evalAsync<kp::OpAlgoDispatch>(algo2);
    }
}

Can reproduce the following validation errors, followed by a segmentation fault:

[2021-05-03 17:14:22.963] [debug] [Sequence.cpp:28] Kompute Sequence Destructor started
[2021-05-03 17:14:22.963] [debug] [Sequence.cpp:208] Kompute Sequence destroy called
[2021-05-03 17:14:22.963] [info] [Sequence.cpp:217] Freeing CommandBuffer
[2021-05-03 17:14:22.964] [debug] [Manager.cpp:25] [VALIDATION]: Validation - Validation Error: [ VUID-vkFreeCommandBuffers-pCommandBuffers-00047 ] Object 0: handle = 0x55f527277820, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x1ab902fc | Attempt to free VkCommandBuffer 0x55f527277820[] which is in use. The Vulkan spec states: All elements of pCommandBuffers must not be in the pending state (https://vulkan.lunarg.com/doc/view/1.2.148.0/linux/1.2-extensions/vkspec.html#VUID-vkFreeCommandBuffers-pCommandBuffers-00047)
[2021-05-03 17:14:22.964] [debug] [Sequence.cpp:229] Kompute Sequence Freed CommandBuffer
[2021-05-03 17:14:22.964] [info] [Sequence.cpp:233] Destroying CommandPool
[2021-05-03 17:14:22.964] [debug] [Sequence.cpp:246] Kompute Sequence Destroyed CommandPool
[2021-05-03 17:14:22.964] [info] [Sequence.cpp:250] Kompute Sequence clearing operations buffer
[2021-05-03 17:14:22.964] [debug] [OpAlgoDispatch.cpp:18] Kompute OpAlgoDispatch destructor started
[2021-05-03 17:14:22.964] [debug] [Algorithm.cpp:33] Kompute Algorithm Destructor started
[2021-05-03 17:14:22.964] [debug] [Algorithm.cpp:84] Kompute Algorithm Destroying pipeline
[2021-05-03 17:14:22.964] [debug] [Manager.cpp:25] [VALIDATION]: Validation - Validation Error: [ VUID-vkDestroyPipeline-pipeline-00765 ] Object 0: handle = 0x55f527353328, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bdce5fd | Cannot call vkDestroyPipeline on VkPipeline 0x1a000000001a[] that is currently in use by a command buffer. The Vulkan spec states: All submitted commands that refer to pipeline must have completed execution (https://vulkan.lunarg.com/doc/view/1.2.148.0/linux/1.2-extensions/vkspec.html#VUID-vkDestroyPipeline-pipeline-00765)
[2021-05-03 17:14:22.964] [debug] [Algorithm.cpp:96] Kompute Algorithm Destroying pipeline cache
[2021-05-03 17:14:22.965] [debug] [Algorithm.cpp:108] Kompute Algorithm Destroying pipeline layout
[2021-05-03 17:14:22.965] [debug] [Algorithm.cpp:120] Kompute Algorithm Destroying shader module
[2021-05-03 17:14:22.965] [debug] [Algorithm.cpp:146] Kompute Algorithm Destroying Descriptor Set Layout
[2021-05-03 17:14:22.965] [debug] [Algorithm.cpp:158] Kompute Algorithm Destroying Descriptor Pool
[2021-05-03 17:14:22.965] [debug] [Manager.cpp:25] [VALIDATION]: Validation - Validation Error: [ VUID-vkDestroyDescriptorPool-descriptorPool-00303 ] Object 0: handle = 0x55f527353328, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x4dad1ae8 | Cannot call vkDestroyDescriptorPool on VkDescriptorPool 0x140000000014[] that is currently in use by a command buffer. The Vulkan spec states: All submitted commands that refer to descriptorPool (via any allocated descriptor sets) must have completed execution (https://vulkan.lunarg.com/doc/view/1.2.148.0/linux/1.2-extensions/vkspec.html#VUID-vkDestroyDescriptorPool-descriptorPool-00303)
[2021-05-03 17:14:22.965] [debug] [OpBase.hpp:28] Kompute OpBase destructor started
[2021-05-03 17:14:22.965] [debug] [Sequence.cpp:28] Kompute Sequence Destructor started
[2021-05-03 17:14:22.965] [debug] [Sequence.cpp:208] Kompute Sequence destroy called
[2021-05-03 17:14:22.966] [info] [Sequence.cpp:217] Freeing CommandBuffer
[2021-05-03 17:14:22.966] [debug] [Manager.cpp:25] [VALIDATION]: Validation - Validation Error: [ VUID-vkFreeCommandBuffers-pCommandBuffers-00047 ] Object 0: handle = 0x55f527275a30, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x1ab902fc | Attempt to free VkCommandBuffer 0x55f527275a30[] which is in use. The Vulkan spec states: All elements of pCommandBuffers must not be in the pending state (https://vulkan.lunarg.com/doc/view/1.2.148.0/linux/1.2-extensions/vkspec.html#VUID-vkFreeCommandBuffers-pCommandBuffers-00047)
[2021-05-03 17:14:22.966] [debug] [Sequence.cpp:229] Kompute Sequence Freed CommandBuffer
[2021-05-03 17:14:22.966] [info] [Sequence.cpp:233] Destroying CommandPool
[2021-05-03 17:14:22.966] [debug] [Sequence.cpp:246] Kompute Sequence Destroyed CommandPool
[2021-05-03 17:14:22.966] [info] [Sequence.cpp:250] Kompute Sequence clearing operations buffer
[2021-05-03 17:14:22.966] [debug] [OpTensorSyncLocal.cpp:23] Kompute OpTensorSyncLocal destructor started
[2021-05-03 17:14:22.966] [debug] [OpBase.hpp:28] Kompute OpBase destructor started
[2021-05-03 17:14:22.966] [debug] [OpAlgoDispatch.cpp:18] Kompute OpAlgoDispatch destructor started
[2021-05-03 17:14:22.966] [debug] [Algorithm.cpp:33] Kompute Algorithm Destructor started
[2021-05-03 17:14:22.966] [debug] [Algorithm.cpp:84] Kompute Algorithm Destroying pipeline
[2021-05-03 17:14:22.967] [debug] [Manager.cpp:25] [VALIDATION]: Validation - Validation Error: [ VUID-vkDestroyPipeline-pipeline-00765 ] Object 0: handle = 0x55f527353328, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bdce5fd | Cannot call vkDestroyPipeline on VkPipeline 0x130000000013[] that is currently in use by a command buffer. The Vulkan spec states: All submitted commands that refer to pipeline must have completed execution (https://vulkan.lunarg.com/doc/view/1.2.148.0/linux/1.2-extensions/vkspec.html#VUID-vkDestroyPipeline-pipeline-00765)
[2021-05-03 17:14:22.968] [debug] [Algorithm.cpp:96] Kompute Algorithm Destroying pipeline cache
Makefile:90: recipe for target 'mk_run_tests' failed
make: *** [mk_run_tests] Segmentation fault
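
The log shows objects being released while the asynchronous dispatch is still pending. One possible direction, sketched here under loud assumptions, is to have the destroy path wait for in-flight work before releasing anything. The sketch uses only the C++ standard library: a `std::future` plays the role a fence would play in Vulkan (where the wait would be something like `vkWaitForFences`), and `AlgorithmModel` is a hypothetical class, not Kompute's `kp::Algorithm`. It illustrates the ordering only and is not a proposed patch.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for an object that owns resources used by async work.
class AlgorithmModel {
public:
    // Simulates an asynchronous dispatch. The stored future acts like a fence
    // that becomes ready once the "GPU work" has finished.
    void dispatchAsync(std::chrono::milliseconds work) {
        waitIdle(); // only one dispatch in flight at a time in this model
        mInFlight = std::async(std::launch::async,
                               [work] { std::this_thread::sleep_for(work); });
    }

    // True while previously submitted work has not completed.
    bool inFlight() const {
        return mInFlight.valid() &&
               mInFlight.wait_for(std::chrono::seconds(0)) != std::future_status::ready;
    }

    // Blocks until submitted work, if any, has completed.
    void waitIdle() {
        if (mInFlight.valid()) {
            mInFlight.get();
        }
    }

    // Waits first, then releases; the assert documents the required ordering.
    void destroy() {
        waitIdle();
        assert(!inFlight());
        mDestroyed = true; // stands in for destroying pipeline, pool, layout
    }

    bool destroyed() const { return mDestroyed; }

    ~AlgorithmModel() { destroy(); }

private:
    std::future<void> mInFlight;
    bool mDestroyed = false;
};
```

With this ordering, letting the object go out of scope right after `dispatchAsync` blocks until the work completes rather than releasing resources underneath it. Whether blocking in a destructor is acceptable for Kompute is a design question this sketch does not settle.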

20kdc commented May 3, 2021

Yes, VUID-vkDestroyPipelineLayout-pipelineLayout-02004 only applies if someone left a sequence in the recording state.
