
Cannot create tensor without initialization data #295

Open
cracy3m opened this issue Aug 4, 2022 · 11 comments

Comments

@cracy3m

cracy3m commented Aug 4, 2022

Is there a way to create a Tensor without initialization data?
Setting the data argument of the kp::Tensor() constructor to nullptr raises an exception, because the constructor always calls memcpy(..., data, ...) internally.

```cpp
kp::Tensor(
    std::shared_ptr<vk::PhysicalDevice> physicalDevice,
    std::shared_ptr<vk::Device> device,
    void* data,
    uint32_t elementTotalCount,
    uint32_t elementMemorySize,
    const TensorDataTypes& dataType,
    const TensorTypes& tensorType = TensorTypes::eDevice)
```

@axsaucedo
Member

@cracy3m thank you for reporting. The current expected behaviour is actually to avoid tensors being created without data; you can look at the test TestOpTensorCreate.ExceptionOnZeroSizeTensor. This is mainly because a tensor actually creates the underlying Vulkan resources when it gets initialised. Is there a reason why you want to create the tensor without any data?

@cracy3m
Author

cracy3m commented Aug 4, 2022

  1. Thanks for the reply. So I have to create the Tensor object with data.
  2. I just want to wrap Kompute in a function called by others to do some calculations, so I don't need or know the input data. I think it is troublesome to create zero-valued data just to initialise each Tensor.

In addition, why not make it configurable through variables or functions whether the validation layer of the Vulkan instance is enabled? I want to build a dynamic link library, so that users can control whether validation is turned on even in a release build.

@axsaucedo
Member

> I just want to wrap Kompute in a function called by others to do some calculations, so I don't need or know the input data. I think it is troublesome to create zero-valued data just to initialise each Tensor.

I'm not sure I understand this use-case. Tensors are currently supposed to be seen as containers of Vulkan GPU-memory data as opposed to containers of CPU-memory data, which means it is assumed you may want a separate way to handle your CPU-memory data. Can you provide more details on why this is not the case? It seems it may just be being used in a way that it's not intended (at least at this point).

> In addition, why not make it configurable through variables or functions whether the validation layer of the Vulkan instance is enabled? I want to build a dynamic link library, so that users can control whether validation is turned on even in a release build.

I'm not sure what you mean by this, as it's currently possible to set the debug variables with build parameters, irrespective of whether you build a dynamic or static library. Can you clarify what you mean and how the build parameters don't fulfil this?

@cracy3m
Author

cracy3m commented Aug 5, 2022

> I'm not sure I understand this use-case. Tensors are currently supposed to be seen as containers of Vulkan GPU-memory data as opposed to containers of CPU-memory data, which means it is assumed you may want a separate way to handle your CPU-memory data. Can you provide more details on why this is not the case? It seems it may just be being used in a way that it's not intended (at least at this point).


Sorry, my English is not very good, so I used a translator.
I just think it's a little troublesome to create a zero-valued buffer every time I create a kp::Tensor object.


My use-case:

  1. I know the input data size (for example, a 1000x512 uint16_t input buffer goes through a window function (like Hann) -> FFT -> ...).
  2. I do not know the input data values, because they are generated at runtime.

In this case, I must create the Tensors with a zero-valued buffer, which is a little troublesome.

> I'm not sure what you mean by this, as it's currently possible to set the debug variables with build parameters, irrespective of whether you build a dynamic or static library. Can you clarify what you mean and how the build parameters don't fulfil this?

Maybe this statement is clearer:

  1. Build my code into a DLL (lib) file (release version only, using the Kompute source for some fixed calculations).
  2. Send the DLL (lib) file to other people (or environments) for use.
  3. If something goes wrong, I think it would help to be able to enable the Vulkan validation layer through a config file (XML, JSON, ...).

It is also possible that my idea is incorrect, because if there are no errors from the Vulkan validation layer during development, there should be no errors in other environments either. Or I could provide a debug version of the DLL.
This is not very important; I can also create the Vulkan instance myself to get this behaviour.
Thank you.

@axsaucedo
Member

> It is also possible that my idea is incorrect, because if there are no errors from the Vulkan validation layer during development, there should be no errors in other environments either. Or I could provide a debug version of the DLL.
> This is not very important; I can also create the Vulkan instance myself to get this behaviour.
> Thank you.

Hmm, yeah, it does sound like it shouldn't really matter whether you are linking dynamically or statically. The standard best practice is to only use validation layers during development; once you build for production you would remove them, as otherwise you would be adding quite a lot of overhead.

> My use-case:

Ok, I see. I do have to take a deeper look at this, as currently the idea is for Tensors to become containers for GPU data as opposed to CPU data. I will have to check whether this is indeed something we would look at having, as ultimately we are heading the opposite way, in which no data at all is stored in RAM and all data is stored in GPU memory.

The challenge of allowing non-initialisation of Tensors is that it would defer initialisation of the Vulkan components to a later stage, which adds complexity, as opposed to now, where a tensor always has initialised Vulkan components.

Given it would add quite a lot of complexity, I am inclined to explore more efficient ways in which this could be handled at the application level on your side, unless there is a simpler way of supporting it. Let me have a look to see whether it's a relatively simple and reasonable change.

@axsaucedo
Member

I just had a deeper look, and it seems there is no trivial way to achieve this, as there is no concept of "uninitialised components": that was explored initially, but the design decision was that Kompute resources are created together with the GPU resources and destroyed once. Based on this, we'll have to pass on this feature for the time being, but I will leave the issue open, as we are planning a large refactor in the near term and can consider then whether this is a feature we would look at adding.

@cracy3m
Author

cracy3m commented Aug 9, 2022

> I just had a deeper look, and it seems there is no trivial way to achieve this, as there is no concept of "uninitialised components": that was explored initially, but the design decision was that Kompute resources are created together with the GPU resources and destroyed once. Based on this, we'll have to pass on this feature for the time being, but I will leave the issue open, as we are planning a large refactor in the near term and can consider then whether this is a feature we would look at adding.

Hmm... maybe what I want is simple:

```cpp
// v0.8.1
void Tensor::rebuild(void* data,
                     uint32_t elementTotalCount,
                     uint32_t elementMemorySize)
{
    KP_LOG_DEBUG("Kompute Tensor rebuilding with size {}", elementTotalCount);

    this->mSize = elementTotalCount;
    this->mDataTypeMemorySize = elementMemorySize;

    if (this->mPrimaryBuffer || this->mPrimaryMemory) {
        KP_LOG_DEBUG(
          "Kompute Tensor destroying existing resources before rebuild");
        this->destroy();
    }

    this->allocateMemoryCreateGPUResources();
    this->mapRawData();

    memcpy(this->mRawData, data, this->memorySize()); // <-- this line
}
```

Change

```cpp
memcpy(this->mRawData, data, this->memorySize());
```

to:

```cpp
if (data) memcpy(this->mRawData, data, this->memorySize());
```

I also have a doubt about this: is it appropriate to keep the staging buffer mapped from Tensor::rebuild() until ~Tensor()?
I think it may occupy CPU address space, although my app will not create many Tensor objects.

@axsaucedo
Member

@cracy3m I did some initial tests to validate whether the change would be as straightforward as this; however, quite a lot of things would be required, mainly because the codebase currently has checks such as throwing exceptions if the provided data type is larger than the initially configured one. The rebuild would also call destroy, and hence wouldn't allow this to be used outside of the manager. Overall, this is why I mentioned above that it wouldn't be just a trivial change, and it may be considered as part of a larger refactor down the line.

@cracy3m
Author

cracy3m commented Aug 9, 2022

> @cracy3m I did some initial tests to validate whether the change would be as straightforward as this; however, quite a lot of things would be required, mainly because the codebase currently has checks such as throwing exceptions if the provided data type is larger than the initially configured one. The rebuild would also call destroy, and hence wouldn't allow this to be used outside of the manager. Overall, this is why I mentioned above that it wouldn't be just a trivial change, and it may be considered as part of a larger refactor down the line.

Thank you for replying to me again.

I got it: handling only the TensorTypes::eDevice case without initialising the staging buffer might cause other problems when using the Tensor class.

@mirsadm

mirsadm commented Feb 2, 2023

Adding to the discussion: it would be helpful to be able to pass a nullptr when setting up a tensor that is used for output. One use case is real-time applications where CPU/GPU memory is shared; it is not necessary to copy between CPU and GPU memory when the buffer is initialised. This is useful on Android in particular, where the memory copy is quite often slower than the compute time.

@wdx04

wdx04 commented Feb 20, 2024

It's very common for an algorithm to use output-only buffers/tensors.
In SYCL we can declare a buffer accessor as write_only, thus eliminating unnecessary data movement from host to device.
Also, in SYCL the host memory used to construct a buffer can be automatically synchronized from the device after computation is done.
It's a surprise to me that Kompute can't do either of these things.
