
feat : Adds ggml_upscale_ext #814

Merged
merged 10 commits into ggerganov:master May 15, 2024

Conversation

@balisujohn (Contributor) commented May 6, 2024

Closes issue #812

Adds ggml_upscale_ext, an alternate version of ggml_upscale that lets the user upscale a tensor to a specified shape using nearest-neighbor interpolation.
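
For illustration, a minimal usage sketch (the signature follows the example given later in this thread; the context ctx and input tensor a are assumed to be set up already):

// Upscale tensor a (e.g. shape [43, 1024, 1, 1]) to the exact target
// shape [187, 1024, 1, 1] using nearest-neighbor interpolation:
struct ggml_tensor * upscaled = ggml_upscale_ext(ctx, a, 187, 1024, 1, 1);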

Adds CPU and CUDA implementations and a test, though it's not clear to me whether either this or ggml_upscale has adequate tests.

Also, the reason I added this is that it's needed to convert tortoise-tts to ggml.

This is unfinished: I still need to add the CUDA implementation and possibly expand the tests, since I haven't verified it actually works yet. I will remove the draft annotation when it's ready for review.

@balisujohn balisujohn marked this pull request as draft May 6, 2024 07:23
@balisujohn (Contributor, Author)

(I also realize there are some unnecessary params in some of the functions, so I will clean that up before review)

@balisujohn balisujohn changed the title to Draft PR adding ggml_upscale_to_shape May 6, 2024
@balisujohn balisujohn changed the title Draft PR adding ggml_upscale_to_shape → feat : Draft PR adding ggml_upscale_to_shape May 6, 2024
@balisujohn balisujohn changed the title feat : Draft PR adding ggml_upscale_to_shape → feat : Adds ggml_upscale_to_shape May 8, 2024
@balisujohn (Contributor, Author)

Ready for review! :^)

@balisujohn balisujohn marked this pull request as ready for review May 8, 2024 01:03
@@ -468,6 +468,7 @@ extern "C" {
     GGML_OP_POOL_1D,
     GGML_OP_POOL_2D,
     GGML_OP_UPSCALE, // nearest interpolate
+    GGML_OP_UPSCALE_TO_SHAPE, // nearest interpolate to specified tensor shape
@ggerganov (Owner)

There is no need to add an extra op; we can reuse the existing GGML_OP_UPSCALE.

@balisujohn (Contributor, Author)

fixed

@ggerganov (Owner)

We should generalize the existing upscale op and kernels, since the existing ggml_upscale functionality is a special case of the more general ggml_upscale_to_shape.

@balisujohn (Contributor, Author)

> We should generalize the existing upscale op and kernels, since the existing ggml_upscale functionality is a special case of the more general ggml_upscale_to_shape.

I can do that. My thinking was to avoid breaking backwards compatibility, but I can just put the new behavior in the ggml_upscale op, since it is a strict superset of ggml_upscale's original behavior.

@ggerganov (Owner)

ggml_upscale can still do what it currently does in order to keep backwards compatibility. Take a look at ggml_soft_max and ggml_soft_max_ext as an example of what I have in mind.
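
For context, a minimal sketch of that pattern (illustrative only, not the actual ggml source; the ggml_upscale signature shown is the pre-existing one with an integer scale factor):

// General entry point: the caller specifies the exact target shape.
struct ggml_tensor * ggml_upscale_ext(
        struct ggml_context * ctx, struct ggml_tensor * a,
        int ne0, int ne1, int ne2, int ne3);

// Backwards-compatible wrapper: the old integer scale factor becomes
// a special case of the general target-shape version.
struct ggml_tensor * ggml_upscale(
        struct ggml_context * ctx, struct ggml_tensor * a, int scale_factor) {
    return ggml_upscale_ext(ctx, a,
                            (int)(a->ne[0] * scale_factor),
                            (int)(a->ne[1] * scale_factor),
                            (int) a->ne[2], (int) a->ne[3]);
}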

@balisujohn balisujohn changed the title feat : Adds ggml_upscale_to_shape → feat : Adds ggml_upscale_ext May 12, 2024
@balisujohn (Contributor, Author)

Ready for review again :^)

@balisujohn (Contributor, Author)

Very impressive response time!

@ggerganov (Owner) left a comment

This implementation ignores half-pixel effects. I'm not sure how the reference PyTorch operators behave, but it's something that needs a deeper look.

I also wonder if we should take the time to make the upscale operator more general and support downscaling too, i.e. GGML_OP_RESCALE.
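
For reference, a sketch of the two index mappings at issue. To the best of my knowledge, PyTorch's mode='nearest' uses the first and mode='nearest-exact' adds the half-pixel offset, but this is exactly the kind of thing that needs verifying:

#include <math.h>

// scale is the per-dimension scale factor (dst extent / src extent)

// Without half-pixel correction (what this PR implements):
static inline long src_index_nearest(long i_dst, float scale) {
    return (long) floorf(i_dst / scale);
}

// With the half-pixel offset; a real implementation would also clamp
// the result to the valid source range:
static inline long src_index_nearest_exact(long i_dst, float scale) {
    return (long) floorf((i_dst + 0.5f) / scale);
}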

Comment on lines 3 to 6
static __global__ void upscale_f32(const float * x, float * dst,
const int ne00, const int ne01, const int ne02, const int ne03,
const int ne10, const int ne11, const int ne12, const int ne13,
const float sf0, const float sf1, const float sf2, const float sf3) {
@ggerganov (Owner)

Please update the kernel to use the tensor strides; see the CPU and Metal implementations as an example. Otherwise, this would work only for contiguous data.

When ready, try to see if you can add tests in test-backend-ops that exercise non-contiguous data.
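
For illustration, stride-based addressing in the style of the CPU implementation (a sketch, not the actual ggml source; ne10..ne13 are the destination extents, nb_src/nb_dst the per-dimension byte strides, and sf0..sf3 the float scale factors, dst extent / src extent):

#include <stddef.h>
#include <stdint.h>

// Nearest-neighbor upscale using byte strides for addressing, so
// non-contiguous source tensors are handled correctly.
static void upscale_f32_strided(
        const char * src, char * dst,
        const int64_t ne10, const int64_t ne11, const int64_t ne12, const int64_t ne13,
        const size_t nb_src[4], const size_t nb_dst[4],
        const float sf0, const float sf1, const float sf2, const float sf3) {
    for (int64_t i3 = 0; i3 < ne13; i3++) {
        const int64_t i03 = (int64_t)(i3 / sf3);
        for (int64_t i2 = 0; i2 < ne12; i2++) {
            const int64_t i02 = (int64_t)(i2 / sf2);
            for (int64_t i1 = 0; i1 < ne11; i1++) {
                const int64_t i01 = (int64_t)(i1 / sf1);
                for (int64_t i0 = 0; i0 < ne10; i0++) {
                    const int64_t i00 = (int64_t)(i0 / sf0);
                    // byte-stride addressing instead of assuming contiguity
                    const float * sp = (const float *)(src
                            + i03*nb_src[3] + i02*nb_src[2] + i01*nb_src[1] + i00*nb_src[0]);
                    float * dp = (float *)(dst
                            + i3*nb_dst[3] + i2*nb_dst[2] + i1*nb_dst[1] + i0*nb_dst[0]);
                    *dp = *sp;
                }
            }
        }
    }
}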

@balisujohn (Contributor, Author)

OK done.

@balisujohn (Contributor, Author)

> This implementation ignores half-pixel effects. I'm not sure how the reference PyTorch operators behave, but it's something that needs a deeper look.
>
> I also wonder if we should take the time to make the upscale operator more general and support downscaling too, i.e. GGML_OP_RESCALE.

Alright, so with regard to behaving similarly to PyTorch nearest-neighbor interpolation, I can confirm that

cur = ggml_upscale_ext(ctx0, cur, 187, 1024, 1, 1);

behaves the same way (only tested with the CUDA backend) as

expanded_code_emb = F.interpolate(code_emb, size=187, mode='nearest')

for my internal test-case tensor of shape [43, 1024] (dimensions read left to right).

I think it would be nice to support downscaling, but I also think it might be good to leave that to a future PR.

Ready for review again :^)

@balisujohn (Contributor, Author) commented May 12, 2024

Also, is there somewhere that input and expected-output literals are defined for the test cases? I found it a bit strange that I didn't have to specify the actual element-wise values of the input and expected-output tensors when defining the tests.

@balisujohn balisujohn requested a review from ggerganov May 14, 2024 05:43
@ggerganov (Owner)

The test-backend-ops tool does not provide a mechanism for specifying input/output values. You can write a standalone test, similar to test-conv1d.cpp for example. It would be useful to demonstrate that the numerical results match the results from PyTorch.

Also, does PyTorch support floating-point scaling factors? If so, we should consider extending the functionality.
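
By way of illustration, the kind of hand-checkable case such a standalone test could assert (a sketch exercising only the index mapping, not the ggml graph machinery; for this input the expected values match PyTorch's mode='nearest'):

#include <assert.h>
#include <math.h>

// Upscale a 2x2 tensor to 4x4 with nearest-neighbor interpolation and
// compare against values computed by hand.
int main(void) {
    const float src[2][2] = { {1, 2}, {3, 4} };
    const float expected[4][4] = {
        {1, 1, 2, 2},
        {1, 1, 2, 2},
        {3, 3, 4, 4},
        {3, 3, 4, 4},
    };
    for (int i1 = 0; i1 < 4; i1++) {
        for (int i0 = 0; i0 < 4; i0++) {
            // nearest-neighbor: source index = destination index / scale
            const float got = src[i1 / 2][i0 / 2];
            assert(fabsf(got - expected[i1][i0]) < 1e-6f);
        }
    }
    return 0;
}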

@ggerganov ggerganov merged commit 126d349 into ggerganov:master May 15, 2024
4 checks passed