
Further testing for preserving tensor context with operations #503

Open · AtomicCactus opened this issue May 24, 2023 · 4 comments

AtomicCactus (Contributor) commented May 24, 2023

Describe the bug

Decomposing a 2D tensor along both modes while specifying two ranks results in an error: internally, part of the process creates a tensor on the CPU, which cannot be concatenated with the GPU tensors.

The decomposition works when the rank is specified as an integer, but not as a list:

  • rank=16 works
  • rank=[16,16] crashes

It also works fine on the CPU, but performance there is not comparable.

Steps or Code to Reproduce

import torch
import tensorly as tl

tl.set_backend('pytorch')

# A 2D tensor on the GPU; decomposing along both modes with a list of
# ranks triggers the crash (an integer rank does not).
x = torch.randn((72, 27)).cuda()
result = tl.decomposition.partial_tucker(x, rank=[16, 32], modes=[0, 1])

Expected behavior

Tucker decomposition should not fail when ranks are provided as a list of values.

Actual result

Traceback (most recent call last):
  File "/home/yuri/Source/tensorly_bug.py", line 6, in <module>
    result = tl.decomposition.partial_tucker(x, rank=[16, 32], modes=[0, 1])
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/decomposition/_tucker.py", line 166, in partial_tucker
    core, factors = initialize_tucker(
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/decomposition/_tucker.py", line 64, in initialize_tucker
    U, _, _ = svd_interface(
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/tenalg/svd.py", line 426, in svd_interface
    U, V = svd_flip(U, V, u_based_decision=u_based_flip_sign)
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/tenalg/svd.py", line 42, in svd_flip
    signs = tl.concatenate((signs, tl.ones(tl.shape(V)[0] - tl.shape(U)[1])))
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/backend/__init__.py", line 206, in wrapped_backend_method
    return getattr(
  File "/home/yuri/.local/lib/python3.10/site-packages/tensorly/backend/pytorch_backend.py", line 153, in concatenate
    return torch.cat(tensors, dim=axis)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument tensors in method wrapper_CUDA_cat)

Versions

Linux-5.19.0-41-generic-x86_64-with-glibc2.35
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
NumPy 1.22.4
SciPy 1.7.3
TensorLy 0.8.1
AtomicCactus (Contributor, Author) commented:

I think we just need to make sure that on line #42 in svd.py the tl.ones(tl.shape(V)[0] - tl.shape(U)[1]) tensor is moved to the same device as the signs tensor:

[Screenshot from 2023-05-23 of the relevant lines in svd.py]
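The pattern being described here, creating the padding tensor in the same context as the existing one, might look like the following sketch. It leans on the idea behind TensorLy's `tl.context(tensor)` helper (which returns a tensor's dtype and, on the PyTorch backend, its device); the sketch uses NumPy so it runs without a GPU, with `context` as a hypothetical stand-in:

```python
import numpy as np

def context(tensor):
    # Stand-in for tensorly's tl.context(tensor), which returns the
    # keyword arguments (dtype, and on the PyTorch backend the device)
    # needed to create new tensors "like" an existing one.
    return {"dtype": tensor.dtype}

signs = np.array([1.0, -1.0, 1.0], dtype=np.float32)

# Buggy pattern: the padding is created with the default context
# (float64 here; the CPU device on a GPU-backed tensor).
bad_pad = np.ones(2)

# Fixed pattern: create the padding in the context of `signs`, i.e.
# roughly tl.ones(tl.shape(V)[0] - tl.shape(U)[1], **tl.context(signs)).
good_pad = np.ones(2, **context(signs))

padded = np.concatenate((signs, good_pad))
assert padded.dtype == np.float32
```

On the PyTorch backend the same context would carry the device as well, so the concatenation in svd_flip would stay on cuda:0.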

JeanKossaifi (Member) commented:

Good catch, thanks @AtomicCactus! Could you open a small PR to fix the issue?
Perhaps we could also test for this kind of issue in the future, for instance by changing the dtype of the input tensor.

AtomicCactus (Contributor, Author) commented:

Sure thing! #504
I didn't add a unit test for this, since I'm not sure whether any CI/CD pipeline or other system that runs the tests has a GPU.

JeanKossaifi (Member) commented:

Thanks @AtomicCactus! I reviewed the PR -- for the test we could look at dtype rather than device, since our current CI pipeline doesn't have GPU support.
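A CPU-only regression test along these lines could use a non-default dtype as a stand-in for the device: any operation that drops the input tensor's context will also silently promote a float32 input to float64. A sketch with a simplified, NumPy-only imitation of svd_flip (the function and test names are illustrative, not TensorLy's actual test suite):

```python
import numpy as np

def svd_flip(U, V):
    """Simplified sign correction in the spirit of tensorly's svd_flip:
    flip signs so the largest-magnitude entry of each column of U is
    positive, padding the sign vector when V has more rows than U has
    columns."""
    signs = np.sign(U[np.argmax(np.abs(U), axis=0), range(U.shape[1])])
    if V.shape[0] > U.shape[1]:
        # Create the padding in the *same context* as U (dtype here;
        # device as well on a GPU backend) -- the crux of the bug.
        pad = np.ones(V.shape[0] - U.shape[1], dtype=U.dtype)
        signs = np.concatenate((signs, pad))
    return U * signs[np.newaxis, :U.shape[1]], V * signs[:, np.newaxis]

def test_svd_flip_preserves_dtype():
    # float32 is a non-default dtype, so on CPU-only CI it plays the
    # role that the CUDA device plays in the original bug report.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 5)).astype(np.float32)
    U, s, V = np.linalg.svd(A, full_matrices=False)
    Uf, Vf = svd_flip(U[:, :3], V)  # truncated U, fewer cols than V rows
    assert Uf.dtype == np.float32 and Vf.dtype == np.float32
```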

@aarmey aarmey changed the title Partial Tucker with PyTorch backend fails on GPU Further testing for preserving tensor context with operations Jun 1, 2023