[BUG] Dask + UMAP does not work with numpy array. #5893

Open
nahaharo opened this issue May 17, 2024 · 1 comment
Labels
? - Needs Triage (Need team to review and classify), bug (Something isn't working)

Comments

@nahaharo

Describe the bug
When using Dask + UMAP to run on multiple GPUs, if the input array is a NumPy array rather than a CuPy array, Dask raises this error:

ValueError: could not broadcast input array from shape (7,1) into shape (7,)

If I convert the input array to a CuPy array, it runs without error.

The code below reproduces the problem.

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask.array as da
from cuml.manifold import UMAP
from cuml.dask.manifold import UMAP as MNMG_UMAP
import numpy as np
import cupy

if __name__ == "__main__":
    cluster = LocalCUDACluster(n_workers=2)
    client = Client(cluster)
    X = np.zeros((100, 10, 49), dtype=np.float32).reshape(100, -1)
    # X = cupy.asarray(X)
    print(X.shape, type(X))
    local_model = UMAP(random_state=10, n_components=1)
    val = local_model.fit_transform(X)

    distributed_model = MNMG_UMAP(model=local_model)
    distributed_X = da.from_array(X, chunks=(7, -1))
    embedding = distributed_model.transform(distributed_X)
    result = embedding.compute()
    client.close()
    cluster.close()

If I uncomment the line "X = cupy.asarray(X)", it runs without error.
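
For reference, the shapes in the error message match the reproducer's settings: chunks=(7, -1) gives blocks of 7 rows, and n_components=1 gives one embedding column. The NumPy-only sketch below produces the same kind of ValueError by writing a (7, 1) array into a (7,) buffer. It is only an illustration of the shape mismatch. Where the mismatch actually occurs inside Dask or cuML is not known from this report, and the buffer names here are hypothetical.

```python
import numpy as np

# Values taken from the reproducer above.
n_rows, chunk_rows, n_components = 100, 7, 1

# Row counts of the blocks that chunks=(7, -1) yields for 100 rows:
# fourteen blocks of 7 rows and one final block of 2 rows.
block_rows = [min(chunk_rows, n_rows - start)
              for start in range(0, n_rows, chunk_rows)]

# Assumption: the embedding of one block has shape (rows, n_components).
block_embedding = np.zeros((block_rows[0], n_components), dtype=np.float32)

# Assumption (hypothetical): some step expects a flat (rows,) result instead.
flat_buffer = np.empty(block_rows[0], dtype=np.float32)

try:
    # A (7, 1) array cannot be broadcast into a (7,) target.
    flat_buffer[:] = block_embedding
    message = None
except ValueError as err:
    message = str(err)

print(block_rows[0], len(block_rows), block_rows[-1])
print(message)
```

If this is the kind of mismatch involved, it would explain why the error mentions 7 rather than 100: the failure happens per block, not on the whole array.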

  • Environment location: Docker
  • Linux Distro/Architecture: Ubuntu 20.04 amd64, kernel version=5.4.0-171-generic
  • GPU Model/Driver: 4 * RTX 3090, 550.76
  • CUDA: 12.2
  • Method of cuDF & cuML install: conda
    command: conda create -n rapids-24.04 -c rapidsai -c conda-forge -c nvidia rapids=24.04 python=3.11 cuda-version=12.2 h5py matplotlib
nahaharo added the "? - Needs Triage" and "bug" labels May 17, 2024
@dantegd
Member

dantegd commented May 22, 2024

Thanks for the issue @nahaharo, this is useful feedback. It's something in the backlog and we will add it in the future, but there is no ETA currently.
