Unsupervised learning with self created dataset #58

Open
LA11131110128 opened this issue Sep 17, 2022 · 1 comment

LA11131110128 commented Sep 17, 2022

I tried my own dataset with your unsupervised learning framework. Its number of edges (num_of_edge) exceeds 10^6.
When I load the data, I get an assertion error.


loading GCC 7.3.1
based on SCL Developer Toolset 7


loading CUDA 10.1 with cuDNN / NCCL
based on cntr cuda:10.1-cudnn7-devel-centos7

/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [6,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [7,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [51,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [88,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [89,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [90,0,0] Assertion srcIndex < srcSelectDimSize failed.
Processing...
Done!
5264
1

lr: 0.01
num_features: 1
hidden_dim: 32
num_gc_layers: 4

dataset_num_classes: 7
Traceback (most recent call last):
File "gsimclr.py", line 189, in <module>
emb, y = model.encoder.get_embeddings(dataloader_eval)
File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 83, in get_embeddings
x, _ = self.forward(x, edge_index, batch)
File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 56, in forward
x = F.relu(self.convs[i](x, edge_index))
File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch_geometric/nn/conv/gin_conv.py", line 67, in forward
out += (1 + self.eps) * x_r
RuntimeError: CUDA error: device-side assert triggered

I am wondering whether the learning framework has a limit on the size of the data, and I would appreciate any suggestions for solving this problem.
Thank you!

@yyou1996
Collaborator

Hi @LA11131110128,

It looks like the error comes from a mismatch between the GNN and your customized data (though I am not sure exactly where it is). I would suggest checking the defined GIN architecture (input_node_dimension, etc.) and confirming that it matches your data.

Also, try printing the shapes of x and edge_index to see whether the maximum index in edge_index exceeds the number of nodes.
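
The check suggested above could be sketched roughly as follows. The assertion in the log, `srcIndex < srcSelectDimSize`, fails when an index is not smaller than the size of the dimension being indexed, so every entry of edge_index has to lie in the range 0 to num_nodes - 1. The helper name `check_edge_index` is hypothetical and not part of GraphCL or PyTorch Geometric. It works on plain Python lists so it can be run without a GPU, and it assumes edge_index has the usual [2, num_edges] layout.

```python
def check_edge_index(num_nodes, edge_index):
    """Return a list of problems found in a [2, num_edges] edge list.

    num_nodes:  number of rows in the node feature matrix x.
    edge_index: two equal-length sequences [sources, targets] of integer node ids.
    """
    if len(edge_index) != 2 or len(edge_index[0]) != len(edge_index[1]):
        return ["edge_index must have shape [2, num_edges]"]

    problems = []
    flat = list(edge_index[0]) + list(edge_index[1])
    if not flat:
        return problems  # a graph without edges has nothing to check

    lo, hi = min(flat), max(flat)
    if lo < 0:
        problems.append(f"negative node index {lo}")
    if hi >= num_nodes:
        # Valid ids are 0 .. num_nodes - 1, so hi == num_nodes is already an error.
        problems.append(f"max node index {hi} >= num_nodes {num_nodes}")
    return problems


# Assumed usage with a PyG-style `data` object (hypothetical, not run here):
#   check_edge_index(data.x.size(0), data.edge_index.cpu().tolist())

# Toy example: 3 nodes (valid ids 0..2), but one edge points at node 3.
print(check_edge_index(3, [[0, 1, 2], [1, 2, 3]]))  # ['max node index 3 >= num_nodes 3']
print(check_edge_index(3, [[0, 1], [1, 2]]))        # []
```

If the check reports a maximum index equal to the number of nodes, one possible cause is node ids that start at 1 instead of 0 in the custom dataset. Running the same forward pass on the CPU usually gives a more readable index error than the device-side assert.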
