CUDA out of memory #32

Open
BLCKEAGLE4 opened this issue Apr 8, 2021 · 2 comments

Comments

@BLCKEAGLE4

Hi, your work is very inspiring! I get errors when I run the training script and then the test script, and I don't know what I'm doing wrong. I would be grateful if anyone could help.

(venv) amperiad@cuda-pc:~/iSeeBetter-master$ python3 iSeeBetterTrain.py
[ INFO] ==> Loading datasets
Training samples chosen: foliage_test.txt
[ INFO] # of Generator parameters: 12771943
[ INFO] # of Discriminator parameters: 5215425
[ INFO] # of CUDA devices detected: 1
[ INFO] Using CUDA device #: 0
[ INFO] CUDA device name: GeForce GTX TITAN X
[ INFO] Generator Loss: L1 Loss
[ INFO] ------------- iSeeBetter Network Architecture -------------
[ INFO] ----------------- Generator Architecture ------------------
...

[ INFO] Total number of parameters: 5215425
[ INFO] -----------------------------------------------------------
0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "iSeeBetterTrain.py", line 264, in <module>
main()
File "iSeeBetterTrain.py", line 258, in main
runningResults = trainModel(epoch, training_data_loader, netG, netD, optimizerD, optimizerG, generatorCriterion, device, args)
File "iSeeBetterTrain.py", line 61, in trainModel
next(iterTrainBar)
File "/home/amperiad/venv/lib/python3.6/site-packages/tqdm/std.py", line 1087, in __iter__
for obj in iterable:
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
NotADirectoryError: Caught NotADirectoryError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/amperiad/iSeeBetter-master/dataset.py", line 204, in __getitem__
target, input, neigbor = load_img_future(self.image_filenames[index], self.nFrames, self.upscale_factor, self.other_dataset,self.upscale_only)
File "/home/amperiad/iSeeBetter-master/dataset.py", line 94, in load_img_future
target = modcrop(Image.open(join(filepath,'im4.png')).convert('RGB'),scale)
File "/usr/lib/python3/dist-packages/PIL/Image.py", line 2548, in open
fp = builtins.open(filename, "rb")
NotADirectoryError: [Errno 20] Not a directory: './Vid4/foliage/001.png/im4.png'
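
The last line of this traceback shows 'im4.png' being joined onto './Vid4/foliage/001.png', so at least one entry in the file list points at an image file where load_img_future expects a directory of frames. A minimal sketch for checking a file list before training, assuming each line of the list is a path relative to the data directory; the helper name find_bad_entries is hypothetical and not part of iSeeBetter:

```python
import os


def find_bad_entries(data_dir, list_path, frame_name="im4.png"):
    """Return file-list entries that are not directories containing frame_name.

    Assumption: every non-empty line in list_path is a path relative to data_dir.
    """
    bad = []
    with open(list_path) as f:
        for line in f:
            entry = line.strip()
            if not entry:
                continue  # ignore blank lines
            candidate = os.path.join(data_dir, entry)
            # The loader opens <candidate>/<frame_name>, so that file must exist.
            if not os.path.isfile(os.path.join(candidate, frame_name)):
                bad.append(entry)
    return bad
```

With the paths from the log, an entry ending in 001.png would be reported, since it is a file and not a directory holding im4.png.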

(venv) amperiad@cuda-pc:~/iSeeBetter-master$ python3 iSeeBetterTest.py -o output.txt -c --data_dir ./Vid4 --file_list foliage_test.txt -u
Namespace(chop_forward=False, data_dir='./Vid4', debug=False, file_list='foliage_test.txt', future_frame=True, gpu_mode=True, gpus=1, model='weights/netG_epoch_4_1.pth' , model_type='RBPN', nFrames=7, other_dataset=True, output='output.txt', residual=False, seed=123, testBatchSize=1, threads=1, upscale_factor=4, upscale_only=True)
Using GPU mode
==> Loading datasets
==> Building model RBPN
[ INFO] ------------- iSeeBetter Network Architecture -------------
[ INFO] ----------------- Generator Architecture ------------------
[ INFO] DataParallel(

...
[ INFO] Total number of parameters: 12771943
Pre-trained SR model loaded from: weights/netG_epoch_4_1.pth
Traceback (most recent call last):
File "iSeeBetterTest.py", line 197, in <module>
eval()
File "iSeeBetterTest.py", line 107, in eval
prediction = model(input, neigbor, flow)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/amperiad/iSeeBetter-master/rbpn.py", line 82, in forward
h0 = self.DBPN(feat_input)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/amperiad/iSeeBetter-master/dbpns.py", line 55, in forward
x = self.output(torch.cat((h3, h2, h1),1))
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/amperiad/iSeeBetter-master/base_networks.py", line 66, in forward
out = self.conv(x)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "/home/amperiad/venv/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 1.32 GiB (GPU 0; 11.92 GiB total capacity; 10.46 GiB already allocated; 635.38 MiB free; 280.01 MiB cached)
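
The numbers in this error message already show why the allocation fails: the convolution asks for 1.32 GiB while only 635.38 MiB is free. A small sketch that pulls those two figures out of the message and reports the shortfall; the function name oom_shortfall_mib is made up for illustration, and the regular expressions only target the message format shown in this log:

```python
import re

# Conversion factors to MiB for the units that appear in the message.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a (number, unit) pair taken from the message into MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def oom_shortfall_mib(message):
    """Return requested minus free memory, in MiB, for an OOM message."""
    requested = re.search(r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)", message)
    free = re.search(r"([\d.]+) (KiB|MiB|GiB) free", message)
    if requested is None or free is None:
        raise ValueError("unrecognized out-of-memory message")
    return to_mib(*requested.groups()) - to_mib(*free.groups())
```

For the message above this gives a shortfall of roughly 716 MiB. Since evaluation needs no gradients, running the forward pass under torch.no_grad() (if the script does not already do so) and processing smaller inputs are the usual ways to lower peak memory; the Namespace printed above also shows chop_forward=False, which may be worth trying, though whether either is enough on this GPU is not established here.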

@AwaleSajil

Same here

@sunyclj

sunyclj commented Nov 29, 2021

I have the same problem. I changed '--gpus' to 3, but only one GPU is used, and I get the same errors.
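
One possible reason only one GPU does any work: torch.nn.DataParallel splits each batch along its first dimension, and the Namespace printed in the first post shows testBatchSize=1, so with that setting there is a single sample to hand out. A pure-Python illustration of that splitting; split_batch is a hypothetical helper that mimics chunking a batch across devices, not PyTorch's own code:

```python
import math


def split_batch(batch_size, num_devices):
    """Return how many samples each device would receive.

    Mimics chunking along the batch dimension: chunks of size
    ceil(batch_size / num_devices), with a smaller final chunk if needed.
    """
    if batch_size <= 0 or num_devices <= 0:
        raise ValueError("batch_size and num_devices must be positive")
    chunk = math.ceil(batch_size / num_devices)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes
```

Under this sketch, split_batch(1, 3) yields a single chunk, so two of the three devices stay idle, while a batch of 3 or more would give every device some work. It also means extra GPUs would not reduce the memory needed for one sample.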
