Error reproducing competition results #32
Comments
I have the same problem. Did you figure it out?
I am trying to reproduce the competition results based on the instructions in the README.
1. I download and unzip the files from the Kaggle competition into the `data/` folder.
2. I run the command `python make_features.py data/vars --add_days=63`, which creates the following pickle files: `2017-08-15_2017-09-11.pkl`, `all.pkl`, `train_2.pkl`, and the directory `vars/` in the `data/` folder.
3. I run the trainer `python trainer.py --name s32 --hparam_set=s32 --n_models=3 --name s32 --no_eval --no_forward_split --asgd_decay=0.99 --max_steps=11500 --save_from_step=10500` and receive the following error:

UnknownError (see above for traceback): CUDNN_STATUS_EXECUTION_FAILED in tensorflow/stream_executor/cuda/cuda_dnn.cc(944): 'cudnnSetDropoutDescriptor( handle.get(), cudnn.handle(), dropout, state_memory.opaque(), state_memory.size(), seed)'
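For anyone retracing these steps, it may help to confirm that the feature step produced everything listed above before launching the trainer. The following is only a minimal sketch: it assumes the pickle files and the `vars/` directory sit directly under `data/`, as described in this report, and the helper name `missing_feature_outputs` is hypothetical (it is not part of the repository).

```python
import os

# Outputs reported in this issue after running make_features.py.
# Assumption: they all live directly under the data directory.
EXPECTED_FILES = ["2017-08-15_2017-09-11.pkl", "all.pkl", "train_2.pkl"]
EXPECTED_DIRS = ["vars"]


def missing_feature_outputs(data_dir="data"):
    """Return the expected files and directories that are absent from data_dir."""
    missing = []
    for name in EXPECTED_FILES:
        if not os.path.isfile(os.path.join(data_dir, name)):
            missing.append(name)
    for name in EXPECTED_DIRS:
        if not os.path.isdir(os.path.join(data_dir, name)):
            # Trailing slash marks the entry as a directory in the report.
            missing.append(name + "/")
    return missing


if __name__ == "__main__":
    absent = missing_feature_outputs("data")
    print("missing:", absent if absent else "none")
```

An empty result only shows that the files exist. It says nothing about the cuDNN failure itself, which occurs later, inside the trainer.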
I am using a p3.2xlarge AWS instance with the Deep Learning AMI, Python 3.6.5, and tensorflow-gpu==1.12.0.
If I downgrade to TF-GPU 1.10, I still get the same error.
How can I resolve this?
Full output from train command