Suggest: batchsize dimension should always be not broadcastable even if batchsize==1 #873
Comments
Sorry for the delay. Why would you want this? There could be use cases for a broadcastable batch dimension.
@f0k Consider this case: we train a model with batchsize > 1 and run inference with batchsize = 1. This breaks any code that uses the output of a layer with shape (batchsize,) as a vector: at inference time it is not a plain vector, because dim 0 is broadcastable, while during training it is.
Which layer breaks when it sees a vector with a broadcastable first dimension?
@f0k I take your point, but leaving the batchsize unspecified as None feels like a workaround rather than the intended design. Following Lasagne's documentation, if I want better performance and I know the batchsize, I should provide it. If we treat the documentation as a statement of the design, the implementation should support the straightforward reading of it. A broadcastable batch dimension may be useful in some cases, but nobody would know that without reading the code, whereas expecting batchsize == 1 to behave the same as batchsize > 1 is reasonable from the documentation alone. For the details of my case: I passed classification targets of shape (batchsize,) through an InputLayer and called categorical_crossentropy on the layer's output, so no Lasagne code itself breaks.
Yes, I agree. We want it to work better or equally well when providing the batchsize, not worse.
I still don't understand what breaks -- it shouldn't make a difference whether it's broadcastable or not?
@f0k When calling `lasagne.objectives.categorical_crossentropy`, Theano raises: `TypeError: integer vector required for argument: true_one_of_n (got type: TensorType(int64, (True,)) instead of: TensorType(int64, vector))`
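To make the mismatch concrete: Theano types a variable by its dtype plus a broadcast pattern, and a dimension of literal size 1 is typically marked broadcastable. The helper below is a hedged, Theano-free sketch of that pattern derivation (not Lasagne's actual code); it shows why `batchsize == 1` produces the type `(True,)` while an unspecified batchsize or `batchsize > 1` produces a plain vector `(False,)`:

```python
def broadcast_pattern(shape):
    """Mimic how a broadcast pattern is commonly derived from a shape:
    any dimension of literal size 1 becomes broadcastable (True).
    (Illustrative helper, not Lasagne's verbatim source.)"""
    return tuple(s == 1 for s in shape)

# batchsize left unspecified (None): dim 0 is NOT broadcastable -> a plain vector
print(broadcast_pattern((None,)))  # (False,)

# batchsize fixed to 1: dim 0 IS broadcastable -> a different TensorType,
# which the crossentropy op rejects for its integer-vector argument
print(broadcast_pattern((1,)))     # (True,)

# batchsize > 1 behaves like the unspecified case
print(broadcast_pattern((32,)))    # (False,)
```

Since `(True,)` and `(False,)` are different types to Theano, graphs built with `batchsize = 1` can fail where graphs built with `batchsize > 1` succeed.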
The following patch was applied to deepstacks/deepstacks/framework/main.py to produce behaviour similar to lasagne.layers.InputLayer.
Hmm, this should be changed either in Theano or in Lasagne.
I agree.
The following code in lasagne/layers/input.py:
would best be changed to:
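The two code blocks referenced above did not survive extraction. As a hedged sketch of the suggested change (the helper name is hypothetical; Lasagne's InputLayer builds the broadcast pattern inline when constructing the input variable's TensorType), the proposal amounts to excluding dim 0 when deriving the pattern, so the batch dimension is never broadcastable even when batchsize == 1:

```python
def input_broadcast_pattern(shape, fix_batch_dim=False):
    """Sketch of deriving an input variable's broadcast pattern from its shape.
    (Hypothetical helper for illustration, not Lasagne's verbatim source.)"""
    pattern = [s == 1 for s in shape]
    if fix_batch_dim and pattern:
        # Proposed change: the batch dimension (dim 0) is never broadcastable,
        # so batchsize == 1 yields the same type as batchsize > 1.
        pattern[0] = False
    return tuple(pattern)

# Current behaviour: batchsize == 1 yields a broadcastable batch dim
assert input_broadcast_pattern((1, 28, 28)) == (True, False, False)

# Suggested behaviour: dim 0 forced non-broadcastable
assert input_broadcast_pattern((1, 28, 28), fix_batch_dim=True) == (False, False, False)
```

With this change, code written against `(batchsize,)`-shaped outputs would type-check identically regardless of the chosen batchsize.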