Need help to understand the logic here #19579
Comments
Below is the same code from the Keras example. I think you might have gotten confused by the multiple occurrences of `x = layers.Dropout(0.5)(x)`:

```python
decoder_outputs = layers.Dense(vocab_size, activation="softmax")(x)
decoder = keras.Model([decoder_inputs, encoded_seq_inputs], decoder_outputs)

decoder_outputs = decoder([decoder_inputs, encoder_outputs])
transformer = keras.Model(
    [encoder_inputs, decoder_inputs], decoder_outputs, name="transformer"
)
```
Hi @sachinprasadhs, thanks for helping me understand more about it! But I still have a question about the code above: why did we not build the `transformer` directly, in one step? What is the benefit of wrapping the encoder and decoder in separate `keras.Model()` objects, as done in the example?
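For what it's worth, here is a minimal sketch of the two construction styles being compared. It uses toy layers and shapes (an `Embedding` plus a `Dense`, not the example's actual Transformer blocks), so treat it as an illustration only: one version wires the layers into a single flat graph, the other wraps the encoding sub-graph in its own `keras.Model` and then calls it like a layer. Both compute the same thing; the wrapped version just gives you a named sub-model you can summarize, reuse, or run on its own.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Toy sizes, purely illustrative.
vocab_size, seq_len, dim = 100, 8, 16

# Style A: one flat graph, no intermediate Model.
src = keras.Input(shape=(seq_len,), dtype="int64")
h = layers.Embedding(vocab_size, dim)(src)
flat_out = layers.Dense(vocab_size, activation="softmax")(h)
flat = keras.Model(src, flat_out, name="flat")

# Style B: wrap the encoding sub-graph in its own Model, then call it
# like a layer inside a larger graph.
enc_in = keras.Input(shape=(seq_len,), dtype="int64")
enc_out = layers.Embedding(vocab_size, dim)(enc_in)
encoder = keras.Model(enc_in, enc_out, name="encoder")  # inspectable on its own

src2 = keras.Input(shape=(seq_len,), dtype="int64")
wrapped_out = layers.Dense(vocab_size, activation="softmax")(encoder(src2))
wrapped = keras.Model(src2, wrapped_out, name="wrapped")

# Identical output shapes: wrapping changes organization, not computation.
print(flat.output_shape, wrapped.output_shape)  # (None, 8, 100) (None, 8, 100)
```

The practical win of Style B is modularity: `encoder.summary()` works on the sub-model alone, and the same `encoder` object can be reused at inference time (e.g. to encode the source sentence once during decoding).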
In the above code, you can comment out the line for the encoder model, and it makes no difference to the final outcome. Here is another way of doing it, mainly using subclassed models and layers: https://www.tensorflow.org/text/tutorials/transformer
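As a rough illustration of the subclassing style mentioned above (a toy model with made-up names and a stand-in for attention, not the tutorial's actual Transformer), the same seq2seq wiring can be expressed by overriding `call` instead of building a functional graph:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class TinySeq2Seq(keras.Model):
    """Toy subclassed seq2seq model; names and shapes are illustrative only."""

    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.src_embed = layers.Embedding(vocab_size, dim)
        self.tgt_embed = layers.Embedding(vocab_size, dim)
        self.out = layers.Dense(vocab_size, activation="softmax")

    def call(self, inputs):
        src, tgt = inputs
        # Stand-in for cross-attention: add mean-pooled source context
        # to every target position.
        ctx = tf.reduce_mean(self.src_embed(src), axis=1, keepdims=True)
        return self.out(self.tgt_embed(tgt) + ctx)


model = TinySeq2Seq()
src = tf.zeros((2, 8), dtype=tf.int64)
tgt = tf.zeros((2, 8), dtype=tf.int64)
probs = model([src, tgt])
print(probs.shape)  # (2, 8, 100)
```

With subclassing, the forward pass is ordinary Python in `call`, which some people find easier to follow than the functional graph used in the Keras example.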
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
Hi,
I have taken this code from "English-to-Spanish translation with a sequence-to-sequence Transformer" in Keras examples. I am unable to understand the reason for the code below.
Why did we not simply do this? (This code is taken from Deep Learning with Python, Second Edition.)