Why are reduce_sum and reduce_max added in HAN's attention? #139
In HAN's attention I came across this:

```python
attetion_logits = tf.reduce_sum(hidden_state_context_similarity, axis=2)
attention_logits_max = tf.reduce_max(attention_logits, axis=1, keep_dims=True)
p_attention = tf.nn.softmax(attetion_logits - attention_logits_max)
```

I didn't see this operation in the original paper. Why is it done here?

This is because softmax is invariant to a constant shift of its input: softmax(x - c)_i = exp(x_i - c) / Σ_j exp(x_j - c) = softmax(x)_i, since the common factor exp(-c) cancels between numerator and denominator. Subtracting the maximum of the attention logits therefore leaves the attention weights unchanged, while keeping every exponent at or below zero so that exp cannot overflow; this makes the computation numerically stable.
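As a quick sanity check, here is a NumPy sketch of the same pipeline (NumPy stands in for the TF 1.x ops; the shape [batch, seq_len, hidden] for hidden_state_context_similarity and the random values are assumptions for illustration). It confirms that subtracting the row maximum leaves the softmax output unchanged, and shows the overflow it prevents:

```python
import numpy as np

# Assumed shape [batch, seq_len, hidden]; random values, purely illustrative.
rng = np.random.default_rng(0)
hidden_state_context_similarity = rng.normal(size=(2, 5, 8))

# NumPy version of the quoted pipeline.
attention_logits = hidden_state_context_similarity.sum(axis=2)       # [batch, seq_len]
attention_logits_max = attention_logits.max(axis=1, keepdims=True)   # [batch, 1]

def softmax(x, axis=-1):
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Shift invariance: the max subtraction does not change the result.
print(np.allclose(softmax(attention_logits, axis=1),
                  softmax(attention_logits - attention_logits_max, axis=1)))  # True

# But with large logits the naive version overflows (exp(~710) exceeds float64)
# while the shifted version stays finite.
big = attention_logits + 1000.0
print(np.isnan(softmax(big, axis=1)).any())                                   # True
print(np.isnan(softmax(big - big.max(axis=1, keepdims=True), axis=1)).any())  # False
```

This is the standard stable-softmax (log-sum-exp) trick: mathematically a no-op, it only changes the floating-point behavior, which is presumably why it does not appear in the HAN paper itself.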