
Proper way to do gradient clipping?

I tested nn.LSTM against a simple LSTM implementation and found almost no difference in performance. I probably overestimated the overhead of the extra addition, since that was only a rough guess. Thank you!
