
Proper way to do gradient clipping?


I tested this on CPU and the difference was no more than a few milliseconds (for anyone who may try to implement an LSTM for benchmarking :) ). I think a few extra additions are insignificant compared to the more expensive computations, such as multiplying the weight matrices, evaluating the nonlinear activation functions, or even running the Python loop itself.
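For anyone who wants to check this kind of claim themselves, here is a rough sketch of how the comparison could be timed. It is not the benchmark I ran: it uses plain Python lists instead of a tensor library, and the helper names (`add`, `matmul`, `benchmark`) and the matrix size are made up for illustration only.

```python
import random
import timeit


def add(a, b):
    """Elementwise sum of two equally shaped matrices (lists of lists)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matmul(a, b):
    """Naive matrix product of two matrices stored as lists of lists."""
    cols_b = list(zip(*b))  # transpose b once so columns are easy to iterate
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def benchmark(n=64, repeats=5):
    """Return (best add time, best matmul time) in seconds for n x n matrices."""
    rng = random.Random(0)  # fixed seed so runs use the same inputs
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    t_add = min(timeit.repeat(lambda: add(a, b), number=1, repeat=repeats))
    t_mul = min(timeit.repeat(lambda: matmul(a, b), number=1, repeat=repeats))
    return t_add, t_mul


if __name__ == "__main__":
    t_add, t_mul = benchmark()
    print(f"add:    {t_add * 1e3:.3f} ms")
    print(f"matmul: {t_mul * 1e3:.3f} ms")
```

The absolute numbers will depend on the machine and on the library you actually use, so treat this only as a way to compare the relative cost of the two operations.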
