I tested on CPU and the difference was no more than a few milliseconds (for anyone who may try implementing an LSTM for benchmarking). I think a few extra additions are insignificant compared to the other, more expensive computations, such as multiplying the weight matrices, the nonlinear activation functions, or even the Python loop itself.
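As a rough sanity check on that intuition, here is a minimal sketch that counts operations for one LSTM time step. It assumes the "extra addition" is a single elementwise add over the four gate pre-activations; the helper name `lstm_step_cost` and the sizes are hypothetical and chosen only for illustration. It ignores activation functions and Python overhead, so it understates how little the addition matters.

```python
def lstm_step_cost(batch, input_size, hidden_size):
    """Rough operation counts for one LSTM time step (hypothetical helper)."""
    # Pre-activations for the input, forget, cell, and output gates.
    gates = 4 * hidden_size
    # Two matrix products (input-to-hidden and hidden-to-hidden),
    # counting each multiply-add as 2 operations.
    matmul_ops = 2 * batch * gates * (input_size + hidden_size)
    # Assumption: the extra work is one elementwise add over the gate pre-activations.
    extra_add_ops = batch * gates
    return matmul_ops, extra_add_ops


matmul_ops, extra_add_ops = lstm_step_cost(batch=32, input_size=128, hidden_size=256)
print(matmul_ops, extra_add_ops, matmul_ops / extra_add_ops)
```

Under these assumptions the ratio works out to `2 * (input_size + hidden_size)`, so the matrix products dominate by a factor in the hundreds for typical sizes, which would be consistent with seeing only a few milliseconds of difference.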
Proper way to do gradient clipping?