My bad, I thought you were suggesting that if you do gradient clipping, you should (for some reason) use custom updates instead of optimizer.step(). Now I get it: you meant that if you use custom updates, you should not also call optimizer.step() (to avoid mixing custom and automatic updates). Makes sense!
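
For anyone landing here later, this is a minimal sketch of the two options as I understand them. The tiny linear model, the random data, and the values of lr and max_norm are made up purely for illustration. The only library calls it relies on are torch.nn.utils.clip_grad_norm_ and torch.optim.SGD. In both options the clipping happens after backward(); what differs is who applies the update.

```python
import torch

torch.manual_seed(0)

# Illustrative data and hyperparameters (assumptions, not from the thread).
x = torch.randn(8, 4)
y = torch.randn(8, 1)
lr = 0.1
max_norm = 1.0

# Two identical models so the two update styles can be compared.
model_a = torch.nn.Linear(4, 1)
model_b = torch.nn.Linear(4, 1)
model_b.load_state_dict(model_a.state_dict())

# Option A: clip the gradients, then let the optimizer apply the update.
optimizer = torch.optim.SGD(model_a.parameters(), lr=lr)
optimizer.zero_grad()
loss_a = torch.nn.functional.mse_loss(model_a(x), y)
loss_a.backward()
torch.nn.utils.clip_grad_norm_(model_a.parameters(), max_norm)
optimizer.step()

# Option B: clip the gradients, then apply a custom update by hand.
# No optimizer.step() here, so the parameters are updated only once.
model_b.zero_grad()
loss_b = torch.nn.functional.mse_loss(model_b(x), y)
loss_b.backward()
torch.nn.utils.clip_grad_norm_(model_b.parameters(), max_norm)
with torch.no_grad():
    for p in model_b.parameters():
        p -= lr * p.grad  # plain SGD step, written out manually

# Total gradient norm of model_b after clipping.
clipped_norm = torch.norm(
    torch.stack([p.grad.norm(2) for p in model_b.parameters()]), 2
).item()

# With plain SGD (no momentum, no weight decay) both options should agree.
same_result = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
```

With plain SGD the two options should land on the same parameters, which is why doing both in the same iteration would apply the update twice. With a stateful optimizer such as Adam the hand-written step in Option B would no longer match, so this equivalence only holds for the simple case shown.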
Proper way to do gradient clipping?