Proper way to do gradient clipping?

The one that comes with nn.utils clips in proportion to the magnitude of the gradients. So you'd want to make sure the clipping threshold is not too small for your particular model, as Adam said (I think :p). The old-fashioned way of clipping/clamping is:

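def gradClamp(parameters, clip=5):
    # Clamp every gradient element into [-clip, clip], in place.
    for p in parameters:
        if p.grad is not None:  # skip parameters that received no gradient
            p.grad.data.clamp_(min=-clip, max=clip)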
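
To make the difference between the two approaches concrete, here is a rough plain-Python sketch on a flat list of numbers. It is not PyTorch's implementation; the names clip_by_norm and clamp_by_value are made up for illustration, and I'm assuming the norm-based variant uses the L2 norm over all gradients together.

```python
import math

def clip_by_norm(grads, max_norm):
    # Hypothetical helper: rescale ALL gradients by one common factor so that
    # their overall L2 norm is at most max_norm. Relative sizes are preserved.
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads)  # already within the threshold, nothing to do
    scale = max_norm / total_norm
    return [g * scale for g in grads]

def clamp_by_value(grads, clip):
    # Hypothetical helper: cut each element off at [-clip, clip] independently.
    # Small gradients are untouched, so the overall direction can change.
    return [max(-clip, min(clip, g)) for g in grads]

grads = [3.0, -40.0, 0.5]          # made-up gradient values
by_norm = clip_by_norm(grads, max_norm=5.0)
by_value = clamp_by_value(grads, clip=5.0)
print(by_norm)
print(by_value)
```

With norm-based clipping, one large gradient shrinks every other gradient along with it, which is why a threshold that is too small can hurt. Clamping only touches the elements that exceed the limit.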
