
Proper way to do gradient clipping?


Quick question about this, @apaszke: are the Variable.grad.data tensors that we pass to our clip function part of the model object, or (if we use a different optimizer) part of the optimizer object?

That is, does the optimizer itself call backward()? In that case, should the code below pass the optimizer, rather than the model, to the clip function?

    optimizer.zero_grad()
    output, hidden = model(data, hidden)
    loss = criterion(output.view(-1, ntokens), targets)
    loss.backward()
    # scale the learning rate by the coefficient returned by clip_gradient
    clipped_lr = lr * clip_gradient(model, clip)
    # manual SGD-style update on the model's parameters
    for p in model.parameters():
        p.data.add_(-clipped_lr, p.grad.data)

    optimizer.step()
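
For what it's worth, here is how I currently picture it, as a minimal sketch (the toy model, sizes, and max_norm value are made up, and it uses the torch.nn.utils.clip_grad_norm_ helper from current PyTorch releases, called clip_grad_norm in older ones): backward() fills in p.grad on the model's Parameters, and the optimizer only holds references to those same tensors.

    # Minimal sketch, not the original training script: gradients live on the
    # model's Parameters; the optimizer just holds references to them.
    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)   # hypothetical toy model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.MSELoss()

    data, targets = torch.randn(4, 10), torch.randn(4, 2)

    optimizer.zero_grad()
    loss = criterion(model(data), targets)
    loss.backward()            # backward() populates p.grad; the optimizer does not call it

    # Clip using the parameters taken from the model; the same tensors are
    # reachable via optimizer.param_groups[0]['params'], so either handle works.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.25)

    optimizer.step()           # applies the already-clipped gradients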
