Nesterov momentum is a variant of the momentum update. It is also called Nesterov Accelerated Gradient (NAG).

In practice it often works better than standard momentum (*read about standard momentum here: https://marko-kovacevic.com/blog/momentum-in-deep-learning/*).

The main idea is to look before you leap. If we know the velocity and direction of an object, we can predict where it will be at time T and calculate the gradient at that location.

Instead of blindly using momentum to keep going in the direction we were already going, let's peek ahead by taking a big jump in the direction of the previous velocity and calculate the gradient from there. We then use that gradient to update the velocity.

### Mathematical Implementation

    x_ahead = x + mu * v
    # evaluate dx_ahead (the gradient at x_ahead instead of at x)
    v = mu * v - learning_rate * dx_ahead
    x = x + v

where:

- `x_ahead` - the look-ahead weight
- `x` - the weight
- `dx_ahead` - the gradient at `x_ahead`
- `v` - the current velocity vector
- `mu` - the momentum coefficient
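To make the look-ahead step concrete, here is a minimal sketch in plain Python. The toy objective `f(x) = 0.5 * x**2`, the helper names `grad` and `nesterov_step`, and the hyperparameter values are assumptions chosen for illustration; they do not come from any library.

```python
def grad(x):
    """Gradient of the toy objective f(x) = 0.5 * x**2 (assumed for illustration)."""
    return x


def nesterov_step(x, v, mu, learning_rate):
    """One Nesterov momentum update for a scalar weight (hypothetical helper)."""
    x_ahead = x + mu * v                    # peek ahead along the current velocity
    dx_ahead = grad(x_ahead)                # gradient at the look-ahead point, not at x
    v = mu * v - learning_rate * dx_ahead   # update the velocity with that gradient
    x = x + v                               # move the weight by the new velocity
    return x, v


x, v = 5.0, 0.0                             # arbitrary start, zero initial velocity
for _ in range(200):
    x, v = nesterov_step(x, v, mu=0.9, learning_rate=0.1)

print(x)                                    # should be very close to the minimum at 0
```

The only difference from standard momentum in this sketch is the point at which the gradient is evaluated.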

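In practice, people prefer to express the update so that it looks as similar as possible to vanilla stochastic gradient descent or to the previous momentum update. This is done with the variable transform `x_ahead = x + mu * v`: the update is expressed in terms of `x_ahead` instead of `x`, so the weight we actually store is always the look-ahead version.

The same update, rewritten to resemble standard momentum (here `x` denotes the stored look-ahead weight and `dx` its gradient):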

    v_prev = v  # back this up
    v = mu * v - learning_rate * dx  # velocity update stays the same
    x += -mu * v_prev + (1 + mu) * v  # position update changes form
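A quick numerical check can show that the rewritten form describes the same trajectory as the look-ahead form, provided the stored weight is read as the look-ahead point `x + mu * v`. The sketch below assumes a toy objective `f(x) = (x - 3)**2`; the names `y`, `w`, and `max_gap` are hypothetical and used only to keep the two versions apart.

```python
def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)**2 (assumed for illustration)."""
    return 2.0 * (x - 3.0)


mu, learning_rate = 0.9, 0.05

x, v = 0.0, 0.0   # look-ahead form: stores the actual weight x
y, w = 0.0, 0.0   # rewritten form: y stores the look-ahead weight (equal to x while v is 0)

max_gap = 0.0
for _ in range(50):
    # Look-ahead form
    x_ahead = x + mu * v
    v = mu * v - learning_rate * grad(x_ahead)
    x = x + v

    # Rewritten form
    w_prev = w                                # back this up
    w = mu * w - learning_rate * grad(y)      # velocity update stays the same
    y += -mu * w_prev + (1 + mu) * w          # position update changes form

    # y should track the look-ahead point of the first form
    max_gap = max(max_gap, abs(y - (x + mu * v)))

print(max_gap)    # expected to be at floating-point rounding level
```

Since the velocity decays as the iterates settle, `y` and `x` approach the same point, which is why storing the look-ahead weight is harmless in practice.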

Weight update with standard momentum:

```
v = mu * v - learning_rate * dx
x = x + v
```

Vanilla weight update (without Momentum):

`x = x - learning_rate * dx`
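For a side-by-side view, the three update rules can be written as interchangeable step functions and run on the same problem. This is only a sketch: the objective `f(x) = 0.5 * x**2`, the function names, and the settings `mu = 0.9`, `learning_rate = 0.1` are assumptions, and a one-dimensional quadratic says little about how the methods rank on real networks.

```python
def grad(x):
    """Gradient of the toy objective f(x) = 0.5 * x**2 (assumed for illustration)."""
    return x


def vanilla_step(x, v, mu, learning_rate):
    # No velocity: step directly against the gradient at x
    return x - learning_rate * grad(x), v


def momentum_step(x, v, mu, learning_rate):
    # Standard momentum: gradient evaluated at the current weight
    v = mu * v - learning_rate * grad(x)
    return x + v, v


def nesterov_step(x, v, mu, learning_rate):
    # Nesterov momentum: gradient evaluated at the look-ahead weight
    v = mu * v - learning_rate * grad(x + mu * v)
    return x + v, v


def run(step, steps=300, x0=5.0, mu=0.9, learning_rate=0.1):
    """Apply one update rule repeatedly and return the final weight."""
    x, v = x0, 0.0
    for _ in range(steps):
        x, v = step(x, v, mu, learning_rate)
    return x


results = {
    "vanilla": run(vanilla_step),
    "momentum": run(momentum_step),
    "nesterov": run(nesterov_step),
}
for name, final_x in results.items():
    print(name, abs(final_x))   # distance from the minimum at 0
```

All three rules share the same signature here, so swapping one for another changes a single argument to `run`.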

### Programming Implementation

Keras:

`keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)`

TensorFlow:

```
tf.compat.v1.train.MomentumOptimizer(
learning_rate, momentum, use_locking=False, name='Momentum', use_nesterov=True
)
```

https://www.tensorflow.org/api_docs/python/tf/compat/v1/train/MomentumOptimizer

Thanks for reading this post.

### References

- Cs231n.github.io. 2020.
*Cs231n Convolutional Neural Networks For Visual Recognition*. [online] Available at: <https://cs231n.github.io/neural-networks-3/> [Accessed 20 April 2020].