# Learning Rate Schedules in Deep Learning

Learning rate schedules are techniques that adjust the learning rate during training according to a predefined schedule.

See what a learning rate is here: https://marko-kovacevic.com/blog/learning-rate-in-deep-learning/

Popular learning rate schedule techniques are:

- Step decay
- Exponential decay
- Time decay

### 1. Step Decay

Step decay reduces the learning rate by a predefined factor after a predefined number of epochs.

Typically, the learning rate is halved every 5 epochs or multiplied by 0.1 every 20 epochs. These numbers depend on the type of problem.

#### 1.1. Mathematical Implementation

```
lr = lr0 * d^floor((1 + t) / r)

lr - learning rate
lr0 - initial learning rate
d - decay parameter (how much the learning rate should change at each drop)
t - iteration number
r - how often the rate should be dropped (10 corresponds to a drop every 10 iterations)
```
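To sanity-check the formula, here is a minimal pure-Python sketch; the sample values `lr0 = 0.1`, `d = 0.5`, `r = 10` are illustrative, not prescriptive:

```python
import math

def step_decay(lr0, d, r, t):
    """Step decay: multiply the rate by d once every r iterations."""
    return lr0 * d ** math.floor((1 + t) / r)

# With lr0 = 0.1, d = 0.5, r = 10 the rate halves every 10 iterations
for t in (0, 9, 19):
    print(t, step_decay(0.1, 0.5, 10, t))
```

Note that because of the `(1 + t)` term, the first drop happens at iteration 9, the second at iteration 19, and so on.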

#### 1.2. Programming Implementation

Keras:

```
import math
from keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    initial_lrate = 0.1
    drop = 0.5
    epochs_drop = 10.0
    # Halve the rate every epochs_drop epochs
    lrate = initial_lrate * math.pow(drop, math.floor((1 + epoch) / epochs_drop))
    return lrate

lrate = LearningRateScheduler(step_decay)
```

https://keras.io/callbacks/#learningratescheduler

LearningRateScheduler is a callback that gives you the freedom to define your own custom learning rate schedule.

With callbacks you can customize the behavior of a Keras model during training.

### 2. Exponential Decay

Exponential decay reduces the learning rate by following an exponential curve.

#### 2.1. Mathematical Implementation

```
lr = lr0 * e^(-d*t)

lr - learning rate
lr0 - initial learning rate
e - Euler's number (approximately 2.71828)
d - decay parameter (how quickly the learning rate decays)
t - iteration number
```
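Again, a minimal pure-Python sketch of the formula; the sample values `lr0 = 0.1` and `d = 0.01` are illustrative:

```python
import math

def exp_decay(lr0, d, t):
    """Exponential decay: lr = lr0 * e^(-d*t)."""
    return lr0 * math.exp(-d * t)

# With lr0 = 0.1 and d = 0.01 the rate shrinks smoothly at every iteration
for t in (0, 50, 100):
    print(t, exp_decay(0.1, 0.01, t))
```

Unlike step decay, the rate changes at every iteration rather than in discrete drops.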

#### 2.2. Programming Implementation

Tensorflow:

```
tf.compat.v1.train.exponential_decay(
    learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None
)
```

https://www.tensorflow.org/api_docs/python/tf/compat/v1/train/exponential_decay

Keras:

```
tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate, decay_steps, decay_rate, staircase=False, name=None
)
```

https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/schedules/ExponentialDecay
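Note that these APIs parameterize the curve slightly differently from the formula above: per the documentation, the decayed rate is `initial_learning_rate * decay_rate^(step / decay_steps)`, and with `staircase=True` the exponent is floored so the rate drops in discrete steps. A pure-Python sketch of that behavior (the sample values are illustrative):

```python
import math

def keras_style_exponential_decay(lr0, decay_rate, decay_steps, step, staircase=False):
    """decayed_lr = lr0 * decay_rate^(step / decay_steps); staircase floors the exponent."""
    exponent = step / decay_steps
    if staircase:
        exponent = math.floor(exponent)
    return lr0 * decay_rate ** exponent

# Smooth vs staircase, halfway through the first decay period
print(keras_style_exponential_decay(0.1, 0.5, 1000, 500))                  # smooth: already decaying
print(keras_style_exponential_decay(0.1, 0.5, 1000, 500, staircase=True))  # staircase: no drop yet
```

With `staircase=True` this behaves like step decay, which is why the two TensorFlow APIs can cover both styles.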

### 3. Time Decay

Time decay reduces the learning rate based on elapsed time, where time is measured in elapsed iterations.

Time decay is also called 1/t decay (Inverse time decay).

#### 3.1. Mathematical Implementation

```
lr = lr0 / (1 + d*t)

lr - learning rate
lr0 - initial learning rate
d - decay parameter (how quickly the learning rate decays)
t - iteration number
```
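As with the other schedules, the formula is easy to check directly in pure Python; the sample values `lr0 = 0.1` and `d = 0.01` are illustrative:

```python
def time_decay(lr0, d, t):
    """Inverse time (1/t) decay: lr = lr0 / (1 + d*t)."""
    return lr0 / (1 + d * t)

# With lr0 = 0.1 and d = 0.01 the rate is halved after 100 iterations
for t in (0, 100, 300):
    print(t, time_decay(0.1, 0.01, t))
```

Compared with exponential decay, the rate falls off more slowly for large `t`.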

#### 3.2. Programming Implementation

Tensorflow:

```
tf.compat.v1.train.inverse_time_decay(
    learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None
)
```

https://www.tensorflow.org/api_docs/python/tf/compat/v1/train/inverse_time_decay

Keras:

```
tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate, decay_steps, decay_rate, staircase=False, name=None
)
```

https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/schedules/InverseTimeDecay

### Conclusion

Learning rate schedules are good techniques for updating the learning rate during training. Using one is better than keeping a constant learning rate (no updates during training) because training reaches a low loss faster and is less likely to overshoot the minimum.

Of these learning rate schedule techniques, step decay is often the most convenient because its parameters are easy to interpret during training.