
How to set layer-wise learning rate in Tensorflow?

1 Answer

Setting layer-wise learning rates in TensorFlow is a common technique, particularly useful when fine-tuning pre-trained models. It lets us assign different learning rates to different parts of the model: typically a smaller learning rate for the pre-trained (earlier) layers and a larger one for the newly added (later) layers. This prevents the pre-trained features from being over-modified during training while still letting the new layers train quickly.

Here is a concrete implementation example:

1. Define the Model: Suppose we use a pre-trained base model (e.g., VGG16) and add some custom layers on top.

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# Load the pre-trained VGG16 model without the top fully connected layers
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # Freeze the base model's parameters

# Add custom layers on top of the frozen base
flatten = Flatten()(base_model.output)
dense1 = Dense(256, activation='relu')(flatten)
output = Dense(10, activation='softmax')(dense1)

model = Model(inputs=base_model.input, outputs=output)
```
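Before moving on, it can be worth confirming that the freeze actually took effect. A quick check against the model built above compares trainable and non-trainable parameter counts:

```python
# Compare trainable vs. frozen parameter counts to verify the freeze
trainable = sum(int(tf.size(w)) for w in model.trainable_weights)
frozen = sum(int(tf.size(w)) for w in model.non_trainable_weights)
print(f"Trainable params: {trainable:,} | frozen params: {frozen:,}")
```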
2. Set Layer-wise Learning Rates: We can achieve this by creating one optimizer per group of layers. Note that `model.compile` does not accept a dictionary of optimizers; one convenient way to route each layer's gradients to its own optimizer is `tfa.optimizers.MultiOptimizer` from TensorFlow Addons.

```python
import tensorflow_addons as tfa

# Unfreeze the last few convolutional layers for fine-tuning
for layer in base_model.layers[-4:]:
    layer.trainable = True

# Smaller learning rate for the pre-trained layers, larger for the new head
pre_train_lr = 1e-5
custom_lr = 1e-3

# Create optimizers for the pre-trained layers and the custom layers
optimizer_pre_train = tf.keras.optimizers.Adam(learning_rate=pre_train_lr)
optimizer_custom = tf.keras.optimizers.Adam(learning_rate=custom_lr)

# Assign the base model's layers to the low-LR optimizer and the custom
# head (Flatten + the two Dense layers) to the high-LR optimizer
optimizers_and_layers = [
    (optimizer_pre_train, base_model.layers),
    (optimizer_custom, model.layers[-3:]),
]
optimizer = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

# Set the model's optimizer, loss function, and evaluation metrics
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
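If you prefer to stay within core TensorFlow (TensorFlow Addons is in maintenance mode), the same effect can be achieved with a custom training step that splits the trainable variables into two groups and lets each optimizer update its own group. A minimal sketch under the same model setup (the variable grouping is illustrative):

```python
# Split trainable variables: pre-trained backbone vs. custom head
backbone_vars = [v for layer in base_model.layers for v in layer.trainable_variables]
head_vars = [v for layer in model.layers[-3:] for v in layer.trainable_variables]

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    # One gradient computation; each optimizer applies its own slice
    grads = tape.gradient(loss, backbone_vars + head_vars)
    backbone_grads = grads[:len(backbone_vars)]
    head_grads = grads[len(backbone_vars):]
    optimizer_pre_train.apply_gradients(zip(backbone_grads, backbone_vars))
    optimizer_custom.apply_gradients(zip(head_grads, head_vars))
    return loss
```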
3. Train the Model: Now you can train your model as usual.

```python
# Assume we have training data: train_images, train_labels
model.fit(train_images, train_labels, batch_size=32, epochs=10)
```
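Since `train_images` and `train_labels` are assumed to exist, a quick smoke test with random stand-in data (shapes matching the 224x224 RGB input and 10 classes used above) can confirm the whole pipeline runs:

```python
import numpy as np

# Random stand-in data, only to verify the pipeline end to end
train_images = np.random.rand(8, 224, 224, 3).astype("float32")
train_labels = np.random.randint(0, 10, size=(8,))

model.fit(train_images, train_labels, batch_size=4, epochs=1)
```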

In this example, we first load a pre-trained VGG16 model and add custom fully connected layers on top. We freeze most of the pre-trained layers and unfreeze only a small portion for fine-tuning. We then create one optimizer per layer group, with a small learning rate for the pre-trained layers and a larger one for the custom layers, and route each group's gradients to its optimizer. This layer-wise learning rate setup keeps the pre-trained features stable during fine-tuning while quickly optimizing the weights of the newly added layers.

Answered on August 10, 2024 at 14:28
