
How to set layer-wise learning rate in Tensorflow?

1 Answer

Setting layer-wise learning rates in TensorFlow is a common technique, particularly useful when fine-tuning pre-trained models. It lets us assign different learning rates to different parts of the model: typically a smaller learning rate for the pre-trained (earlier) layers and a larger one for the newly added (later) layers. This prevents the pre-trained features from being over-modified during training while still letting the new layers train quickly.

Here is a concrete implementation example:

1. Define the Model: Suppose we use a pre-trained base model (e.g., VGG16) and add some custom layers on top.

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# Load the pre-trained VGG16 model without the top fully connected layers
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # Freeze the base model's parameters

# Add custom layers on top of the frozen base
flatten = Flatten()(base_model.output)
dense1 = Dense(256, activation='relu')(flatten)
output = Dense(10, activation='softmax')(dense1)

model = Model(inputs=base_model.input, outputs=output)
```
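Before moving on, it can be worth confirming that the freeze actually took effect. A quick check against the model built above compares trainable and non-trainable parameter counts:

```python
# Compare trainable vs. frozen parameter counts to verify the freeze
trainable = sum(int(tf.size(w)) for w in model.trainable_weights)
frozen = sum(int(tf.size(w)) for w in model.non_trainable_weights)
print(f"Trainable params: {trainable:,} | frozen params: {frozen:,}")
```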
2. Set Layer-wise Learning Rates: We can achieve this by creating one optimizer per group of layers. Note that `model.compile` does not accept a dictionary of optimizers; one convenient way to route each layer's gradients to its own optimizer is `tfa.optimizers.MultiOptimizer` from TensorFlow Addons.

```python
import tensorflow_addons as tfa

# Unfreeze the last few convolutional layers for fine-tuning
for layer in base_model.layers[-4:]:
    layer.trainable = True

# Smaller learning rate for the pre-trained layers, larger for the new head
pre_train_lr = 1e-5
custom_lr = 1e-3

# Create optimizers for the pre-trained layers and the custom layers
optimizer_pre_train = tf.keras.optimizers.Adam(learning_rate=pre_train_lr)
optimizer_custom = tf.keras.optimizers.Adam(learning_rate=custom_lr)

# Assign the base model's layers to the low-LR optimizer and the custom
# head (Flatten + the two Dense layers) to the high-LR optimizer
optimizers_and_layers = [
    (optimizer_pre_train, base_model.layers),
    (optimizer_custom, model.layers[-3:]),
]
optimizer = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

# Set the model's optimizer, loss function, and evaluation metrics
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
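If you prefer to stay within core TensorFlow (TensorFlow Addons is in maintenance mode), the same effect can be achieved with a custom training step that splits the trainable variables into two groups and lets each optimizer update its own group. A minimal sketch under the same model setup (the variable grouping is illustrative):

```python
# Split trainable variables: pre-trained backbone vs. custom head
backbone_vars = [v for layer in base_model.layers for v in layer.trainable_variables]
head_vars = [v for layer in model.layers[-3:] for v in layer.trainable_variables]

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    # One gradient computation; each optimizer applies its own slice
    grads = tape.gradient(loss, backbone_vars + head_vars)
    backbone_grads = grads[:len(backbone_vars)]
    head_grads = grads[len(backbone_vars):]
    optimizer_pre_train.apply_gradients(zip(backbone_grads, backbone_vars))
    optimizer_custom.apply_gradients(zip(head_grads, head_vars))
    return loss
```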
3. Train the Model: Now you can train your model as usual.

```python
# Assume we have training data: train_images, train_labels
model.fit(train_images, train_labels, batch_size=32, epochs=10)
```
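Since `train_images` and `train_labels` are assumed to exist, a quick smoke test with random stand-in data (shapes matching the 224x224 RGB input and 10 classes used above) can confirm the whole pipeline runs:

```python
import numpy as np

# Random stand-in data, only to verify the pipeline end to end
train_images = np.random.rand(8, 224, 224, 3).astype("float32")
train_labels = np.random.randint(0, 10, size=(8,))

model.fit(train_images, train_labels, batch_size=4, epochs=1)
```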

In this example, we first load a pre-trained VGG16 model and add custom fully connected layers on top. We freeze most of the pre-trained layers and unfreeze only a small portion for fine-tuning. We then create one optimizer per layer group, with a small learning rate for the pre-trained layers and a larger one for the custom layers, and route each group's gradients to its optimizer. This layer-wise learning rate setup keeps the pre-trained features stable during fine-tuning while quickly optimizing the weights of the newly added layers.

Answered on August 10, 2024 at 14:28
