In object detection model training, we usually have a feature extraction network (the backbone), such as Darknet in YOLO or VGG-16 in SSD.
To get better training results, the pre-trained backbone weights are usually loaded first, the detection network is then trained on top of them, and the backbone is only fine-tuned, which means the backbone needs a smaller learning rate (lr).
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # backbone: the pre-trained feature extraction network
        self.backbone = ...
        # detect: the detection head trained on top of the backbone features
        self.detect = ...
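For example, the pre-trained backbone weights can be loaded before training starts. A minimal sketch, assuming the model above is instantiated as net and that the checkpoint path "backbone_pretrained.pth" is only a placeholder:

import torch

net = Net()
# load pre-trained backbone weights; the file name is a placeholder
state_dict = torch.load("backbone_pretrained.pth", map_location="cpu")
net.backbone.load_state_dict(state_dict)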
When setting up the optimizer, you only need to split the parameters into two groups and give each group its own learning rate.
# collect the ids of the backbone parameters so they can be excluded from the head group
# (assumes `net` is the model instance and `config` holds the hyper-parameters)
base_params = list(map(id, net.backbone.parameters()))
logits_params = filter(lambda p: id(p) not in base_params, net.parameters())
params = [
    {"params": logits_params, "lr": config.lr},                        # lr for the detection head (attribute name assumed)
    {"params": net.backbone.parameters(), "lr": config.backbone_lr},   # smaller lr for the backbone
]
# SGD and config.momentum are assumptions; any torch.optim optimizer accepts param groups
optimizer = torch.optim.SGD(params, momentum=config.momentum, weight_decay=config.weight_decay)
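To confirm that each group really uses its own learning rate, you can inspect optimizer.param_groups after construction; this check is just an illustrative sketch:

# quick sanity check: each param group keeps its own lr
for i, group in enumerate(optimizer.param_groups):
    print(f"group {i}: lr = {group['lr']}, #tensors = {len(group['params'])}")
# expected: one group with the detection-head lr and one with the smaller backbone lr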
That is all I have to share about setting different learning rates for different layers of a model in PyTorch. I hope it serves as a useful reference, and thank you for your continued support.