Hello! I have a question about the optimizer used in your training. The paper states, "We use an Adam optimizer [47] for all the experiments." However, the code appears to use the AdamW optimizer instead: the training configuration YAML file specifies `optim: adamw_torch`. Could you please clarify this discrepancy?
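For context on why I'm asking, here is a minimal PyTorch sketch of the difference between the two optimizers. I'm assuming `optim: adamw_torch` is consumed by Hugging Face's `TrainingArguments`, where that value selects `torch.optim.AdamW`; the hyperparameter values below are illustrative, not taken from the paper or the config.

```python
import torch

# Dummy parameter just to construct the optimizers.
params = [torch.nn.Parameter(torch.zeros(10))]

# Adam [47]: weight decay is implemented as L2 regularization,
# i.e. folded into the gradient before the adaptive moment update.
adam = torch.optim.Adam(params, lr=1e-4, weight_decay=0.01)

# AdamW (what `optim: adamw_torch` maps to): weight decay is decoupled
# from the gradient and applied directly to the weights, following
# Loshchilov & Hutter, "Decoupled Weight Decay Regularization".
adamw = torch.optim.AdamW(params, lr=1e-4, weight_decay=0.01)
```

With `weight_decay=0` the two coincide, but with nonzero weight decay they optimize differently, which is why I'd like to know which one the reported results actually used.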