Implementation of the Phasic Policy Gradient (PPG) algorithm for Stable-Baselines3.
A CNN policy with an auxiliary head is not implemented yet, so
AuxMlpPolicy is currently the only policy you can use.
To initialize the policy with the initialization values from the PPG paper,
uncomment the init_weights code in
./ppg/aux_ac_policy.py.
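
For readers new to PPG, the sketch below illustrates how the algorithm alternates between its two phases: a policy phase of PPO-style updates, followed by an auxiliary phase that trains the auxiliary value head. It is only an illustration and not this repository's API. The function name `ppg_phase_schedule` and its parameters are hypothetical. The defaults of 32 policy iterations and 6 auxiliary epochs follow the PPG paper; whether this implementation uses the same defaults is an assumption.

```python
from typing import Iterator, Tuple


def ppg_phase_schedule(
    n_cycles: int,
    n_policy_iters: int = 32,  # policy iterations per cycle (PPG paper default)
    n_aux_epochs: int = 6,     # auxiliary epochs per cycle (PPG paper default)
) -> Iterator[Tuple[str, int, int]]:
    """Yield (phase, cycle, step) tuples in the order PPG would run them.

    Hypothetical helper for illustration only; it performs no training.
    """
    for cycle in range(n_cycles):
        # Policy phase: collect rollouts and apply PPO-style updates.
        for iteration in range(n_policy_iters):
            yield ("policy", cycle, iteration)
        # Auxiliary phase: replay the stored rollouts to fit the auxiliary
        # value head while constraining the policy to stay close to itself.
        for epoch in range(n_aux_epochs):
            yield ("aux", cycle, epoch)


# Small example: 2 cycles, 3 policy iterations and 2 auxiliary epochs each.
schedule = list(ppg_phase_schedule(n_cycles=2, n_policy_iters=3, n_aux_epochs=2))
for phase, cycle, step in schedule:
    print(f"cycle={cycle} phase={phase} step={step}")
```

In a real training loop, each yielded step would correspond to one rollout-plus-update iteration (policy phase) or one pass over the replayed data (auxiliary phase).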