This repo contains the source code for the submitted paper "Your Learned Constraint is Secretly a Backwards Reachable Tube".
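Create an Anaconda environment from the provided environment.yml (for example, with conda env create -f environment.yml) and activate it before running any of the commands below.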
The repo contains two configurations in the ./config directory, one for each Dubins car model. Throughout this tutorial, dubins_model_id is either 1 or 2 and selects which model to use.
Compute the ground-truth BRT (backwards reachable tube) for the Dubins car. This step requires hj_reachability (https://github.com/StanfordASL/hj_reachability):
python brs_hjr.py --dubins_model_id <1 or 2>
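For readers new to the setup, the sketch below shows the textbook Dubins car kinematics (constant forward speed, controlled turn rate) integrated with forward Euler. It is only an illustration: the function names, the speed, and the time step are placeholders and are not taken from the repo's ./config files or from brs_hjr.py.

```python
import math


def dubins_step(state, turn_rate, speed=1.0, dt=0.1):
    """Advance a Dubins car by one forward-Euler step.

    state is (x, y, heading). speed and dt are placeholder values,
    not the ones used by the repo's two models.
    """
    x, y, theta = state
    return (
        x + speed * math.cos(theta) * dt,
        y + speed * math.sin(theta) * dt,
        theta + turn_rate * dt,
    )


def rollout(state, turn_rates, speed=1.0, dt=0.1):
    """Apply a sequence of turn-rate commands and return every visited state."""
    states = [state]
    for u in turn_rates:
        state = dubins_step(state, u, speed, dt)
        states.append(state)
    return states
```

A BRT computation asks, for each such state, whether the car can still avoid the failure set; the script above relies on hj_reachability for that rather than on rollouts like these.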
We include two datasets of expert trajectories for model 1 and model 2. You can use these trajectories to compute the constraint.
- Given the expert trajectories and no knowledge of the failure set, train MT-ICL to recover the constraint:
python network_init.py --dubins_model_id <1 or 2>
python planner_icl.py --dubins_model_id <1 or 2>
After each outer epoch, the constraint model weights are saved.
- Visualize the learned constraint:
python visualize_constraint.py --dubins_model_id <1 or 2> --epoch <outer iter>
Here, --epoch is an integer giving the outer epoch whose saved constraint weights should be visualized.
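The commands in this section can be chained for one model with a small driver script. The following is a minimal sketch, not part of the repo: the helper names provided_data_commands and run_all are hypothetical, and it assumes the scripts are run from the repo root inside the activated environment and that weights for the requested epoch exist once training finishes.

```python
import subprocess
import sys


def provided_data_commands(model_id, epoch):
    """Return the commands from this section, in order, as argument lists."""
    if model_id not in (1, 2):
        raise ValueError("dubins_model_id must be 1 or 2")
    mid = str(model_id)
    py = sys.executable  # the interpreter of the active environment
    return [
        [py, "brs_hjr.py", "--dubins_model_id", mid],
        [py, "network_init.py", "--dubins_model_id", mid],
        [py, "planner_icl.py", "--dubins_model_id", mid],
        [py, "visualize_constraint.py", "--dubins_model_id", mid,
         "--epoch", str(epoch)],
    ]


def run_all(model_id, epoch):
    """Run each step in sequence, stopping at the first failure."""
    for cmd in provided_data_commands(model_id, epoch):
        subprocess.run(cmd, check=True)
```

Calling run_all(1, epoch) with a chosen outer epoch would execute the four steps for model 1; provided_data_commands alone can be used to inspect the commands without running anything.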
To retrain from scratch (i.e. train the expert policies, generate expert rollouts, and then run MT-ICL), follow these steps:
- Train an expert policy using PPO given knowledge of the failure set:
python dubins.py --dubins_model_id <1 or 2>
- Generate expert rollouts:
python dubins_rollout.py --dubins_model_id <1 or 2>
- Train MT-ICL and visualize the learned constraint, as described in the steps above.