Welcome to the CognitiveRobotics_Robo_Control repository!
This repository contains the code developed for the final project of the course Cognitive Robotics offered at the University of Groningen during the academic year 2019-2020.
The goal of the project can be summarized as follows:
- Developing a novel controller that generates trajectories for a robotic arm.
More precisely, the attempt was made to train a Reinforcement Learning (RL) agent to control a robotic arm in joint space without having to perform possibly expensive Inverse Kinematics operations. The controller's task is to compute changes to the current joint angles of all controlled joints such that the robotic arm reaches a given goal location and the fingers of the end-effector's gripper point towards that location. The RL agent was only given the following information (a rough sketch of how such an observation vector might be assembled follows the list):
- The goal position in Cartesian space
- Its end-effector's Center of Mass (COM) Cartesian position
- The normalized vector expressing the orientation of the end-effector's fingers
- The normalized vector pointing from the end-effector's COM towards the goal location
- The set of the robot's current joint angles in radians
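For illustration only, this observation could be assembled roughly as sketched below; the function and variable names are placeholders and do not reflect the exact code in customRobotEnv.py:

```python
import numpy as np

def build_observation(goal_pos, ee_com, finger_axis, joint_angles):
    """Illustrative assembly of the agent's observation from the quantities listed above."""
    to_goal = goal_pos - ee_com
    to_goal = to_goal / np.linalg.norm(to_goal)               # normalized vector towards the goal
    finger_axis = finger_axis / np.linalg.norm(finger_axis)   # normalized finger orientation
    return np.concatenate([goal_pos, ee_com, finger_axis, to_goal, joint_angles])
```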
In order to achieve this goal, the physics simulation software Pybullet is employed, in which a Franka Emika Panda, i.e. a robotic arm, is simulated.
The simulated arm is controlled by a Proximal Policy Optimization (PPO) Reinforcement Learning agent, where the implementation of the PPO algorithm is provided by Stable-Baselines.
A customized Gym environment, called PandaRobotEnv and defined in customRobotEnv.py, acts as a bridge between the Pybullet simulation and the PPO agent.
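As a rough, simplified sketch of this bridge pattern (not the actual PandaRobotEnv; observation and action dimensions, reward, and simulation setup are placeholders):

```python
import gym
import numpy as np
import pybullet as p
from stable_baselines import PPO2

class MinimalPandaLikeEnv(gym.Env):
    """Toy Gym environment wrapping a Pybullet simulation (illustrative only)."""

    def __init__(self):
        self.client = p.connect(p.DIRECT)   # headless physics server
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(7,), dtype=np.float32)
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(19,), dtype=np.float32)

    def reset(self):
        p.resetSimulation(physicsClientId=self.client)   # load the robot and goal here
        return np.zeros(19, dtype=np.float32)

    def step(self, action):
        # Interpret the action as changes to the joint angles, step the physics,
        # and compute a reward from the distance and orientation towards the goal.
        p.stepSimulation(physicsClientId=self.client)
        obs, reward, done = np.zeros(19, dtype=np.float32), 0.0, False
        return obs, reward, done, {}

if __name__ == "__main__":
    model = PPO2("MlpPolicy", MinimalPandaLikeEnv(), verbose=1)
    model.learn(total_timesteps=10000)
```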
The model of the robotic arm is provided by pybullet_robots.
When designing the aforementioned PandaRobotEnv, the KukaGymEnv, which is included in this repository as a reference environment and originally shipped with the Pybullet installation, served as inspiration for some core functions.
However, the Gym environment's functionality has been thoroughly redesigned and augmented to meet our custom goals and to be compatible with both Stable-Baselines' PPO implementation and the Franka Emika Panda.
This repository contains functionality to train PPO agents on controlling a Franka Emika Panda in joint space using different reward functions and modes, as well as to both visually render and record the performance of trained PPO agents performing their assigned task.
Furthermore, drawing upon a separate repository devoted to the evaluation of this project, which is included as a Git-submodule, the repository contains a set of trained agents, the evaluation of their training outcomes, and the functionality used to perform the evaluation.
An example video showing the evolution of the training progress of one trained PPO agent can be found on YouTube.
Note: The repository has been set up using Python 3.
Software needed for running the code used in this project can be installed using pip as follows:
pip install tensorflow
pip install pybullet
pip install stable-baselines
pip install argparse
For recording videos of trained agents, additional software is needed. Under Ubuntu, it can be installed with the following command:
sudo apt-get install ffmpeg
To use the included submodules containing the robot models, trained models, and evaluation data, they have to be loaded manually by executing the following command once:
git submodule update --init --recursive
To get updated versions of the submodules at some later point, call:
git submodule update --recursive --remote
Note: In case of problems with the submodules, check out StackOverflow
In the following, the separate functionalities are briefly introduced.
For training a new or existing PPO agent, the main.py file can be used.
The file takes two optional arguments when being started:
- -p: A path to a JSON file containing parameter specifications to be used for training.
- -r: A path to a trained model which is to be loaded for the continuation of its training. When continuing training, a new folder is created and the counting of weight updates starts at 0 again. However, the trained model is reused, and the path to the read-in model is recorded in the documentation of used parameters, which is stored in both params.csv and params.json.
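A minimal sketch of how these two flags could be handled (the long option names are invented for the sketch, and the actual argument handling in main.py may differ):

```python
import argparse
import json

parser = argparse.ArgumentParser(description="Train a PPO agent on the Panda environment.")
parser.add_argument("-p", "--params", default=None,
                    help="Path to a JSON file with training parameter specifications.")
parser.add_argument("-r", "--resume", default=None,
                    help="Path to a trained model whose training is to be continued.")
args = parser.parse_args()

params = {}
if args.params is not None:
    with open(args.params) as f:
        params = json.load(f)   # e.g. reward mode, number of time steps, learning rate
```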
For the training process, a folder Results/models_unique_folder is created in the repository, where models_unique_folder is a unique identifier for each model.
This folder contains all data associated with the model's training process, such as checkpoints and documentation files.
Example: Starting training a new agent with parameter settings specified in the file params_6.json:
python3 main.py -p ParameterSettings/params_6.json
Example: Starting training a new agent with default parameter settings:
python3 main.py
A trained model can be visually inspected using the run_trained_model.py file.
When starting the script, 0, 1, or 2 arguments can be provided.
Example: Observe how a given trained default model performs:
python3 run_trained_model.py
Example: Run a specific model provided to the code as an argument:
python3 run_trained_model.py Evaluation_CognitiveRobotics_Robo_Control/Results/PPO2/PandaController_2019_08_11__15_41_05__262730fzyxnprhgl/final_model.zip
Example: Run a specific model provided to the code as a first(!) argument for 1000 time steps given as a second(!) argument:
python3 run_trained_model.py Evaluation_CognitiveRobotics_Robo_Control/Results/PPO2/PandaController_2019_08_11__15_41_05__262730fzyxnprhgl/final_model.zip 1000
Note: In case that the data cannot be found, make sure to load the submodules (as described above).
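Loading a saved Stable-Baselines model and stepping it through the environment generally follows the pattern below; the model path is a placeholder and the PandaRobotEnv constructor arguments are omitted, so treat this as a sketch rather than the script's exact code:

```python
from stable_baselines import PPO2
from customRobotEnv import PandaRobotEnv   # assumption: importable from this repository

model = PPO2.load("final_model.zip")       # placeholder path to a trained model
env = PandaRobotEnv()                      # constructor arguments omitted for brevity

obs = env.reset()
for _ in range(1000):                      # number of time steps to watch
    action, _states = model.predict(obs, deterministic=True)
    obs, _reward, done, _info = env.step(action)
    if done:
        obs = env.reset()
```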
record_video_of_performing_trained_model.py is the file to record video sequences of a trained agent.
It will create the file structure VideoRecordings/model_name/Recording_date_some_info.mp4.
It can be called without any arguments to record videos of a default model. Alternatively, it can also be called given an argument, which is supposed to be a path to a trained model.
Example: Record a video sequence of a default model:
python3 record_video_of_performing_trained_model.py
Example: Record a video sequence of a specific model:
python3 record_video_of_performing_trained_model.py Evaluation_CognitiveRobotics_Robo_Control/Results/PPO2/PandaController_2019_08_11__15_41_05__262730fzyxnprhgl/final_model.zip
Note: By default, all video sequences are supposed to last 1000 time steps of the simulation.
To change this, adjust the value of VIDEO_LENGTH in said file.
However, due to technical issues, videos still tend to encompass more time steps than the provided number.
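For reference, recording with Stable-Baselines typically relies on wrapping a vectorized environment in a VecVideoRecorder, roughly as sketched below; this shows the general pattern, not necessarily the exact approach taken in record_video_of_performing_trained_model.py, and the model path is a placeholder:

```python
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv, VecVideoRecorder
from customRobotEnv import PandaRobotEnv   # assumption: importable from this repository

VIDEO_LENGTH = 1000                        # requested number of recorded time steps

model = PPO2.load("final_model.zip")       # placeholder path to a trained model
env = DummyVecEnv([lambda: PandaRobotEnv()])
env = VecVideoRecorder(env, "VideoRecordings/",
                       record_video_trigger=lambda step: step == 0,
                       video_length=VIDEO_LENGTH)

obs = env.reset()
for _ in range(VIDEO_LENGTH + 1):
    action, _ = model.predict(obs)
    obs, _, _, _ = env.step(action)
env.close()                                # finalizes the .mp4 via ffmpeg
```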
The included git-submodule Evaluation_CognitiveRobotics_Robo_Control contains a set of trained models, the evaluations of both the training and the final training outcome,
and the tools used for the evaluation.
The tools contain extensive inline comments and class definitions explaining how the evaluation is done. Feel free to consult the attached project report for an overview.
All trained models are saved in separate folders. Each folder contains training checkpoints, files describing which parameters were used for training, and documentation of the training process.
callback.py is used by the PPO agent to log training progress and to save checkpoints.
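In Stable-Baselines 2, such a callback is simply a callable that learn() invokes during training; below is a minimal sketch of the pattern (the actual callback.py is more elaborate, and the locals_ keys used here are assumptions about PPO2's training loop):

```python
import os

def checkpoint_callback(locals_, globals_):
    """Called by model.learn() during training; returning False would abort training."""
    update = locals_.get("update", 0)      # assumption: PPO2 exposes its update counter here
    if update % 50 == 0:                   # save a checkpoint every 50 weight updates
        locals_["self"].save(os.path.join("Results", "checkpoint_{}".format(update)))
    return True

# Usage: model.learn(total_timesteps=100000, callback=checkpoint_callback)
```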
start.sh is not particularly important to the project, but is the script for running the training process on the University's Peregrine cluster.
It has been kept for the convenience of the developers.
kukaGymEnv.py served as inspiration for designing our own Gym environment. It is copied from the example environments shipped with the Pybullet installation and kept for comparison.
That's it. Have fun with the repository!