modelplane - an AI evaluator development platform

⚠️ Content warning

The sample datasets provided in the flightpaths/data directory are truncated versions of the datasets provided here. These data come with the following warning:

This dataset was created to elicit hazardous responses. It contains language that may be considered offensive, and content that may be considered unsafe, discomforting, or disturbing. Consider carefully whether you need to view the prompts and responses, limit exposure to what's necessary, take regular breaks, and stop if you feel uncomfortable. For more information on the risks, see this literature review on vicarious trauma.

Quickstart

You must have Docker Engine installed on your system. The provided docker-compose.yaml file defines the following services, which run locally:

mlflow tracking server + postgres
jupyter

First, clone this repo:

git clone https://github.com/mlcommons/modelplane.git
cd modelplane

If you plan to share notebooks, clone modelplane-flights as well. Both modelplane and modelplane-flights should be in the same directory.
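
The sibling layout described above can be checked with a short sketch like the one below. The function name check_sibling_repos is hypothetical and not part of modelplane; it only tests that both directories exist under the same parent.

```shell
# Hypothetical helper (not part of modelplane): succeed only if both
# checkouts sit under the same parent directory.
check_sibling_repos() {
    parent="${1:-..}"   # defaults to the parent of the current directory
    for repo in modelplane modelplane-flights; do
        if [ ! -d "$parent/$repo" ]; then
            echo "missing: $parent/$repo" >&2
            return 1
        fi
    done
    echo "ok: both repositories found under $parent"
}
```

From inside the modelplane checkout, calling check_sibling_repos with no argument looks in the parent directory.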

Finally, set up secrets for accessing SUTs, as needed in modelplane/flightpaths/config/secrets.toml. See modelbench for more details.

Running jupyter locally against the MLCommons mlflow server.

Ensure you have access to the MLCommons mlflow tracking and artifact server. If not, email [email protected] for access.
Modify .env.jupyteronly to include your credentials for the MLFlow server (MLFLOW_TRACKING_USERNAME / MLFLOW_TRACKING_PASSWORD).
- Alternatively, put the credentials in ~/.mlflow/credentials as described here.
To access modelbench-private code (assuming you have access), you must also set USE_MODELBENCH_PRIVATE=true in .env.jupyteronly. This forwards your SSH agent to the container so that it can fetch the private repository when building the image.
Start jupyter with ./start_jupyter.sh. (You can add the -d flag to start in the background.)
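
For the credentials-file alternative mentioned above, a minimal sketch follows. It assumes the standard MLflow credentials format (an [mlflow] section with mlflow_tracking_username and mlflow_tracking_password keys). CRED_DIR is a hypothetical override variable introduced here so the sketch can be tried without touching a real home directory, and the username and password values are placeholders.

```shell
# CRED_DIR is a hypothetical override for this sketch; by default the file
# goes to ~/.mlflow/credentials.
CRED_DIR="${CRED_DIR:-$HOME/.mlflow}"
CRED_FILE="$CRED_DIR/credentials"

mkdir -p "$CRED_DIR"

if [ -e "$CRED_FILE" ]; then
    # Never clobber credentials that are already in place.
    echo "Not overwriting existing $CRED_FILE"
else
    # Placeholder values: replace them with the credentials issued to you.
    cat > "$CRED_FILE" <<'EOF'
[mlflow]
mlflow_tracking_username = your-username
mlflow_tracking_password = your-password
EOF
    # Restrict the file to the current user, since it holds a password.
    chmod 600 "$CRED_FILE"
    echo "Wrote $CRED_FILE"
fi
```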

Running jupyter and mlflow locally.

Adjust the .env file as needed. The committed .env / docker-compose.yaml will bring up mlflow, postgres, jupyter, and set up mlflow to use a local disk for artifact storage.
Start services with ./start_services.sh. (You can add the -d flag to start in the background.)
- If you are using the CLI only, and not Jupyter, you must pass the --no-jupyter option: ./start_services.sh -d --no-jupyter

Getting started in JupyterLab.

Visit the Jupyter Server. The token is configured in the .env file. You shouldn't need to enter it more than once (until the server is restarted). You can get started with the template notebook or create a new one.
You should see the flights directory, which leads to the modelplane-flights repository. Create a user directory for yourself (flights/users/{username}) and either copy an existing flightpath there or create a notebook from scratch.
- You can manage branches and commits for modelplane-flights directly from jupyter.

Caching

Annotator and SUT responses will be cached (locally) unless you pass the disable_cache flag to the appropriate calls.

CLI

You can also interact with modelplane via CLI. Run poetry run modelplane --help for more details.

Important: You must set the MLFLOW_TRACKING_URI environment variable. For example, if you've brought up MLflow using the fully local docker compose process above, you could run:

MLFLOW_TRACKING_URI=http://localhost:8080 poetry run modelplane get-sut-responses --sut_id {sut_id} --prompts tests/data/prompts.csv --experiment expname

After running the command, the run_id appears in the output from MLflow. You can also find the run_id in the MLflow UI.
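
Rather than prefixing every command with MLFLOW_TRACKING_URI, the variable can be exported once per shell session. The sketch below assumes the local URL from the example above. The mp function is a hypothetical convenience wrapper, not part of modelplane; it refuses to run when the variable is empty.

```shell
# Export once so every later command in this shell session picks it up.
# The default matches the fully local docker compose example; adjust as needed.
export MLFLOW_TRACKING_URI="${MLFLOW_TRACKING_URI:-http://localhost:8080}"

# Hypothetical wrapper (not part of modelplane): fail early if the tracking
# URI is missing, otherwise pass all arguments through to the CLI.
mp() {
    if [ -z "${MLFLOW_TRACKING_URI:-}" ]; then
        echo "MLFLOW_TRACKING_URI is not set" >&2
        return 1
    fi
    poetry run modelplane "$@"
}

echo "MLflow tracking URI: $MLFLOW_TRACKING_URI"
```

With this in place, mp --help is equivalent to poetry run modelplane --help with the tracking URI already set.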

Basic Annotations

MLFLOW_TRACKING_URI=http://localhost:8080 poetry run modelplane annotate --annotator_id {annotator_id} --experiment expname --response_run_id {run_id}

Custom Ensembles

MLFLOW_TRACKING_URI=http://localhost:8080 poetry run modelplane annotate --annotator_id {annotator_id1} --annotator_id {annotator_id2} --ensemble_strategy {ensemble_strategy} --experiment expname --response_file path/to/response.csv

Private Ensemble

If you have access to the private ensemble, you can install it with the needed extras:

poetry install --extras modelbench-private

And then run annotations with:

MLFLOW_TRACKING_URI=http://localhost:8080 poetry run modelplane annotate --ensemble_id official --experiment expname --response_run_id {run_id}
