Skip to content

Extend Storage tests to multiple backends #986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 53 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
53 commits
Select commit Hold shift + click to select a range
8a07c58
start using datetimes with fractional seconds with mysql
bpkroth May 28, 2025
d6fc9e2
adjust tests to also check for fractional time
bpkroth May 28, 2025
65e7f08
Revert "start using datetimes with fractional seconds with mysql"
bpkroth May 28, 2025
b7dad42
Reapply "start using datetimes with fractional seconds with mysql"
bpkroth May 28, 2025
4bf0c5f
apply black on alembic commit
bpkroth May 28, 2025
8088734
preparing to support testing multiple backend engines for schema changes
bpkroth May 28, 2025
a2c3256
refactoring storage tests to check other db engines
bpkroth May 28, 2025
5d3f0d3
change column lengths for mysql
bpkroth May 28, 2025
01b9df4
more refactor of storage tests
bpkroth May 28, 2025
d578176
preparing to support testing multiple backend engines for schema changes
bpkroth May 28, 2025
3b3bf6c
refactoring storage tests to check other db engines
bpkroth May 28, 2025
1c9633c
change column lengths for mysql
bpkroth May 28, 2025
f5b7bf3
fixups
bpkroth May 28, 2025
7fdf06d
fixup
bpkroth May 28, 2025
1bf3c30
fixup
bpkroth May 28, 2025
bd5862f
cleanup
bpkroth May 28, 2025
a5e9c05
Merge branch 'refactor/storage-tests' into feature/mysql-schema-chang…
bpkroth May 28, 2025
67f84ea
fixup
bpkroth May 28, 2025
41199fc
format
bpkroth May 28, 2025
ef33825
Merge branch 'refactor/storage-tests' into feature/mysql-schema-chang…
bpkroth May 28, 2025
ee4a900
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 28, 2025
82220ec
fixup
bpkroth May 28, 2025
aa2621b
fixup
bpkroth May 28, 2025
a84d30b
switch to interprocesslock - already using that
bpkroth May 29, 2025
2341a10
address a lint issue
bpkroth May 29, 2025
45ad11f
restore the original alembic comments - moving to separate PR
bpkroth May 29, 2025
cfa07ea
Merge branch 'refactor/storage-tests' into feature/mysql-schema-chang…
bpkroth May 29, 2025
bbcf689
Revert "restore the original alembic comments - moving to separate PR"
bpkroth May 29, 2025
53ee6e2
more comments
bpkroth May 29, 2025
994a32a
mypy
bpkroth May 29, 2025
2faf643
pylint
bpkroth May 29, 2025
a6b8941
allow retrieving storage url from the environment
bpkroth May 29, 2025
9c5ac14
more alembic tweaks
bpkroth May 29, 2025
988bc90
remove env
bpkroth May 29, 2025
f720d0d
temporarily revert back to something like the original schema
bpkroth May 29, 2025
f250e61
Revert "temporarily revert back to something like the original schema"
bpkroth May 29, 2025
793c2e5
Reapply "temporarily revert back to something like the original schema"
bpkroth May 29, 2025
952fba0
include timezone
bpkroth May 29, 2025
d6617ba
make mysql datetimes support fractional seconds
bpkroth May 29, 2025
2f28d79
log the alembic target engine url
bpkroth May 29, 2025
1b0fb2c
engine no longer optional
bpkroth May 29, 2025
928f491
Revert "make mysql datetimes support fractional seconds"
bpkroth May 29, 2025
4863a5b
Enable alembic to detect datetime precision issues with MySQL
bpkroth Jun 2, 2025
6a697a0
Enable floating point seconds with mysql
bpkroth Jun 2, 2025
f7ccf26
Alembic script to add floating point seconds precision
bpkroth Jun 2, 2025
40374c2
fixup
bpkroth Jun 2, 2025
fe4171a
rework to only apply to mysql
bpkroth Jun 2, 2025
5a42a14
be sure to mark that version as required
bpkroth Jun 2, 2025
f273ea3
add that refactor too
bpkroth Jun 2, 2025
b1b3733
Fixups and refactors to allow two things
bpkroth Jun 2, 2025
c7146e0
Extend storage tests to check multiple backends by default
bpkroth Jun 2, 2025
faeb0e3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 2, 2025
ca0b565
Merge branch 'main' into tests/extend-storage-tests-to-multiple-backends
bpkroth Jun 5, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion mlos_bench/mlos_bench/config/storage/mysql.jsonc
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
"log_sql": false, // Write all SQL statements to the log.
// Parameters below must match kwargs of `sqlalchemy.URL.create()`:
"drivername": "mysql+mysqlconnector",
"database": "osat",
"database": "mlos_bench",
"username": "root",
"password": "PLACERHOLDER PASSWORD", // Comes from global config
"host": "localhost",
Expand Down
2 changes: 1 addition & 1 deletion mlos_bench/mlos_bench/config/storage/postgresql.jsonc
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
"log_sql": false, // Write all SQL statements to the log.
// Parameters below must match kwargs of `sqlalchemy.URL.create()`:
"drivername": "postgresql+psycopg2",
"database": "osat",
"database": "mlos_bench",
"username": "postgres",
"password": "PLACERHOLDER PASSWORD", // Comes from global config
"host": "localhost",
Expand Down
75 changes: 67 additions & 8 deletions mlos_bench/mlos_bench/storage/base_storage.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,10 +25,12 @@
from __future__ import annotations

import logging
import os
from abc import ABCMeta, abstractmethod
from collections.abc import Iterator, Mapping
from contextlib import AbstractContextManager as ContextManager
from datetime import datetime
from subprocess import CalledProcessError
from types import TracebackType
from typing import Any, Literal

Expand All @@ -38,7 +40,7 @@
from mlos_bench.services.base_service import Service
from mlos_bench.storage.base_experiment_data import ExperimentData
from mlos_bench.tunables.tunable_groups import TunableGroups
from mlos_bench.util import get_git_info
from mlos_bench.util import get_git_info, get_git_root, path_join

_LOG = logging.getLogger(__name__)

Expand Down Expand Up @@ -187,16 +189,61 @@ def __init__( # pylint: disable=too-many-arguments
tunables: TunableGroups,
experiment_id: str,
trial_id: int,
root_env_config: str,
root_env_config: str | None,
description: str,
opt_targets: dict[str, Literal["min", "max"]],
git_repo: str | None = None,
git_commit: str | None = None,
rel_root_env_config: str | None = None,
):
self._tunables = tunables.copy()
self._trial_id = trial_id
self._experiment_id = experiment_id
(self._git_repo, self._git_commit, self._root_env_config) = get_git_info(
root_env_config
)
if root_env_config is None:
# Restoring from DB.
if not (git_repo and git_commit and rel_root_env_config):
raise ValueError(
"Missing required args: git_repo, git_commit, rel_root_env_config"
)
self._git_repo = git_repo
self._git_commit = git_commit
self._rel_root_env_config = rel_root_env_config

# Currently we only store the relative path of the root env config
# from the git repo it came from.
git_root = git_repo
if not os.path.exists(git_root):
try:
git_root = get_git_root(os.curdir)
except CalledProcessError:
_LOG.warning(
"Failed to find a git repo in the current working directory: %s",
os.curdir,
)
git_root = get_git_root(__file__)

self._abs_root_env_config = path_join(
git_root,
self._rel_root_env_config,
abs_path=True,
)
_LOG.info(
"Resolved relative root_config %s for experiment %s to %s",
self._rel_root_env_config,
self._experiment_id,
self._abs_root_env_config,
)
else:
if git_repo or git_commit or rel_root_env_config:
raise ValueError("Unexpected args: git_repo, git_commit, rel_root_env_config")
(
self._git_repo,
self._git_commit,
self._rel_root_env_config,
self._abs_root_env_config,
) = get_git_info(
root_env_config,
)
self._description = description
self._opt_targets = opt_targets
self._in_context = False
Expand Down Expand Up @@ -278,9 +325,21 @@ def description(self) -> str:
return self._description

@property
def root_env_config(self) -> str:
"""Get the Experiment's root Environment config file path."""
return self._root_env_config
def rel_root_env_config(self) -> str:
"""Get the Experiment's root Environment config's relative file path to the
git repo root.
"""
return self._rel_root_env_config

@property
def abs_root_env_config(self) -> str:
"""
Get the Experiment's root Environment config file path.

This returns the current absolute path to the root config for this process
instead of the path relative to the git repo root.
"""
return self._abs_root_env_config

@property
def tunables(self) -> TunableGroups:
Expand Down
11 changes: 7 additions & 4 deletions mlos_bench/mlos_bench/storage/sql/alembic.ini
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,10 @@ version_path_separator = os # Use os.pathsep. Default configuration used for ne
# output_encoding = utf-8

# See README.md for details.
# Uncomment one of these:
sqlalchemy.url = sqlite:///mlos_bench.sqlite
#sqlalchemy.url = mysql+mysqlconnector://root:password@localhost:3306/mlos_bench
#sqlalchemy.url = postgresql+psycopg2://root:password@localhost:5432/mlos_bench


[post_write_hooks]
Expand All @@ -72,10 +75,10 @@ sqlalchemy.url = sqlite:///mlos_bench.sqlite
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME
hooks = black
black.type = console_scripts
black.entrypoint = black
black.options = REVISION_SCRIPT_FILENAME

# lint with attempts to fix using "ruff" - use the exec runner, execute a binary
# hooks = ruff
Expand Down
113 changes: 104 additions & 9 deletions mlos_bench/mlos_bench/storage/sql/alembic/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,49 +4,144 @@ This document contains some notes on how to use [`alembic`](https://alembic.sqla

## Overview

1. Create a blank `mlos_bench.sqlite` database file in the [`mlos_bench/storage/sql`](../) directory with the current schema using the following command:
1. Create a blank database instance in the [`mlos_bench/storage/sql`](../) directory with the current schema using the following command:

```sh
cd mlos_bench/storage/sql
rm mlos_bench.sqlite
mlos_bench --storage storage/sqlite.jsonc --create-update-storage-schema-only
```
This allows `alembic` to automatically generate a migration script from the current schema.

> NOTE: If your schema changes target a particular backend engine (e.g., using `with_variant`) you will need to use an engine with that config for this step.
> \
> In the remainder of this document we should some examples for different DB types.
> Pick the one you're targeting and stick with it thru the example.
> You may need to repeat the process several times to test all of them.
>
> - [ ] TODO: Add scripts to automatically do this for several different backend engines all at once.

For instance:

1. Start a temporary server either as a local file or in a docker instance

```sh
# sqlite
cd mlos_bench/storage/sql
rm -f mlos_bench.sqlite
```

```sh
# mysql
docker run -it --rm --name mysql-alembic --env MYSQL_ROOT_PASSWORD=password --env MYSQL_DATABASE=mlos_bench -p 3306:3306 mysql:latest
```

```sh
# postgres
docker run -it --rm --name postgres-alembic --env POSTGRES_PASSWORD=password --env POSTGRES_DB=mlos_bench -p 5432:5432 postgres:latest
```

1. Adjust the `sqlalchemy.url` in the [`alembic.ini`](../alembic.ini) file.

> This allows `alembic` to automatically generate a migration script from the current schema.
```ini
# Uncomment one of these.
# See README.md for details.

1. Adjust the [`mlos_bench/storage/sql/schema.py`](../schema.py) file to reflect the new desired schema.
#sqlalchemy.url = sqlite:///mlos_bench.sqlite
sqlalchemy.url = mysql+pymysql://root:password@localhost:3306/mlos_bench
#sqlalchemy.url = postgresql+psycopg2://root:password@localhost:5432/mlos_bench
```

1. Prime the DB schema

> Note: you may want to `git checkout main` first to make sure you're using the current schema here.

```sh
# sqlite
mlos_bench --storage storage/sqlite.jsonc --create-update-storage-schema-only --password=password
```

```sh
# mysql
mlos_bench --storage storage/mysql.jsonc --create-update-storage-schema-only --password=password
```

```sh
# postgres
mlos_bench --storage storage/postgresql.jsonc --create-update-storage-schema-only --password=password
```

1. Now, adjust the [`mlos_bench/storage/sql/schema.py`](../schema.py) file to reflect the new desired schema.

> Don't forget to do this on a new branch.
> \
> Keep each change small and atomic.
> \
> For example, if you want to add a new column, do that in one change.
> If you want to rename a column, do that in another change.

1. Generate a new migration script with the following command:

```sh
alembic revision --autogenerate -m "Descriptive text about the change."
alembic revision --autogenerate -m "CHANGEME: Descriptive text about the change."
```

1. Review the generated migration script in the [`mlos_bench/storage/sql/alembic/versions`](./versions/) directory.

1. Verify that the migration script works by running the following command:

```sh
# sqlite
mlos_bench --storage storage/sqlite.jsonc --create-update-storage-schema-only
```

```sh
# mysql:
mlos_bench --storage storage/mysql.jsonc --create-update-storage-schema-only --password=password
```

```sh
# postgres:
mlos_bench --storage storage/postgresql.jsonc --create-update-storage-schema-only --password=password
```

> Normally this would be done with `alembic upgrade head`, but this command is convenient to ensure if will work with the `mlos_bench` command line interface as well.

Examine the results using something like:

```sh
# For sqlite:
sqlite3 mlos_bench.sqlite .schema
sqlite3 mlos_bench.sqlite "SELECT * FROM alembic_version;"
```

```sh
# For mysql:
mysql --user root --password=password --host localhost --protocol tcp --database mlos_bench -e "SHOW TABLES; SELECT * FROM alembic_version;"
```

```sh
# For postgres:
PGPASSWORD=password psql -h localhost -p 5432 -U postgres mlos_bench -c "SELECT table_name FROM information_schema.tables WHERE table_schema='public' and table_catalog='mlos_bench'; SELECT * FROM alembic_version;"
```

> Use different CLI clients for targeting other engines.

1. If the migration script works, commit the changes to the [`mlos_bench/storage/sql/schema.py`](../schema.py) and [`mlos_bench/storage/sql/alembic/versions`](./versions/) files.

> Be sure to update the latest version in the [`test_storage_schemas.py`](../../../tests/storage/test_storage_schemas.py) file as well.

1. Cleanup any server instances you started.

For instance:

```sh
rm mlos_bench/storage/sql/mlos_bench.sqlite
```

```sh
docker kill mysql-alembic
```

```sh
docker kill postgres-alembic
```

1. Merge that to the `main` branch.

1. Might be good to cut a new `mlos_bench` release at this point as well.
Loading
Loading