jgallagher
Contributor

This PR adds a copy of the omdb binary to the Nexus zone, so that an omdb running in the switch zone can fetch a potentially different omdb binary from a running Nexus instance. In steady state this is useless, but during an update the switch zone is updated much earlier than the database schema or Nexus, which makes it difficult to observe the progress of the update. As a pretty cheap workaround, shipping omdb with Nexus means we can always grab a version that matches the currently active Nexus (which will always match the current database schema).

Closes #9075, which has more details about the kinds of errors we can see mid-update.

@jgallagher
Contributor Author

Testing on a4x2. This is the "not really useful" case, but it confirms that all the plumbing is correct:

root@oxz_switch:~# omdb nexus fetch-omdb
error: the following required arguments were not provided:
  <OUTPUT>

Usage: omdb nexus fetch-omdb <OUTPUT>

For more information, try '--help'.
root@oxz_switch:~# omdb nexus fetch-omdb my-omdb
note: Nexus URL not specified.  Will pick one from DNS.
note: using DNS server for subnet fd00:1122:3344::/48
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:1122:3344:101::6]:12232
root@oxz_switch:~# ls -lh
total 114809
-rwx------   1 root     root        114M Oct  8 17:11 my-omdb
root@oxz_switch:~# ./my-omdb
Omicron debugger (unstable)

Usage: my-omdb [OPTIONS] <COMMAND>

Commands:
  crucible-agent   Debug a specific crucible-agent
  crucible-pantry  Query a specific crucible-pantry
  db               Query the control plane database (CockroachDB)
  mgs              Debug a specific Management Gateway Service instance
  nexus            Debug a specific Nexus instance
  oximeter         Query oximeter collector state
  oxql             Enter the Oximeter Query Language shell for interactive querying
  reconfigurator   Interact with the Reconfigurator system
  sled-agent       Debug a specific Sled
  help             Print this message or the help of the given subcommand(s)

Options:
      --log-level <LOG_LEVEL>  log level filter [env: LOG_LEVEL=] [default: warn]
      --color <COLOR>          Color output [default: auto] [possible values: auto, always, never]
  -h, --help                   Print help (see more with '--help')

Connection Options:
      --dns-server <DNS_SERVER>  [env: OMDB_DNS_SERVER=]

Safety Options:
  -w, --destructive  Allow potentially-destructive subcommands
root@oxz_switch:~# sha256sum ./my-omdb
641fd64213c0e5b848cea307429f28eb66366f872e6cbb4ccf840c6296cac3b4  ./my-omdb
root@oxz_switch:~# sha256sum `which omdb`
641fd64213c0e5b848cea307429f28eb66366f872e6cbb4ccf840c6296cac3b4  /opt/oxide/omdb/bin/omdb
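The checksum comparison at the end of that session could be scripted roughly as follows. This is a minimal sketch: `same_sha256` is a hypothetical helper name, not part of omdb, and it assumes `sha256sum` and `awk` are available wherever it runs.

```shell
# Hypothetical helper (not part of omdb): succeed only if both files exist
# and have identical SHA-256 digests.
same_sha256() {
    # sha256sum prints "<digest>  <path>"; keep only the digest field.
    a=$(sha256sum "$1" | awk '{print $1}')
    b=$(sha256sum "$2" | awk '{print $1}')
    # An empty digest means sha256sum could not read the file; treat that
    # as a mismatch rather than comparing two empty strings.
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}
```

With the files from the session above, `same_sha256 ./my-omdb "$(which omdb)"` would be expected to succeed in steady state, since both digests shown are identical.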

Collaborator

davepacheco left a comment


What do you think of adding a step to the end-to-end tests that uses omdb nexus fetch-omdb and runs that omdb to do something basic?

@jgallagher
Contributor Author

> What do you think of adding a step to the end-to-end tests that uses omdb nexus fetch-omdb and runs that omdb to do something basic?

The existing end-to-end tests only exercise the public API, AFAICT, but it was easy to add a check to the deploy job directly: ba33090
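The contents of that commit are not reproduced in this thread. A check of this kind might look roughly like the sketch below. It uses only the `omdb nexus fetch-omdb <OUTPUT>` invocation and the `--help` option shown in the session above; the `check_fetched_omdb` function name, the `OMDB` variable, and the choice of `--help` as the "something basic" are assumptions made for illustration.

```shell
# Hypothetical deploy-job check (a sketch, not the contents of ba33090).
# OMDB names the omdb already installed in the zone; it can be overridden.
OMDB=${OMDB:-omdb}

check_fetched_omdb() {
    out=$(mktemp -d) || return 1
    # Fetch the omdb binary shipped with the running Nexus.
    if ! "$OMDB" nexus fetch-omdb "$out/omdb"; then
        rm -rf "$out"
        return 1
    fi
    # Run the fetched binary to do something basic: print its help text.
    "$out/omdb" --help >/dev/null
    rc=$?
    rm -rf "$out"
    return "$rc"
}
```

The function returns nonzero if either the fetch or the basic invocation fails, so a job step could call it directly and fail on error.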

jgallagher merged commit 8c683c7 into main on Oct 9, 2025
17 checks passed
jgallagher deleted the john/omdb-nexus-fetch-omdb branch on October 9, 2025 21:16
Development

Successfully merging this pull request may close these issues.

Update order: omdb is out of sync with Nexus for the majority of a Nexus-driven update