Add Kubernetes Registry proposal with catalog system design #1641
Hey @dmartinol, thank you so much for your contribution!
Thanks for the detailed proposals! Let me try to give a quick summary of what I got so far:
Changes proposed to ToolHive (thv):
- Registry API continues to be part of ToolHive’s API server
- Can register multiple registries (file-based JSON, API, git-backed (still JSON probably), etc.)
- Search works across all registries
- ToolHive keeps a cached copy of registries and refreshes them periodically
- Includes a trusted catalogue, with APIs for submitting and reviewing (approve/deny) promotions of MCP servers taken from external registries
- Templated MCP servers (can you share more about how they differ from a regular registry entry?)
Changes related to Kubernetes:
- A separate Registry API server is available for others to use
- New Registry Controller handles MCPRegistry CRDs for different registry types (remote, file-based, etc.) and feeds them to the Registry API (presuming a separate server/service?)
- It also manages cache refreshes for all configured registries
I'm happy to chat more, but first I wanted to make sure I understand the main changes being proposed and what they affect 👍
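To make the summary above concrete, here is a minimal sketch of searching across several registries while keeping a cached copy that is refreshed after a TTL. It is not ToolHive code: `CachedRegistry`, `search`, the `fetch` callback, and the `name` field are all hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CachedRegistry:
    """One registered source plus its locally cached copy (hypothetical model)."""
    name: str
    fetch: Callable[[], List[dict]]  # loads entries from a file, API, or git checkout
    ttl_seconds: float = 300.0
    _entries: List[dict] = field(default_factory=list)
    _fetched_at: float = float("-inf")  # forces a fetch on first use

    def entries(self, now: float) -> List[dict]:
        # Refresh the cached copy only when it is at least ttl_seconds old.
        if now - self._fetched_at >= self.ttl_seconds:
            self._entries = self.fetch()
            self._fetched_at = now
        return self._entries


def search(registries: List[CachedRegistry], term: str,
           now: Optional[float] = None) -> List[dict]:
    """Search every registry and tag each hit with the registry it came from."""
    now = time.monotonic() if now is None else now
    term = term.lower()
    hits = []
    for reg in registries:
        for entry in reg.entries(now):
            if term in entry["name"].lower():
                hits.append({**entry, "registry": reg.name})
    return hits
```

Tagging each hit with its source registry keeps results distinguishable when two registries list a server under the same name.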
- Reference another registry's REST API endpoint as a data source
- Enables registry hierarchies and aggregation patterns across clusters
- Supports filtering and transformation of upstream registry data
- Works with any registry implementation that exposes the standard API
I suppose the standard API is the OpenAPI spec of the official MCP registry, or is it something else?
Works with any registry implementation that exposes the standard API
We can discuss the details, but it should work according to the specified `format` field, being able to digest both `upstream` and `toolhive` APIs.
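One way to picture the `format` field is as a dispatch key that selects a parser. The sketch below is only an illustration: the two payload shapes (a name-keyed map for `toolhive`, a list for `upstream`) and the normalized output are assumptions, not the actual schemas.

```python
from typing import Callable, Dict, List

Parser = Callable[[dict], List[dict]]


def parse_toolhive(payload: dict) -> List[dict]:
    # Assumed shape: {"servers": {"<name>": {...metadata...}}}
    return [{"name": name, "format": "toolhive", "raw": meta}
            for name, meta in payload.get("servers", {}).items()]


def parse_upstream(payload: dict) -> List[dict]:
    # Assumed shape: {"servers": [{"name": "<name>", ...}]}
    return [{"name": item["name"], "format": "upstream", "raw": item}
            for item in payload.get("servers", [])]


PARSERS: Dict[str, Parser] = {
    "toolhive": parse_toolhive,
    "upstream": parse_upstream,
}


def load_entries(payload: dict, fmt: str) -> List[dict]:
    """Normalize a registry payload according to its declared format."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported registry format: {fmt!r}") from None
    return parser(payload)
```

Keeping the parsers in a table means a further format could be supported by adding one entry, without touching the callers.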
```bash
# Add the official MCP community registry
thv registry add community \
--url https://registry.modelcontextprotocol.io/servers.json \
I think this is supposed to be an API (not sure they are planning to have an exported json file with all entries), but I think this covers the use case of having a remotely-hosted json file 👍
Yes, the example is probably misleading; it should be more generic, like `--url <REGISTRY_DATA_URL>/servers.json`.
onUpdate: update
```

### Creating an MCPServer from Registry Data
Just to confirm if I understand the flow correctly - a client application queries the registry API, discovers and gets the metadata for a given server, generates this CRD and then if it wants to spawn it, it creates the CRD in the k8s cluster so it can get picked up by the existing toolhive operator?
Servers can be deployed manually (with or w/o matching labels) according to the workflow you described, or using `thv run --registry ...`.
The latter case, in a k8s environment, would deploy an MCPServer instance, instead of today's raw Deployment and StatefulSet (IIRC).
In short, yes, registries are there to simplify the discovery of available servers and the management of their lifecycle (deploy and list, for now).
`MCPServerTemplate`s would go even further and define some shared settings once for all subsequently deployed instances, but I'm not sure we need to go to that level of design. Realistically, how many deployments of the same server should we expect within the same environment? I'm afraid this would just create unnecessary overhead, ending up with two resources for each deployed server (1 template, 1 server) instead of just 1 (the server).
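The manual workflow discussed above (look up a server in the registry, generate an `MCPServer` resource, create it in the cluster) can be sketched as follows. The `apiVersion` value and the `name`/`image` entry fields are assumptions made for illustration; the `registry.toolhive.io/source` annotation and `spec.image` come from the proposal's example.

```python
import json


def mcpserver_manifest(entry: dict, registry: str, namespace: str = "default") -> dict:
    """Build an MCPServer manifest from a registry entry (hypothetical entry shape)."""
    return {
        "apiVersion": "toolhive.stacklok.dev/v1alpha1",  # assumed group/version
        "kind": "MCPServer",
        "metadata": {
            "name": entry["name"],
            "namespace": namespace,
            # Records which registry the server definition was taken from.
            "annotations": {"registry.toolhive.io/source": registry},
        },
        "spec": {"image": entry["image"]},
    }


if __name__ == "__main__":
    entry = {"name": "filesystem-server",
             "image": "mcpproject/filesystem-server:latest"}
    # The JSON output could be saved to a file and applied with kubectl.
    print(json.dumps(mcpserver_manifest(entry, "upstream-community"), indent=2))
```

Once created in the cluster, the resource would be picked up by the existing operator like any other `MCPServer`.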
Thanks for pointing that out. You're aiming for consistency across Docker and Kubernetes, while my focus was on the cluster.
Not sure it is really needed for the docker environment. Is it?
Same as before. But if the previous answer is yes, then yes.
Again, in a local Docker environment, does it make any sense? In a production Kubernetes environment, we can create layers of registries with different trust levels and define access scopes to prevent untrusted servers from reaching production environments. I'm not sure this applies to the local one.
I think I lost some examples during the latest reviews, but the idea was to have a prefilled MCPServer template allowing to use template parameters to specify the actual values with a dedicated command, as we have for Templates in the openshift.io group. Reference Then, a dedicated The advantage is to specify the repeated configuration sections just once (e.g. the
👍
- **REST API**: HTTP endpoints for programmatic registry and server discovery
- **Authentication**: Integration with Kubernetes RBAC and service account tokens
- **Filtering**: Query servers by registry, category, transport type, and custom labels
- **Format Support**: Return data in both ToolHive and upstream registry formats
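The filtering capability in the list above could behave roughly like the sketch below. The entry fields (`registry`, `category`, `transport`, `labels`) and the function name are hypothetical; the proposal does not define the query schema here.

```python
from typing import Dict, Iterable, List, Optional


def filter_servers(servers: Iterable[dict], *,
                   registry: Optional[str] = None,
                   category: Optional[str] = None,
                   transport: Optional[str] = None,
                   labels: Optional[Dict[str, str]] = None) -> List[dict]:
    """Return the servers matching every criterion that was supplied."""
    wanted_labels = labels or {}
    matches = []
    for server in servers:
        if registry is not None and server.get("registry") != registry:
            continue
        if category is not None and server.get("category") != category:
            continue
        if transport is not None and server.get("transport") != transport:
            continue
        have = server.get("labels", {})
        # Every requested label must be present with the same value.
        if any(have.get(key) != value for key, value in wanted_labels.items()):
            continue
        matches.append(server)
    return matches
```

Criteria left as `None` are ignored, so calling the function with no filters returns every server.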
This might be a silly question, but how much difference is there now between the toolhive and upstream registry formats? I thought our goal was to use the upstream registry format with vendor extensions. Does toolhive here mean supporting the extensions? (Maybe this is a question for @rdimitrov.)
This was not the main goal of the PR, but I tried to match the previous proposal tracked as upstream-mcp-registry-format-support.md.
My understanding was that `thv` would keep its proprietary format and use the extension mechanisms to export using the upstream format (and vice versa), but if this is not the case it would even simplify the PR by removing the `format` conversions.
When we talk about the official MCP registry, it consists of:
- an OpenAPI schema for the registry API server (once it goes live this would be publicly available for everyone to use)
- and a json schema that covers how you describe an MCP server (the so-called server.json)
On the Toolhive side:
- We are in the process of moving our registry from our format (i.e. `ImageMetadata`) to follow the upstream format (aka server.json).
- Note that the structure of the actual registry.json file will slightly change too, but this is expected as this part is really specific to ToolHive (there's no community effort around adopting a file representation of a registry catalogue, at least not yet). Here's a preview of the new format - link.
- The above will set the foundation that will allow us to then add support for the registry API (as well as any other compliant registries). In your proposal I think this maps to a remote registry source.
- Note that ToolHive's API will probably be a superset of this too so other registry clients besides Toolhive can use it.

Declarative operation CRDs for GitOps compatibility:
- `MCPRegistryImportJob`: Declarative import operations
- `MCPRegistryExportJob`: Declarative export operations
What would be the use-case for export?
Data backup? Anyway, I agree it can be dropped for now: if the original data source is not mutable from the registry itself (we only import and sync), then it's probably useless.
annotations:
registry.toolhive.io/source: upstream-community
spec:
image: "mcpproject/filesystem-server:latest"
Shouldn't this say something like `mcpproject/upstream-community/filesystem-server:latest`?
In other words, would this point to `filesystem-server` from `upstream-community` in the `mcpproject` namespace?
@dmartinol Shouldn't this say something like mcpproject/upstream-community/filesystem-server:latest ?
in other words, would this point to filesystem-server from upstream-community in the mcpproject namespace?
I think it means something like the `mcpproject/filesystem-server:latest` image on Docker Hub (or the default container registry). I don't think we want to hold an image registry inside the MCPRegistry `upstream-community`, right?
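For context on why `mcpproject/filesystem-server:latest` reads as a Docker Hub image, here is a simplified sketch of the usual Docker-style reference defaulting (no registry host means `docker.io`, no tag means `latest`). The helper name is hypothetical and digest references (`@sha256:...`) are deliberately not handled.

```python
def normalize_image(ref: str) -> str:
    """Expand a short image reference to registry/repository:tag form."""
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        # No tag was given, or the last colon belonged to a host:port prefix.
        name, tag = ref, "latest"
    first, slash, _ = name.partition("/")
    # A first path component containing "." or ":" (or "localhost") is a registry host.
    if slash and ("." in first or ":" in first or first == "localhost"):
        return f"{name}:{tag}"
    if not slash:
        # Single-component names live under Docker Hub's "library" namespace.
        name = f"library/{name}"
    return f"docker.io/{name}:{tag}"
```

Under these rules the `mcpproject` component is a Docker Hub namespace, not a reference to an `MCPRegistry` resource.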
Sorry, ignore me; I think I was conflating two things. I thought one of the goals was to also be able to point to a "reference" or "record" from the registry instead of an image.
As agreed in previous conversations, I share here a link to a design document for an initial MVP#1, feel free to comment!
Signed-off-by: Daniele Martinoli <[email protected]>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #1641 +/- ##
==========================================
+ Coverage 39.98% 40.06% +0.08%
==========================================
Files 180 180
Lines 20911 20911
==========================================
+ Hits 8361 8378 +17
+ Misses 11938 11918 -20
- Partials 612 615 +3
☔ View full report in Codecov by Sentry.
Related to: #1114
A proposal to define a Kubernetes deployment of the ToolHive Registry.
Goals