[Ready For Review][AQUA] Add Supporting Fine-Tuned Models in Multi-Model Deployment #1186
Conversation
…Base Model for Fine-Tuned Models (#1185)
…ation to Support FT Models (#1188) Co-authored-by: Liz Johnson <[email protected]>
lgtm 👍
ads/aqua/app.py
Outdated
@@ -284,8 +284,11 @@ def if_artifact_exist(self, model_id: str, **kwargs) -> bool:
            logger.info(f"Artifact not found in model {model_id}.")
            return False

    @cached(cache=TTLCache(maxsize=1, ttl=timedelta(minutes=1), timer=datetime.now))
Isn't a 1 min TTL too short here? It makes sense if we get multiple requests within a 1-minute interval, but we could probably extend it to 5 mins or so, so that the cached config is still returned while a user tries out different model combinations for a bit.
Done
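For reference, a minimal sketch of what the suggested change could look like, assuming the TTL is bumped from 1 to 5 minutes (the function name below is a placeholder, not the actual method from this PR):

```python
from datetime import datetime, timedelta

from cachetools import TTLCache, cached


# Cache the config for 5 minutes so repeated lookups within a short session
# (e.g. a user trying out different model combinations) hit the cache.
@cached(cache=TTLCache(maxsize=1, ttl=timedelta(minutes=5), timer=datetime.now))
def get_deployment_config(model_id: str) -> dict:
    # Placeholder body; in the PR this pattern decorates a method on the Aqua app class.
    ...
```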
ads/aqua/app.py
Outdated
raise AquaRuntimeError(f"Target model {oci_model.id} is not an Aqua model.")
logger.debug(f"Target model {oci_model.id} is not an Aqua model.")
return ModelConfigResult(config=config, model_details=oci_model)
# raise AquaRuntimeError(f"Target model {oci_model.id} is not an Aqua model.")
nit: remove commented code
Done
        total_gpus_available (int, optional): The total number of GPUs available for this shape.
    """

    models: Optional[List[GPUModelAllocation]] = Field(
Do we need to add protected_namespaces = () in Config, since we have an attribute here starting with model*? That would avoid warnings when running CLI commands.
Done
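A minimal sketch of the suggested fix, assuming Pydantic v2 (the class name below is a placeholder; the PR applies this to its own allocation and deployment detail models):

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class FineTunedModelSpecExample(BaseModel):
    # Field names starting with "model_" collide with Pydantic's protected
    # "model_" namespace and emit a UserWarning; clearing protected_namespaces
    # silences it.
    model_config = ConfigDict(protected_namespaces=())

    model_id: Optional[str] = Field(None, description="The fine-tuned model OCID to deploy.")
```

The class-based form (class Config: protected_namespaces = ()) has the same effect if that is the style already used in the codebase.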
for model_id, config in deployment_configs.items():
    # For multi model deployment, we cannot rely on .shape because some models, like Falcon-7B, can only be deployed on a single GPU card (A10.1).
    # However, Falcon can also be deployed on a single card in other A10 shapes, such as A10.2.
    # Our current configuration does not support this flexibility.
Ideally we should have relied only on the configuration for shape info, but this makes sense.
added minor comment, rest looks good.
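To illustrate the constraint being discussed, here is a rough sketch (not code from this PR) of picking a per-model GPU count on a multi-GPU shape, assuming each model's deployment config lists the GPU counts it supports:

```python
from typing import List


def pick_gpu_count(supported_gpu_counts: List[int], total_gpus_available: int) -> int:
    """Return the largest supported GPU count that fits on the target shape."""
    fitting = [g for g in supported_gpu_counts if g <= total_gpus_available]
    if not fitting:
        raise ValueError("Model cannot be allocated on this shape.")
    return max(fitting)


# Falcon-7B only lists a single-card shape (A10.1), but it can still occupy one
# card of an A10.2 shape alongside another model in a multi-model deployment.
assert pick_gpu_count([1], total_gpus_available=2) == 1
```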
        The model-by-reference path to the LoRA Module within the model artifact
    """

    model_id: Optional[str] = Field(None, description="The fine tuned model OCID to deploy.")
needs protected_namespaces = () in config to avoid warning messages showing up when running via CLI.
Description
The current implementation of Multi-Model Deployment in AQUA supports base models only. Fine-tuned models, however, are a critical part of many customer workflows, allowing customers to adapt base models to domain-specific use cases.
This PR introduces support for deploying fine-tuned LLMs as part of a multi-model deployment group on the VLLM container.
Implementation
In the first iteration, we will treat each selected model, whether it is a base model or a fine-tuned variant, as an independent entity. Even if multiple fine-tuned models share the same base model, each one will be deployed in its own isolated VLLM instance.
On the SMC side, we will leverage VLLM's capability to dynamically apply LoRA adapter weights at runtime. This means each VLLM instance will load the base model and its corresponding fine-tuned weights independently.
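As a rough illustration of the runtime LoRA mechanism this relies on (the model name and adapter path below are placeholders, not values from this PR), vLLM can serve a fine-tuned variant by attaching its LoRA adapter on top of the base weights:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# One vLLM instance loads the base model with LoRA support enabled.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)

# The fine-tuned variant is served by attaching its adapter weights at request
# time; the base weights themselves are left untouched.
outputs = llm.generate(
    ["Summarize the quarterly report."],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("finance-adapter", 1, "/opt/lora/finance"),
)
```

The OpenAI-compatible vLLM server exposes the same capability through --enable-lora and --lora-modules, which is closer to how a deployment container would wire this up.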
To avoid routing conflicts caused by multiple instances using the same base model name, we will route the base model name to one instance only, but we will not advertise the base model as an endpoint to users (this matches the current behavior of Single Model Deployment).
This configuration structure will prepare us for future enhancements, such as stacked fine-tuned deployments, where multiple fine-tuned variants are hosted under a single base model within one VLLM instance. However, this future enhancement will apply to single-model deployments initially.
In a second iteration, we will explore expanding this capability to multi-model deployments, enabling grouped deployment of fine-tuned variants with shared GPU allocation. That enhancement will require additional work across the ADS SDK, AQUA UI, and validation logic.
Related PRs