From b6962897879d95bdc80b1c737d4dd9048b467a10 Mon Sep 17 00:00:00 2001 From: Mustafa Date: Thu, 22 May 2025 23:34:42 +0000 Subject: [PATCH 01/13] update compose.yaml Signed-off-by: Mustafa update the documents Signed-off-by: Mustafa update the documents Signed-off-by: Mustafa document update Signed-off-by: Mustafa update the test file Signed-off-by: Mustafa update the test file Signed-off-by: Mustafa update README Signed-off-by: Mustafa update README Signed-off-by: Mustafa --- FinanceAgent/README.md | 182 +++---------- .../docker_compose/intel/hpu/gaudi/README.md | 184 ++++++++++++++ .../intel/hpu/gaudi/compose.yaml | 239 ++++++++++++------ .../intel/hpu/gaudi/dataprep_compose.yaml | 82 ------ .../intel/hpu/gaudi/launch_agents.sh | 36 --- .../intel/hpu/gaudi/launch_dataprep.sh | 15 -- .../intel/hpu/gaudi/launch_vllm.sh | 7 - .../intel/hpu/gaudi/vllm_compose.yaml | 35 --- FinanceAgent/docker_compose/intel/set_env.sh | 99 ++++++++ FinanceAgent/tests/test_compose_on_gaudi.sh | 63 ++--- 10 files changed, 509 insertions(+), 433 deletions(-) create mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/README.md delete mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml delete mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh delete mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh delete mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh delete mode 100644 FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml create mode 100644 FinanceAgent/docker_compose/intel/set_env.sh diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md index 64ce01cc0a..bb464579fa 100644 --- a/FinanceAgent/README.md +++ b/FinanceAgent/README.md @@ -1,6 +1,30 @@ -# Finance Agent +# Finance Agent Example -## 1. Overview +## Table of Contents + +- [Overview](#overview) +- [Problem Motivation](#problem-motivation) +- [Architecture](#architecture) + - [High-Level Diagram](#high-level-diagram) + - [OPEA Microservices Diagram](#opea-microservices-diagram) +- [Deployment Options](#deployment-options) +- [Contribution](#contribution) + + + +## Overview + +The Finance Agent example showcases a hierarchical multi-agent system designed to assist users with financial document processing and analysis. It provides three main functionalities: summarizing lengthy financial documents, answering queries related to financial documents, and conducting research to generate investment reports on public companies. + +Users interact with the system via a graphical user interface (UI), where requests are managed by a supervisor agent that delegates tasks to worker agents or the summarization microservice. The system supports document uploads through the UI for processing. + + +## Problem Motivation +Navigating and analyzing extensive financial documents can be challenging and time-consuming. Users often require concise summaries, answers to specific queries, or comprehensive investment reports. The Finance Agent addresses these needs by automating document summarization, query answering, and research tasks, thereby enhancing productivity and decision-making efficiency. + +## Architecture +### High-Level Diagram +The Finance Agent system is structured as a hierarchical multi-agent architecture. User interactions are managed by a supervisor agent, which coordinates tasks among worker agents and the summarization microservice. The system supports document uploads and processing through the UI. 
The architecture of this Finance Agent example is shown in the figure below. The agent is a hierarchical multi-agent system and has 3 main functions: @@ -12,6 +36,7 @@ The user interacts with the supervisor agent through the graphical UI. The super ![Finance Agent Architecture](assets/finance_agent_arch.png) +### OPEA Microservices Diagram The architectural diagram of the `dataprep` microservice is shown below. We use [docling](https://github.com/docling-project/docling) to extract text from PDFs and URLs into markdown format. Both the full document content and tables are extracted. We then use an LLM to extract metadata from the document, including the company name, year, quarter, document type, and document title. The full document markdown then gets chunked, and LLM is used to summarize each chunk, and the summaries are embedded and saved to a vector database. Each table is also summarized by LLM and the summaries are embedded and saved to the vector database. The chunks and tables are also saved into a KV store. The pipeline is designed as such to improve retrieval accuracy of the `search_knowledge_base` tool used by the Question Answering worker agent. ![dataprep architecture](assets/fin_agent_dataprep.png) @@ -30,154 +55,17 @@ The Question Answering worker agent uses `search_knowledge_base` tool to get rel ![finqa search tool arch](assets/finqa_tool.png) -## 2. Getting started - -### 2.1 Download repos - -```bash -mkdir /path/to/your/workspace/ -export WORKDIR=/path/to/your/workspace/ -cd $WORKDIR -git clone https://github.com/opea-project/GenAIExamples.git -``` - -### 2.2 Set up env vars - -```bash -export ip_address="External_Public_IP" -export no_proxy=${your_no_proxy},${ip_address} -export HF_CACHE_DIR=/path/to/your/model/cache/ -export HF_TOKEN= -export FINNHUB_API_KEY= # go to https://finnhub.io/ to get your free api key -export FINANCIAL_DATASETS_API_KEY= # go to https://docs.financialdatasets.ai/ to get your free api key -``` - -### 2.3 [Optional] Build docker images - -Only needed when docker pull failed. - -```bash -cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build -# get GenAIComps repo -git clone https://github.com/opea-project/GenAIComps.git -# build the images -docker compose -f build.yaml build --no-cache -``` - -If deploy on Gaudi, also need to build vllm image. - -```bash -cd $WORKDIR -git clone https://github.com/HabanaAI/vllm-fork.git -# get the latest release tag of vllm gaudi -cd vllm-fork -VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)") -echo "Check out vLLM tag ${VLLM_VER}" -git checkout ${VLLM_VER} -docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -``` - -## 3. Deploy with docker compose - -### 3.1 Launch vllm endpoint - -Below is the command to launch a vllm endpoint on Gaudi that serves `meta-llama/Llama-3.3-70B-Instruct` model on 4 Gaudi cards. - -```bash -cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi -bash launch_vllm.sh -``` - -### 3.2 Prepare knowledge base - -The commands below will upload some example files into the knowledge base. You can also upload files through UI. - -First, launch the redis databases and the dataprep microservice. 
- -```bash -# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ -bash launch_dataprep.sh -``` - -Validate datat ingest data and retrieval from database: - -```bash -python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest -python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get -``` - -### 3.3 Launch the multi-agent system - -The command below will launch 3 agent microservices, 1 docsum microservice, 1 UI microservice. - -```bash -# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ -bash launch_agents.sh -``` - -### 3.4 Validate agents - -FinQA Agent: - -```bash -export agent_port="9095" -prompt="What is Gap's revenue in 2024?" -python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port -``` - -Research Agent: - -```bash -export agent_port="9096" -prompt="generate NVDA financial research report" -python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" -``` - -Supervisor Agent single turns: - -```bash -export agent_port="9090" -python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream -``` - -Supervisor Agent multi turn: - -```bash -python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream - -``` - -## How to interact with the agent system with UI - -The UI microservice is launched in the previous step with the other microservices. -To see the UI, open a web browser to `http://${ip_address}:5175` to access the UI. Note the `ip_address` here is the host IP of the UI microservice. - -1. Create Admin Account with a random value - -2. Enter the endpoints in the `Connections` settings - - First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`. - - Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host ip of the agent microservice. - - Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it into green and then enter the endpoint url, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host ip of the dataprep microservice. - - You should see screen like the screenshot below when the settings are done. - -![opea-agent-setting](assets/ui_connections_settings.png) - -3. Upload documents with UI - - Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste an url in the left hand side of the pop-up window, or upload a local file by click on the cloud icon on the right hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait till the processing is done and the pop-up window will be closed on its own when the data ingestion is done. See the screenshot below. - Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window. 
+## Deployment Options +This CodeGen example can be deployed manually on various hardware platforms using Docker Compose or Kubernetes. Select the appropriate guide based on your target environment: -![upload-doc-ui](assets/upload_doc_ui.png) +| Hardware | Deployment Mode | Guide Link | +| :-------------- | :------------------- | :----------------------------------------------------------------------- | +| Intel Gaudi HPU | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md) | -4. Test agent with UI +_Note: Building custom microservice images can be done using the resources in [GenAIComps](https://github.com/opea-project/GenAIComps)._ - After the settings are done and documents are ingested, you can start to ask questions to the agent. Click on the `New Chat` icon in the top left corner, and type in your questions in the text box in the middle of the UI. - The UI will stream the agent's response tokens. You need to expand the `Thinking` tab to see the agent's reasoning process. After the agent made tool calls, you would also see the tool output after the tool returns output to the agent. Note: it may take a while to get the tool output back if the tool execution takes time. +## Contribution +We welcome contributions to the OPEA project. Please refer to the contribution guidelines for more information. -![opea-agent-test](assets/opea-agent-test.png) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md new file mode 100644 index 0000000000..db2e123ad4 --- /dev/null +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md @@ -0,0 +1,184 @@ +# Deploy Finance Agent on Intel Gaudi HPU with Docker Compose +This README provides instructions for deploying the Finance Agent application using Docker Compose on systems equipped with Intel Gaudi HPUs. + +## Table of Contents + +- [Overview](#overview) +- [Prerequisites](#prerequisites) +- [Start Deployment](#start-deployment) +- [Validate Services](#validate-services) +- [Accessing the User Interface (UI)](#accessing-the-user-interface-ui) + +## Overview + +This guide focuses on running the pre-configured Finance Agent service using Docker Compose on Intel Gaudi HPUs. It leverages containers optimized for Gaudi for the LLM serving component, along with CPU-based containers for other microservices like embedding, retrieval, data preparation and the UI. + +## Prerequisites +- Docker and Docker Compose installed. +- Intel Gaudi HPU(s) with the necessary drivers and software stack installed on the host system. (Refer to Intel Gaudi Documentation). +- Git installed (for cloning repository). +- Hugging Face Hub API Token (for downloading models). +- Access to the internet (or a private model cache). + +Clone the GenAIExamples repository: + +```shell + mkdir /path/to/your/workspace/ + export WORKDIR=/path/to/your/workspace/ + cd $WORKDIR + git clone https://github.com/opea-project/GenAIExamples.git + cd GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi +``` + +## Start Deployment +This uses the default vLLM-based deployment profile (vllm-gaudi-server). 
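+Before configuring the environment, you can optionally verify that the Gaudi devices are visible on the host. This is only a sanity check; `hl-smi` ships with the Gaudi driver/software stack listed in the prerequisites:
+
+```bash
+# List Gaudi devices, their driver status, and current utilization.
+# If this command fails, revisit the Gaudi software stack installation.
+hl-smi
+```
+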
+### Configure Environment
+Set the required environment variables in your shell:
+
+```shell
+    # Replace with your Hugging Face Hub API token
+    export HF_TOKEN="your_huggingface_token"
+    # Path to your model cache
+    export HF_CACHE_DIR="./data"
+    # Go to https://finnhub.io/ to get your free api key
+    export FINNHUB_API_KEY=
+    # Go to https://docs.financialdatasets.ai/ to get your free api key
+    export FINANCIAL_DATASETS_API_KEY=
+
+    # Optional: Configure HOST_IP if needed
+    # Replace with your host's external IP address (do not use localhost or 127.0.0.1).
+    # export HOST_IP=$(hostname -I | awk '{print $1}')
+    # Optional: Configure proxy if needed
+    # export http_proxy="your_http_proxy"
+    # export https_proxy="your_https_proxy"
+    # export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+
+    source ../../set_env.sh
+```
+
+Note: The compose file may read additional variables from set_env.sh. Ensure that all required variables, such as the service ports (LLM_SERVICE_PORT, TEI_EMBEDDER_PORT, etc.), are set if you are not using the defaults from the compose file. For instance, edit set_env.sh if you want to change the LLM model.
+
+### Start Services
+#### Deploy with Docker Compose
+Below is the command to launch the following services:
+  - vllm-gaudi-server
+  - tei-embedding-serving
+  - redis-vector-db
+  - redis-kv-store
+  - dataprep-redis-server-finance
+  - finqa-agent-endpoint
+  - research-agent-endpoint
+  - docsum-vllm-gaudi
+  - supervisor-agent-endpoint
+  - agent-ui
+
+```shell
+    docker compose up -d
+```
+
+#### [Optional] Build docker images
+
+This is only needed if `docker pull` fails.
+
+```bash
+    cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build
+    # get GenAIComps repo
+    git clone https://github.com/opea-project/GenAIComps.git
+    # build the images
+    docker compose -f build.yaml build --no-cache
+```
+
+If deploying on Gaudi, you also need to build the vLLM image.
+
+```bash
+    cd $WORKDIR
+    git clone https://github.com/HabanaAI/vllm-fork.git
+    # get the latest release tag of vllm gaudi
+    cd vllm-fork
+    VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)")
+    echo "Check out vLLM tag ${VLLM_VER}"
+    git checkout ${VLLM_VER}
+    docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
+```
+
+## Validate Services
+Wait several minutes for the models to download and the services to initialize (Gaudi initialization can take time). Check the container logs with `docker compose logs -f <service_name>`, especially for `vllm-gaudi-server`:
+```bash
+    docker logs --tail 2000 -f vllm-gaudi-server
+```
+
+### Validate Data Services
+Ingest example data and validate retrieval from the database:
+
+```bash
+    python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest
+    python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get
+```
+
+### Validate Agents
+
+FinQA Agent:
+
+```bash
+    export agent_port="9095"
+    prompt="What is Gap's revenue in 2024?"
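+    # test.py posts this prompt to the worker agent's chat completions endpoint
+    # (WORKER_FINQA_AGENT_URL in set_env.sh). A raw request would look roughly like
+    # the sketch below; the payload schema is an assumption, so it is left commented out:
+    # curl -s http://localhost:9095/v1/chat/completions \
+    #   -H "Content-Type: application/json" \
+    #   -d "{\"messages\": \"$prompt\", \"stream\": \"false\"}"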
+ python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port +``` + +Research Agent: + +```bash + export agent_port="9096" + prompt="generate NVDA financial research report" + python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" +``` + +Supervisor Agent single turns: + +```bash + export agent_port="9090" + python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream +``` + +Supervisor Agent multi turn: + +```bash + python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream +``` + +## Accessing the User Interface (UI) + +The UI microservice is launched in the previous step with the other microservices. +To see the UI, open a web browser to `http://${ip_address}:5175` to access the UI. Note the `ip_address` here is the host IP of the UI microservice. + +1. Create Admin Account with a random value + +2. Enter the endpoints in the `Connections` settings + + First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`. + + Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host ip of the agent microservice. + + Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it into green and then enter the endpoint url, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host ip of the dataprep microservice. + + You should see screen like the screenshot below when the settings are done. + +![opea-agent-setting](../../../../assets/ui_connections_settings.png) + +3. Upload documents with UI + + Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste an url in the left hand side of the pop-up window, or upload a local file by click on the cloud icon on the right hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait till the processing is done and the pop-up window will be closed on its own when the data ingestion is done. See the screenshot below. + + Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window. + +![upload-doc-ui](../../../../assets/upload_doc_ui.png) + +4. Test agent with UI + + After the settings are done and documents are ingested, you can start to ask questions to the agent. Click on the `New Chat` icon in the top left corner, and type in your questions in the text box in the middle of the UI. + + The UI will stream the agent's response tokens. You need to expand the `Thinking` tab to see the agent's reasoning process. After the agent made tool calls, you would also see the tool output after the tool returns output to the agent. Note: it may take a while to get the tool output back if the tool execution takes time. 
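+
+   If you prefer to test from the command line instead of the UI, a minimal sketch of the same interaction is shown below. It assumes the supervisor agent endpoint is OpenAI-compatible (which is why it can be registered under `OpenAI API` above) and reuses the arbitrary model id from the Connections settings; the exact payload accepted may vary:
+
+```bash
+# Sketch only: stream a question to the supervisor agent's
+# OpenAI-compatible chat completions endpoint (port 9090).
+curl -s http://${ip_address}:9090/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "opea_agent", "messages": [{"role": "user", "content": "Summarize the latest quarterly report in the knowledge base"}], "stream": true}'
+```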
+ +![opea-agent-test](../../../../assets/opea-agent-test.png) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index 997aade843..c31c92a690 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -1,37 +1,147 @@ # Copyright (C) 2024 Intel Corporation # SPDX-License-Identifier: Apache-2.0 + +x-common-environment: + &common-env + no_proxy: ${NO_PROXY} + http_proxy: ${HTTP_PROXY} + https_proxy: ${HTTPS_PROXY} + +x-common-agent-environment: + &common-agent-env + <<: *common-env + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + llm_endpoint_url: ${LLM_ENDPOINT} + model: ${LLM_MODEL_ID} + REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} + REDIS_URL_KV: ${REDIS_URL_KV} + TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} + ip_address: ${IP_ADDRESS} + strategy: react_llama + require_human_feedback: "false" + services: + + vllm-service: + image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest} + container_name: vllm-gaudi-server + ports: + - "8086:8000" + volumes: + - ${HF_CACHE_DIR}:/data + environment: + <<: *common-env + HF_TOKEN: ${HF_TOKEN} + HUGGING_FACE_HUB_TOKEN: ${HF_TOKEN} + HF_HOME: ./data + HABANA_VISIBLE_DEVICES: all + OMPI_MCA_btl_vader_single_copy_mechanism: none + LLM_MODEL_ID: ${LLM_MODEL_ID} + VLLM_TORCH_PROFILER_DIR: "/mnt" + VLLM_SKIP_WARMUP: true + PT_HPU_ENABLE_LAZY_COLLECTIVES: true + healthcheck: + test: ["CMD-SHELL", "curl -f http://$HOST_IP:8086/health || exit 1"] + interval: 10s + timeout: 10s + retries: 100 + runtime: habana + cap_add: + - SYS_NICE + ipc: host + command: --model ${LLM_MODEL_ID} --tensor-parallel-size ${NUM_CARDS} --host 0.0.0.0 --port 8000 --max-seq-len-to-capture $MAX_LEN + + tei-embedding-serving: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 + container_name: tei-embedding-serving + entrypoint: /bin/sh -c "apt-get update && apt-get install -y curl && text-embeddings-router --json-output --model-id ${EMBEDDING_MODEL_ID} --auto-truncate" + ports: + - "${TEI_EMBEDDER_PORT:-10221}:80" + volumes: + # - "./data:/data" + - ${HF_CACHE_DIR}:/data + shm_size: 1g + environment: + <<: *common-env + HF_TOKEN: ${HF_TOKEN} + host_ip: ${HOST_IP} + healthcheck: + test: ["CMD", "curl", "-f", "http://${HOST_IP}:${TEI_EMBEDDER_PORT}/health"] + interval: 10s + timeout: 6s + retries: 48 + + redis-vector-db: + image: redis/redis-stack:7.2.0-v9 + container_name: redis-vector-db + ports: + - "${REDIS_PORT1:-6379}:6379" + - "${REDIS_PORT2:-8001}:8001" + environment: + <<: *common-env + healthcheck: + test: ["CMD", "redis-cli", "ping"] + timeout: 10s + retries: 3 + start_period: 10s + + redis-kv-store: + image: redis/redis-stack:7.2.0-v9 + container_name: redis-kv-store + ports: + - "${REDIS_PORT3:-6380}:6379" + - "${REDIS_PORT4:-8002}:8001" + environment: + <<: *common-env + healthcheck: + test: ["CMD", "redis-cli", "ping"] + timeout: 10s + retries: 3 + start_period: 10s + + dataprep-redis-finance: + image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} + container_name: dataprep-redis-server-finance + depends_on: + redis-vector-db: + condition: service_healthy + redis-kv-store: + condition: service_healthy + tei-embedding-serving: + condition: service_healthy + ports: + - "${DATAPREP_PORT:-6007}:5000" + environment: + <<: *common-env + DATAPREP_COMPONENT_NAME: ${DATAPREP_COMPONENT_NAME} + REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} + REDIS_URL_KV: ${REDIS_URL_KV} + TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} + 
LLM_ENDPOINT: ${LLM_ENDPOINT} + LLM_MODEL: ${LLM_MODEL_ID} + HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} + HF_TOKEN: ${HF_TOKEN} + LOGFLAG: true + worker-finqa-agent: image: opea/agent:latest container_name: finqa-agent-endpoint volumes: - ${TOOLSET_PATH}:/home/user/tools/ - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host ports: - "9095:9095" - ipc: host environment: - ip_address: ${ip_address} - strategy: react_llama - with_memory: false - recursion_limit: ${recursion_limit_worker} - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} + <<: *common-agent-env + with_memory: "false" + recursion_limit: ${RECURSION_LIMIT_WORKER} temperature: ${TEMPERATURE} max_new_tokens: ${MAX_TOKENS} - stream: false + stream: "false" tools: /home/user/tools/finqa_agent_tools.yaml custom_prompt: /home/user/prompts/finqa_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - REDIS_URL_VECTOR: $REDIS_URL_VECTOR - REDIS_URL_KV: $REDIS_URL_KV - TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT port: 9095 worker-research-agent: @@ -40,67 +150,20 @@ services: volumes: - ${TOOLSET_PATH}:/home/user/tools/ - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host ports: - "9096:9096" - ipc: host environment: - ip_address: ${ip_address} - strategy: react_llama - with_memory: false - recursion_limit: 25 - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} - stream: false + <<: *common-agent-env + with_memory: "false" + recursion_limit: ${RECURSION_LIMIT_WORKER} + stream: "false" tools: /home/user/tools/research_agent_tools.yaml custom_prompt: /home/user/prompts/research_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} FINNHUB_API_KEY: ${FINNHUB_API_KEY} FINANCIAL_DATASETS_API_KEY: ${FINANCIAL_DATASETS_API_KEY} port: 9096 - supervisor-react-agent: - image: opea/agent:latest - container_name: supervisor-agent-endpoint - depends_on: - - worker-finqa-agent - - worker-research-agent - volumes: - - ${TOOLSET_PATH}:/home/user/tools/ - - ${PROMPT_PATH}:/home/user/prompts/ - ports: - - "9090:9090" - ipc: host - environment: - ip_address: ${ip_address} - strategy: react_llama - with_memory: true - recursion_limit: ${recursion_limit_supervisor} - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} - temperature: ${TEMPERATURE} - max_new_tokens: ${MAX_TOKENS} - stream: true - tools: /home/user/tools/supervisor_agent_tools.yaml - custom_prompt: /home/user/prompts/supervisor_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - WORKER_FINQA_AGENT_URL: $WORKER_FINQA_AGENT_URL - WORKER_RESEARCH_AGENT_URL: $WORKER_RESEARCH_AGENT_URL - DOCSUM_ENDPOINT: $DOCSUM_ENDPOINT - REDIS_URL_VECTOR: $REDIS_URL_VECTOR - REDIS_URL_KV: $REDIS_URL_KV - TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT - port: 9090 - docsum-vllm-gaudi: image: opea/llm-docsum:latest container_name: docsum-vllm-gaudi @@ -108,9 +171,7 @@ services: - ${DOCSUM_PORT:-9000}:9000 ipc: host environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} + <<: *common-env LLM_ENDPOINT: ${LLM_ENDPOINT} LLM_MODEL_ID: ${LLM_MODEL_ID} HF_TOKEN: ${HF_TOKEN} @@ -120,14 +181,40 @@ services: 
DocSum_COMPONENT_NAME: ${DocSum_COMPONENT_NAME:-OpeaDocSumvLLM} restart: unless-stopped + supervisor-react-agent: + image: opea/agent:latest + container_name: supervisor-agent-endpoint + volumes: + - ${TOOLSET_PATH}:/home/user/tools/ + - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host + depends_on: + - worker-finqa-agent + - worker-research-agent + ports: + - "9090:9090" + environment: + <<: *common-agent-env + with_memory: "true" + recursion_limit: ${RECURSION_LIMIT_SUPERVISOR} + temperature: ${TEMPERATURE} + max_new_tokens: ${MAX_TOKENS} + stream: "true" + tools: /home/user/tools/supervisor_agent_tools.yaml + custom_prompt: /home/user/prompts/supervisor_prompt.py + WORKER_FINQA_AGENT_URL: ${WORKER_FINQA_AGENT_URL} + WORKER_RESEARCH_AGENT_URL: ${WORKER_RESEARCH_AGENT_URL} + DOCSUM_ENDPOINT: ${DOCSUM_ENDPOINT} + port: 9090 + agent-ui: image: opea/agent-ui:latest container_name: agent-ui environment: - host_ip: ${host_ip} - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} + <<: *common-env + host_ip: ${HOST_IP} ports: - "5175:8080" ipc: host + + diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml deleted file mode 100644 index 5e4333c7d2..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml +++ /dev/null @@ -1,82 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -services: - tei-embedding-serving: - image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 - container_name: tei-embedding-serving - entrypoint: /bin/sh -c "apt-get update && apt-get install -y curl && text-embeddings-router --json-output --model-id ${EMBEDDING_MODEL_ID} --auto-truncate" - ports: - - "${TEI_EMBEDDER_PORT:-10221}:80" - volumes: - - "./data:/data" - shm_size: 1g - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - host_ip: ${host_ip} - HF_TOKEN: ${HF_TOKEN} - healthcheck: - test: ["CMD", "curl", "-f", "http://${host_ip}:${TEI_EMBEDDER_PORT}/health"] - interval: 10s - timeout: 6s - retries: 48 - - redis-vector-db: - image: redis/redis-stack:7.2.0-v9 - container_name: redis-vector-db - ports: - - "${REDIS_PORT1:-6379}:6379" - - "${REDIS_PORT2:-8001}:8001" - environment: - - no_proxy=${no_proxy} - - http_proxy=${http_proxy} - - https_proxy=${https_proxy} - healthcheck: - test: ["CMD", "redis-cli", "ping"] - timeout: 10s - retries: 3 - start_period: 10s - - redis-kv-store: - image: redis/redis-stack:7.2.0-v9 - container_name: redis-kv-store - ports: - - "${REDIS_PORT3:-6380}:6379" - - "${REDIS_PORT4:-8002}:8001" - environment: - - no_proxy=${no_proxy} - - http_proxy=${http_proxy} - - https_proxy=${https_proxy} - healthcheck: - test: ["CMD", "redis-cli", "ping"] - timeout: 10s - retries: 3 - start_period: 10s - - dataprep-redis-finance: - image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} - container_name: dataprep-redis-server-finance - depends_on: - redis-vector-db: - condition: service_healthy - redis-kv-store: - condition: service_healthy - tei-embedding-serving: - condition: service_healthy - ports: - - "${DATAPREP_PORT:-6007}:5000" - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - DATAPREP_COMPONENT_NAME: ${DATAPREP_COMPONENT_NAME} - REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} - REDIS_URL_KV: ${REDIS_URL_KV} - TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} - LLM_ENDPOINT: ${LLM_ENDPOINT} - LLM_MODEL: ${LLM_MODEL} - 
HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} - HF_TOKEN: ${HF_TOKEN} - LOGFLAG: true diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh deleted file mode 100644 index 55dcbb7d3d..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh +++ /dev/null @@ -1,36 +0,0 @@ - -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export ip_address=$(hostname -I | awk '{print $1}') -export HUGGINGFACEHUB_API_TOKEN=${HF_TOKEN} -export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ -echo "TOOLSET_PATH=${TOOLSET_PATH}" -export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ -echo "PROMPT_PATH=${PROMPT_PATH}" -export recursion_limit_worker=12 -export recursion_limit_supervisor=10 - -vllm_port=8086 -export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" -export LLM_ENDPOINT_URL="http://${ip_address}:${vllm_port}" -export TEMPERATURE=0.5 -export MAX_TOKENS=4096 - -export WORKER_FINQA_AGENT_URL="http://${ip_address}:9095/v1/chat/completions" -export WORKER_RESEARCH_AGENT_URL="http://${ip_address}:9096/v1/chat/completions" - -export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" -export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" - -export MAX_INPUT_TOKENS=2048 -export MAX_TOTAL_TOKENS=4096 -export DocSum_COMPONENT_NAME="OpeaDocSumvLLM" -export DOCSUM_ENDPOINT="http://${ip_address}:9000/v1/docsum" - -export FINNHUB_API_KEY=${FINNHUB_API_KEY} -export FINANCIAL_DATASETS_API_KEY=${FINANCIAL_DATASETS_API_KEY} - -docker compose -f compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh deleted file mode 100644 index 9bb006c191..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh +++ /dev/null @@ -1,15 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export host_ip=${ip_address} -export DATAPREP_PORT="6007" -export TEI_EMBEDDER_PORT="10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" -export LLM_MODEL=$model -export LLM_ENDPOINT="http://${ip_address}:${vllm_port}" -export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" -export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" -export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}" - -docker compose -f dataprep_compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh deleted file mode 100644 index 5d8d58641b..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh +++ /dev/null @@ -1,7 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" -export MAX_LEN=16384 - -docker compose -f vllm_compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml deleted file mode 100644 index 8ca62e1e46..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml +++ /dev/null @@ -1,35 +0,0 @@ - -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -services: - vllm-service: - image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest} - 
container_name: vllm-gaudi-server - ports: - - "8086:8000" - volumes: - - ${HF_CACHE_DIR}:/data - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - HF_TOKEN: ${HF_TOKEN} - HUGGING_FACE_HUB_TOKEN: ${HF_TOKEN} - HF_HOME: /data - HABANA_VISIBLE_DEVICES: all - OMPI_MCA_btl_vader_single_copy_mechanism: none - LLM_MODEL_ID: ${LLM_MODEL_ID} - VLLM_TORCH_PROFILER_DIR: "/mnt" - VLLM_SKIP_WARMUP: true - PT_HPU_ENABLE_LAZY_COLLECTIVES: true - healthcheck: - test: ["CMD-SHELL", "curl -f http://$host_ip:8086/health || exit 1"] - interval: 10s - timeout: 10s - retries: 100 - runtime: habana - cap_add: - - SYS_NICE - ipc: host - command: --model $LLM_MODEL_ID --tensor-parallel-size 4 --host 0.0.0.0 --port 8000 --max-seq-len-to-capture $MAX_LEN diff --git a/FinanceAgent/docker_compose/intel/set_env.sh b/FinanceAgent/docker_compose/intel/set_env.sh new file mode 100644 index 0000000000..9576cc5547 --- /dev/null +++ b/FinanceAgent/docker_compose/intel/set_env.sh @@ -0,0 +1,99 @@ +#!/usr/bin/env bash + +# Copyright (C) 2024 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +# Navigate to the parent directory and source the environment +pushd "../../" > /dev/null +source .set_env.sh +popd > /dev/null + +# Function to check if a variable is set +check_var() { + local var_name="$1" + local var_value="${!var_name}" + if [ -z "${var_value}" ]; then + echo "Error: ${var_name} is not set. Please set ${var_name}." + return 1 # Return an error code but do not exit the script + fi +} + +# Set IP address +export IP_ADDRESS=$(hostname -I | awk '{print $1}') +export HOST_IP="${IP_ADDRESS}" + +# Check critical variables +check_var "HF_TOKEN" +check_var "HOST_IP" + +# Proxy settings +export NO_PROXY="${NO_PROXY},${HOST_IP}" +export HTTP_PROXY="${http_proxy}" +export HTTPS_PROXY="${https_proxy}" + +# VLLM configuration +export VLLM_PORT="${VLLM_PORT:-8086}" +export VLLM_VOLUME="${VLLM_VOLUME:-/data2/huggingface}" +export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" +export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" +export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" +export MAX_LEN="${MAX_LEN:-16384}" +export NUM_CARDS="${NUM_CARDS:-4}" +export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" + +# Data preparation and embedding configuration +export DATAPREP_PORT="${DATAPREP_PORT:-6007}" +export TEI_EMBEDDER_PORT="${TEI_EMBEDDER_PORT:-10221}" +export REDIS_URL_VECTOR="redis://${IP_ADDRESS}:6379" +export REDIS_URL_KV="redis://${IP_ADDRESS}:6380" +export DATAPREP_COMPONENT_NAME="${DATAPREP_COMPONENT_NAME:-OPEA_DATAPREP_REDIS_FINANCE}" +export EMBEDDING_MODEL_ID="${EMBEDDING_MODEL_ID:-BAAI/bge-base-en-v1.5}" +export TEI_EMBEDDING_ENDPOINT="http://${IP_ADDRESS}:${TEI_EMBEDDER_PORT}" + +# Hugging Face API token +export HUGGINGFACEHUB_API_TOKEN="${HF_TOKEN}" + +# Recursion limits +export RECURSION_LIMIT_WORKER="${RECURSION_LIMIT_WORKER:-12}" +export RECURSION_LIMIT_SUPERVISOR="${RECURSION_LIMIT_SUPERVISOR:-10}" + +# LLM configuration +export TEMPERATURE="${TEMPERATURE:-0.5}" +export MAX_TOKENS="${MAX_TOKENS:-4096}" +export MAX_INPUT_TOKENS="${MAX_INPUT_TOKENS:-2048}" +export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS:-4096}" + +# Worker URLs +export WORKER_FINQA_AGENT_URL="http://${IP_ADDRESS}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${IP_ADDRESS}:9096/v1/chat/completions" + +# DocSum configuration +export DOCSUM_COMPONENT_NAME="${DOCSUM_COMPONENT_NAME:-"OpeaDocSumvLLM"}" +export 
DOCSUM_ENDPOINT="http://${IP_ADDRESS}:9000/v1/docsum" + +# API keys +check_var "FINNHUB_API_KEY" +check_var "FINANCIAL_DATASETS_API_KEY" +export FINNHUB_API_KEY="${FINNHUB_API_KEY}" +export FINANCIAL_DATASETS_API_KEY="${FINANCIAL_DATASETS_API_KEY}" + + +# Toolset and prompt paths +if check_var "WORKDIR"; then + export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ + export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ + + echo "TOOLSET_PATH=${TOOLSET_PATH}" + echo "PROMPT_PATH=${PROMPT_PATH}" + + # Array of directories to check + REQUIRED_DIRS=("${TOOLSET_PATH}" "${PROMPT_PATH}") + + for dir in "${REQUIRED_DIRS[@]}"; do + if [ ! -d "${dir}" ]; then + echo "Error: Required directory does not exist: ${dir}" + exit 1 + fi + done +fi + diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 0f42813978..cc6d5944ed 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -6,29 +6,28 @@ set -xe export WORKPATH=$(dirname "$PWD") export WORKDIR=$WORKPATH/../../ echo "WORKDIR=${WORKDIR}" -export ip_address=$(hostname -I | awk '{print $1}') +export IP_ADDRESS=$(hostname -I | awk '{print $1}') LOG_PATH=$WORKPATH #### env vars for LLM endpoint ############# -model=meta-llama/Llama-3.3-70B-Instruct -vllm_image=opea/vllm-gaudi:latest -vllm_port=8086 -vllm_image=$vllm_image +MODEL=meta-llama/Llama-3.3-70B-Instruct +VLLM_IMAGE=opea/vllm-gaudi:latest +VLLM_PORT=8086 HF_CACHE_DIR=${model_cache:-"/data2/huggingface"} -vllm_volume=${HF_CACHE_DIR} +VLLM_VOLUME=${HF_CACHE_DIR} ####################################### #### env vars for dataprep ############# -export host_ip=${ip_address} +export hOST_IP=${IP_ADDRESS} export DATAPREP_PORT="6007" export TEI_EMBEDDER_PORT="10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" -export LLM_MODEL=$model -export LLM_ENDPOINT="http://${ip_address}:${vllm_port}" +export REDIS_URL_VECTOR="redis://${IP_ADDRESS}:6379" +export REDIS_URL_KV="redis://${IP_ADDRESS}:6380" +export LLM_MODEL=$MODEL +export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" -export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}" +export TEI_EMBEDDING_ENDPOINT="http://${IP_ADDRESS}:${TEI_EMBEDDER_PORT}" ####################################### @@ -62,12 +61,12 @@ function build_vllm_docker_image() { VLLM_FORK_VER=v0.6.6.post1+Gaudi-1.20.0 git checkout ${VLLM_FORK_VER} &> /dev/null - docker build --no-cache -f Dockerfile.hpu -t $vllm_image --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy + docker build --no-cache -f Dockerfile.hpu -t $VLLM_IMAGE --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy if [ $? 
-ne 0 ]; then - echo "$vllm_image failed" + echo "$VLLM_IMAGE failed" exit 1 else - echo "$vllm_image successful" + echo "$VLLM_IMAGE successful" fi } @@ -75,8 +74,8 @@ function build_vllm_docker_image() { function start_vllm_service_70B() { echo "token is ${HF_TOKEN}" echo "start vllm gaudi service" - echo "**************model is $model**************" - docker run -d --runtime=habana --rm --name "vllm-gaudi-server" -e HABANA_VISIBLE_DEVICES=all -p $vllm_port:8000 -v $vllm_volume:/data -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host $vllm_image --model ${model} --max-seq-len-to-capture 16384 --tensor-parallel-size 4 + echo "**************MODEL is $MODEL**************" + docker run -d --runtime=habana --rm --name "vllm-gaudi-server" -e HABANA_VISIBLE_DEVICES=all -p $VLLM_PORT:8000 -v $VLLM_VOLUME:/data -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host $VLLM_IMAGE --model ${MODEL} --max-seq-len-to-capture 16384 --tensor-parallel-size 4 sleep 10s echo "Waiting vllm gaudi ready" n=0 @@ -104,8 +103,8 @@ function stop_llm(){ } -function start_dataprep(){ - docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml up -d +function start_dataprep_and_agent(){ + docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d tei-embedding-serving redis-vector-db redis-kv-store dataprep-redis-finance worker-finqa-agent worker-research-agent docsum-vllm-gaudi supervisor-react-agent agent-ui sleep 1m } @@ -155,14 +154,6 @@ function stop_dataprep() { } -function start_agents() { - echo "Starting Agent services" - cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ - bash launch_agents.sh - sleep 2m -} - - function validate_agent_service() { # # test worker finqa agent echo "======================Testing worker finqa agent======================" @@ -249,21 +240,23 @@ echo "=================== #2 Start vllm endpoint====================" start_vllm_service_70B echo "=================== #2 vllm endpoint started====================" -echo "=================== #3 Start dataprep and ingest data ====================" -start_dataprep +echo "=================== #3 Start data and agent services ====================" +start_dataprep_and_agent +echo "=================== #3 data and agent endpoint started====================" + +echo "=================== #4 Validate ingest_validate_dataprep ====================" ingest_validate_dataprep -echo "=================== #3 Data ingestion and validation completed====================" +echo "=================== #4 Data ingestion and validation completed====================" -echo "=================== #4 Start agents ====================" -start_agents +echo "=================== #5 Start agents ====================" validate_agent_service -echo "=================== #4 Agent test passed ====================" +echo "=================== #5 Agent test passed ====================" -echo "=================== #5 Stop microservices ====================" +echo "=================== #6 Stop microservices ====================" stop_agent_docker 
stop_dataprep stop_llm -echo "=================== #5 Microservices stopped====================" +echo "=================== #6 Microservices stopped====================" echo y | docker system prune From f54e7f0e06da47f7ba5cf73e558e97d4566b534e Mon Sep 17 00:00:00 2001 From: Mustafa Date: Sat, 24 May 2025 01:51:28 +0000 Subject: [PATCH 02/13] update README Signed-off-by: Mustafa --- FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index c31c92a690..72e54a9eb9 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -178,7 +178,7 @@ services: LOGFLAG: ${LOGFLAG:-False} MAX_INPUT_TOKENS: ${MAX_INPUT_TOKENS} MAX_TOTAL_TOKENS: ${MAX_TOTAL_TOKENS} - DocSum_COMPONENT_NAME: ${DocSum_COMPONENT_NAME:-OpeaDocSumvLLM} + DocSum_COMPONENT_NAME: ${DOCSUM_COMPONENT_NAME:-OpeaDocSumvLLM} restart: unless-stopped supervisor-react-agent: From fe5baf852c9bfb62398b5a9c1bfb6fc3be3e53ea Mon Sep 17 00:00:00 2001 From: Mustafa Date: Sat, 24 May 2025 02:17:54 +0000 Subject: [PATCH 03/13] update compose.yaml Signed-off-by: Mustafa --- FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index 72e54a9eb9..85e2393681 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -29,7 +29,7 @@ services: ports: - "8086:8000" volumes: - - ${HF_CACHE_DIR}:/data + - ${HF_CACHE_DIR:-./data}:/data environment: <<: *common-env HF_TOKEN: ${HF_TOKEN} @@ -59,8 +59,7 @@ services: ports: - "${TEI_EMBEDDER_PORT:-10221}:80" volumes: - # - "./data:/data" - - ${HF_CACHE_DIR}:/data + - ${HF_CACHE_DIR:-./data}:/data shm_size: 1g environment: <<: *common-env From ea35bedaed4f1212ba37a90ccad42e0881c0221e Mon Sep 17 00:00:00 2001 From: Mustafa Date: Tue, 27 May 2025 20:38:45 +0000 Subject: [PATCH 04/13] update test variables Signed-off-by: Mustafa --- .../intel/hpu/gaudi/compose.yaml | 12 +-- FinanceAgent/tests/test_compose_on_gaudi.sh | 76 ++++++++++++++----- 2 files changed, 64 insertions(+), 24 deletions(-) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index 85e2393681..8f77a5ab49 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -19,7 +19,7 @@ x-common-agent-environment: TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} ip_address: ${IP_ADDRESS} strategy: react_llama - require_human_feedback: "false" + require_human_feedback: false services: @@ -118,7 +118,7 @@ services: REDIS_URL_KV: ${REDIS_URL_KV} TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} LLM_ENDPOINT: ${LLM_ENDPOINT} - LLM_MODEL: ${LLM_MODEL_ID} + LLM_MODEL: ${LLM_MODEL_ID} HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} HF_TOKEN: ${HF_TOKEN} LOGFLAG: true @@ -134,11 +134,11 @@ services: - "9095:9095" environment: <<: *common-agent-env - with_memory: "false" + with_memory: false recursion_limit: ${RECURSION_LIMIT_WORKER} temperature: ${TEMPERATURE} max_new_tokens: ${MAX_TOKENS} - stream: "false" + stream: false tools: /home/user/tools/finqa_agent_tools.yaml custom_prompt: 
/home/user/prompts/finqa_prompt.py port: 9095 @@ -154,9 +154,9 @@ services: - "9096:9096" environment: <<: *common-agent-env - with_memory: "false" + with_memory: false recursion_limit: ${RECURSION_LIMIT_WORKER} - stream: "false" + stream: false tools: /home/user/tools/research_agent_tools.yaml custom_prompt: /home/user/prompts/research_prompt.py FINNHUB_API_KEY: ${FINNHUB_API_KEY} diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index cc6d5944ed..6cca6c66aa 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -7,31 +7,64 @@ export WORKPATH=$(dirname "$PWD") export WORKDIR=$WORKPATH/../../ echo "WORKDIR=${WORKDIR}" export IP_ADDRESS=$(hostname -I | awk '{print $1}') +export HOST_IP=${IP_ADDRESS} LOG_PATH=$WORKPATH -#### env vars for LLM endpoint ############# +# Proxy settings +export NO_PROXY="${NO_PROXY},${HOST_IP}" +export HTTP_PROXY="${http_proxy}" +export HTTPS_PROXY="${https_proxy}" + +# VLLM configuration MODEL=meta-llama/Llama-3.3-70B-Instruct -VLLM_IMAGE=opea/vllm-gaudi:latest -VLLM_PORT=8086 -HF_CACHE_DIR=${model_cache:-"/data2/huggingface"} -VLLM_VOLUME=${HF_CACHE_DIR} -####################################### +export VLLM_PORT="${VLLM_PORT:-8086}" + +# export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" +export HF_CACHE_DIR=${model_cache:-"./data2/huggingface"} +export VLLM_VOLUME="${HF_CACHE_DIR:-"./data2/huggingface"}" +export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" +export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" +export LLM_MODEL=$LLM_MODEL_ID +export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" +export MAX_LEN="${MAX_LEN:-16384}" +export NUM_CARDS="${NUM_CARDS:-4}" + +# Recursion limits +export RECURSION_LIMIT_WORKER="${RECURSION_LIMIT_WORKER:-12}" +export RECURSION_LIMIT_SUPERVISOR="${RECURSION_LIMIT_SUPERVISOR:-10}" + +# Hugging Face API token +export HUGGINGFACEHUB_API_TOKEN="${HF_TOKEN}" + +# LLM configuration +export TEMPERATURE="${TEMPERATURE:-0.5}" +export MAX_TOKENS="${MAX_TOKENS:-4096}" +export MAX_INPUT_TOKENS="${MAX_INPUT_TOKENS:-2048}" +export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS:-4096}" + +# Worker URLs +export WORKER_FINQA_AGENT_URL="http://${IP_ADDRESS}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${IP_ADDRESS}:9096/v1/chat/completions" + +# DocSum configuration +export DOCSUM_COMPONENT_NAME="${DOCSUM_COMPONENT_NAME:-"OpeaDocSumvLLM"}" +export DOCSUM_ENDPOINT="http://${IP_ADDRESS}:9000/v1/docsum" + +# Toolset and prompt paths +export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ +export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ #### env vars for dataprep ############# -export hOST_IP=${IP_ADDRESS} export DATAPREP_PORT="6007" export TEI_EMBEDDER_PORT="10221" export REDIS_URL_VECTOR="redis://${IP_ADDRESS}:6379" export REDIS_URL_KV="redis://${IP_ADDRESS}:6380" -export LLM_MODEL=$MODEL -export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" + export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" export TEI_EMBEDDING_ENDPOINT="http://${IP_ADDRESS}:${TEI_EMBEDDER_PORT}" ####################################### - - function get_genai_comps() { if [ ! 
-d "GenAIComps" ] ; then git clone --depth 1 --branch ${opea_branch:-"main"} https://github.com/opea-project/GenAIComps.git @@ -70,11 +103,10 @@ function build_vllm_docker_image() { fi } - function start_vllm_service_70B() { echo "token is ${HF_TOKEN}" echo "start vllm gaudi service" - echo "**************MODEL is $MODEL**************" + echo "**************MODEL is $LLM_MODEL_ID**************" docker run -d --runtime=habana --rm --name "vllm-gaudi-server" -e HABANA_VISIBLE_DEVICES=all -p $VLLM_PORT:8000 -v $VLLM_VOLUME:/data -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host $VLLM_IMAGE --model ${MODEL} --max-seq-len-to-capture 16384 --tensor-parallel-size 4 sleep 10s echo "Waiting vllm gaudi ready" @@ -95,7 +127,6 @@ function start_vllm_service_70B() { echo "Service started successfully" } - function stop_llm(){ cid=$(docker ps -aq --filter "name=vllm-gaudi-server") echo "Stopping container $cid" @@ -104,7 +135,17 @@ function stop_llm(){ } function start_dataprep_and_agent(){ - docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d tei-embedding-serving redis-vector-db redis-kv-store dataprep-redis-finance worker-finqa-agent worker-research-agent docsum-vllm-gaudi supervisor-react-agent agent-ui + docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d \ + tei-embedding-serving \ + redis-vector-db \ + redis-kv-store \ + dataprep-redis-finance \ + worker-finqa-agent \ + worker-research-agent \ + docsum-vllm-gaudi \ + supervisor-react-agent \ + agent-ui + sleep 1m } @@ -219,7 +260,6 @@ function stop_agent_docker() { done } - echo "workpath: $WORKPATH" echo "=================== Stop containers ====================" stop_llm @@ -232,9 +272,9 @@ echo "=================== #1 Building docker images====================" build_vllm_docker_image build_dataprep_agent_images -#### for local test +### for local test # build_agent_image_local -# echo "=================== #1 Building docker images completed====================" +echo "=================== #1 Building docker images completed====================" echo "=================== #2 Start vllm endpoint====================" start_vllm_service_70B From cce90318583b2c7b575ce716e493043fce8fd657 Mon Sep 17 00:00:00 2001 From: Mustafa Date: Wed, 28 May 2025 17:50:25 +0000 Subject: [PATCH 05/13] minor readme changes Signed-off-by: Mustafa --- FinanceAgent/README.md | 2 +- FinanceAgent/docker_compose/intel/hpu/gaudi/README.md | 10 +++++----- FinanceAgent/tests/test_compose_on_gaudi.sh | 5 +++-- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md index bb464579fa..26400c8d39 100644 --- a/FinanceAgent/README.md +++ b/FinanceAgent/README.md @@ -36,7 +36,7 @@ The user interacts with the supervisor agent through the graphical UI. The super ![Finance Agent Architecture](assets/finance_agent_arch.png) -### OPEA Microservices Diagram +### OPEA Microservices Diagram for Data Handling The architectural diagram of the `dataprep` microservice is shown below. We use [docling](https://github.com/docling-project/docling) to extract text from PDFs and URLs into markdown format. Both the full document content and tables are extracted. 
We then use an LLM to extract metadata from the document, including the company name, year, quarter, document type, and document title. The full document markdown then gets chunked, and LLM is used to summarize each chunk, and the summaries are embedded and saved to a vector database. Each table is also summarized by LLM and the summaries are embedded and saved to the vector database. The chunks and tables are also saved into a KV store. The pipeline is designed as such to improve retrieval accuracy of the `search_knowledge_base` tool used by the Question Answering worker agent. ![dataprep architecture](assets/fin_agent_dataprep.png) diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md index db2e123ad4..c005b2237b 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md @@ -151,7 +151,7 @@ Supervisor Agent multi turn: ## Accessing the User Interface (UI) The UI microservice is launched in the previous step with the other microservices. -To see the UI, open a web browser to `http://${ip_address}:5175` to access the UI. Note the `ip_address` here is the host IP of the UI microservice. +To see the UI, open a web browser to `http://${HOST_IP}:5175` to access the UI. Note the `HOST_IP` here is the host IP of the UI microservice. 1. Create Admin Account with a random value @@ -159,9 +159,9 @@ To see the UI, open a web browser to `http://${ip_address}:5175` to access the U First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`. - Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host ip of the agent microservice. + Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${HOST_IP}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `HOST_IP` here should be the host ip of the agent microservice. - Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it into green and then enter the endpoint url, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host ip of the dataprep microservice. + Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it into green and then enter the endpoint url, for example, `http://${HOST_IP}:6007/v1`. The `HOST_IP` here should be the host ip of the dataprep microservice. You should see screen like the screenshot below when the settings are done. @@ -169,8 +169,8 @@ To see the UI, open a web browser to `http://${ip_address}:5175` to access the U 3. Upload documents with UI - Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste an url in the left hand side of the pop-up window, or upload a local file by click on the cloud icon on the right hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait till the processing is done and the pop-up window will be closed on its own when the data ingestion is done. See the screenshot below. 
-
+   Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `iCloud Knowledge`. You can paste an url in the left hand side of the pop-up window, or upload a local file by click on the cloud icon on the right hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait till the processing is done and the pop-up window will be closed on its own when the data ingestion is done. See the screenshot below.

    Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window.

 ![upload-doc-ui](../../../../assets/upload_doc_ui.png)
diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh
index 6cca6c66aa..cd7ed1a84a 100644
--- a/FinanceAgent/tests/test_compose_on_gaudi.sh
+++ b/FinanceAgent/tests/test_compose_on_gaudi.sh
@@ -272,8 +272,9 @@ echo "=================== #1 Building docker images===================="
 build_vllm_docker_image
 build_dataprep_agent_images

-### for local test
-# build_agent_image_local
+# ## for local test
+# # build_agent_image_local
+
 echo "=================== #1 Building docker images completed===================="

 echo "=================== #2 Start vllm endpoint===================="

From 6dbe6358bb99e2ee1a93b797e8700807d1f5666a Mon Sep 17 00:00:00 2001
From: Mustafa
Date: Wed, 28 May 2025 22:07:37 +0000
Subject: [PATCH 06/13] updates for readme and env

Signed-off-by: Mustafa
---
 FinanceAgent/README.md                       | 17 +++++------
 .../docker_compose/intel/hpu/gaudi/README.md | 29 +++++++++++++------
 .../intel/hpu/gaudi/compose.yaml             |  2 +-
 FinanceAgent/docker_compose/intel/set_env.sh | 23 +++++----------
 4 files changed, 35 insertions(+), 36 deletions(-)

diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md
index 26400c8d39..ed28b1df65 100644
--- a/FinanceAgent/README.md
+++ b/FinanceAgent/README.md
@@ -6,25 +6,22 @@
 - [Problem Motivation](#problem-motivation)
 - [Architecture](#architecture)
   - [High-Level Diagram](#high-level-diagram)
-  - [OPEA Microservices Diagram](#opea-microservices-diagram)
+  - [OPEA Microservices Diagram for Data Handling](#opea-microservices-diagram-for-data-handling)
 - [Deployment Options](#deployment-options)
 - [Contribution](#contribution)

-
 ## Overview

-The Finance Agent example showcases a hierarchical multi-agent system designed to assist users with financial document processing and analysis. It provides three main functionalities: summarizing lengthy financial documents, answering queries related to financial documents, and conducting research to generate investment reports on public companies.
+The Finance Agent exemplifies a hierarchical multi-agent system designed to streamline financial document processing and analysis for users. It offers three core functionalities: summarizing lengthy financial documents, answering queries related to these documents, and conducting research to generate investment reports on public companies.

-Users interact with the system via a graphical user interface (UI), where requests are managed by a supervisor agent that delegates tasks to worker agents or the summarization microservice.
The system supports document uploads through the UI for processing. +Navigating and analyzing extensive financial documents can be both challenging and time-consuming. Users often need concise summaries, answers to specific queries, or comprehensive investment reports. The Finance Agent effectively addresses these needs by automating document summarization, query answering, and research tasks, thereby enhancing productivity and decision-making efficiency. +Users interact with the system through a graphical user interface (UI), where a supervisor agent manages requests by delegating tasks to worker agents or the summarization microservice. The system also supports document uploads via the UI for processing. -## Problem Motivation -Navigating and analyzing extensive financial documents can be challenging and time-consuming. Users often require concise summaries, answers to specific queries, or comprehensive investment reports. The Finance Agent addresses these needs by automating document summarization, query answering, and research tasks, thereby enhancing productivity and decision-making efficiency. ## Architecture ### High-Level Diagram -The Finance Agent system is structured as a hierarchical multi-agent architecture. User interactions are managed by a supervisor agent, which coordinates tasks among worker agents and the summarization microservice. The system supports document uploads and processing through the UI. The architecture of this Finance Agent example is shown in the figure below. The agent is a hierarchical multi-agent system and has 3 main functions: @@ -57,15 +54,15 @@ The Question Answering worker agent uses `search_knowledge_base` tool to get rel ## Deployment Options -This CodeGen example can be deployed manually on various hardware platforms using Docker Compose or Kubernetes. Select the appropriate guide based on your target environment: +This Finance Agent example can be deployed manually on Docker Compose. | Hardware | Deployment Mode | Guide Link | | :-------------- | :------------------- | :----------------------------------------------------------------------- | -| Intel Gaudi HPU | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md) | +| Intel® Gaudi® AI Accelerator | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md) | _Note: Building custom microservice images can be done using the resources in [GenAIComps](https://github.com/opea-project/GenAIComps)._ ## Contribution -We welcome contributions to the OPEA project. Please refer to the contribution guidelines for more information. +We welcome contributions to the OPEA project. Please refer to the [contribution guidelines](https://github.com/opea-project/docs/blob/main/community/CONTRIBUTING.md) for more information. diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md index c005b2237b..a7726d0e70 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md @@ -1,5 +1,5 @@ -# Deploy Finance Agent on Intel Gaudi HPU with Docker Compose -This README provides instructions for deploying the Finance Agent application using Docker Compose on systems equipped with Intel Gaudi HPUs. 
+# Deploy Finance Agent on Intel® Gaudi® AI Accelerator with Docker Compose
+This README provides instructions for deploying the Finance Agent application using Docker Compose on systems equipped with Intel® Gaudi® AI Accelerators.

 ## Table of Contents
@@ -11,11 +11,11 @@ This README provides instructions for deploying the Finance Agent application us

 ## Overview

-This guide focuses on running the pre-configured Finance Agent service using Docker Compose on Intel Gaudi HPUs. It leverages containers optimized for Gaudi for the LLM serving component, along with CPU-based containers for other microservices like embedding, retrieval, data preparation and the UI.
+This guide focuses on running the pre-configured Finance Agent service using Docker Compose on Intel® Gaudi® AI Accelerators. It leverages containers optimized for Gaudi for the LLM serving component, along with CPU-based containers for other microservices like embedding, retrieval, data preparation and the UI.

 ## Prerequisites
 - Docker and Docker Compose installed.
-- Intel Gaudi HPU(s) with the necessary drivers and software stack installed on the host system. (Refer to Intel Gaudi Documentation).
+- Intel® Gaudi® AI Accelerator(s) with the necessary drivers and software stack installed on the host system. (Refer to Intel Gaudi Documentation).
 - Git installed (for cloning repository).
 - Hugging Face Hub API Token (for downloading models).
 - Access to the internet (or a private model cache).
@@ -48,10 +48,11 @@ Set required environment variables in your shell:
   # Optional: Configure HOST_IP if needed
   # Replace with your host's external IP address (do not use localhost or 127.0.0.1).
   # export HOST_IP=$(hostname -I | awk '{print $1}')
+
   # Optional: Configure proxy if needed
-  # export http_proxy="your_http_proxy"
-  # export https_proxy="your_https_proxy"
-  # export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+  # export HTTP_PROXY="${http_proxy}"
+  # export HTTPS_PROXY="${https_proxy}"
+  # export NO_PROXY="${NO_PROXY},${HOST_IP}"

   source ../../set_env.sh
 ```
@@ -74,12 +75,12 @@ Below is the command to launch services

 ```shell
-  docker compose up -d
+  docker compose -f compose.yaml up -d
 ```

 #### [Optional] Build docker images

-Only needed when docker pull failed.
+This is only needed if the Docker image is unavailable or the pull operation fails.

 ```bash
 cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build
@@ -105,9 +106,19 @@ If deploy on Gaudi, also need to build vllm image.
 ## Validate Services
 Wait several minutes for models to download and services to initialize (Gaudi initialization can take time). Check container logs (docker compose logs -f <service_name>, especially vllm-gaudi-server).
+
 ```bash
 docker logs --tail 2000 -f vllm-gaudi-server
 ```
+> Expected output of the `vllm-gaudi-server` service is
+```
+ INFO: Started server process [1]
+ INFO: Waiting for application startup.
+ INFO: Application startup complete.
+ INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit) + INFO: : - "GET /health HTTP/1.1" 200 OK + +``` ### Validate Data Services Ingest data and retrieval from database diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index 8f77a5ab49..1ecaecf77c 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -17,7 +17,7 @@ x-common-agent-environment: REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} REDIS_URL_KV: ${REDIS_URL_KV} TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} - ip_address: ${IP_ADDRESS} + ip_address: ${HOST_IP} strategy: react_llama require_human_feedback: false diff --git a/FinanceAgent/docker_compose/intel/set_env.sh b/FinanceAgent/docker_compose/intel/set_env.sh index 9576cc5547..571efec0c2 100644 --- a/FinanceAgent/docker_compose/intel/set_env.sh +++ b/FinanceAgent/docker_compose/intel/set_env.sh @@ -18,25 +18,16 @@ check_var() { fi } -# Set IP address -export IP_ADDRESS=$(hostname -I | awk '{print $1}') -export HOST_IP="${IP_ADDRESS}" - # Check critical variables check_var "HF_TOKEN" check_var "HOST_IP" -# Proxy settings -export NO_PROXY="${NO_PROXY},${HOST_IP}" -export HTTP_PROXY="${http_proxy}" -export HTTPS_PROXY="${https_proxy}" - # VLLM configuration export VLLM_PORT="${VLLM_PORT:-8086}" export VLLM_VOLUME="${VLLM_VOLUME:-/data2/huggingface}" export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" -export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" +export LLM_ENDPOINT="http://${HOST_IP}:${VLLM_PORT}" export MAX_LEN="${MAX_LEN:-16384}" export NUM_CARDS="${NUM_CARDS:-4}" export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" @@ -44,11 +35,11 @@ export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" # Data preparation and embedding configuration export DATAPREP_PORT="${DATAPREP_PORT:-6007}" export TEI_EMBEDDER_PORT="${TEI_EMBEDDER_PORT:-10221}" -export REDIS_URL_VECTOR="redis://${IP_ADDRESS}:6379" -export REDIS_URL_KV="redis://${IP_ADDRESS}:6380" +export REDIS_URL_VECTOR="redis://${HOST_IP}:6379" +export REDIS_URL_KV="redis://${HOST_IP}:6380" export DATAPREP_COMPONENT_NAME="${DATAPREP_COMPONENT_NAME:-OPEA_DATAPREP_REDIS_FINANCE}" export EMBEDDING_MODEL_ID="${EMBEDDING_MODEL_ID:-BAAI/bge-base-en-v1.5}" -export TEI_EMBEDDING_ENDPOINT="http://${IP_ADDRESS}:${TEI_EMBEDDER_PORT}" +export TEI_EMBEDDING_ENDPOINT="http://${HOST_IP}:${TEI_EMBEDDER_PORT}" # Hugging Face API token export HUGGINGFACEHUB_API_TOKEN="${HF_TOKEN}" @@ -64,12 +55,12 @@ export MAX_INPUT_TOKENS="${MAX_INPUT_TOKENS:-2048}" export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS:-4096}" # Worker URLs -export WORKER_FINQA_AGENT_URL="http://${IP_ADDRESS}:9095/v1/chat/completions" -export WORKER_RESEARCH_AGENT_URL="http://${IP_ADDRESS}:9096/v1/chat/completions" +export WORKER_FINQA_AGENT_URL="http://${HOST_IP}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${HOST_IP}:9096/v1/chat/completions" # DocSum configuration export DOCSUM_COMPONENT_NAME="${DOCSUM_COMPONENT_NAME:-"OpeaDocSumvLLM"}" -export DOCSUM_ENDPOINT="http://${IP_ADDRESS}:9000/v1/docsum" +export DOCSUM_ENDPOINT="http://${HOST_IP}:9000/v1/docsum" # API keys check_var "FINNHUB_API_KEY" From 99a315bdd8da16bab921ca14329e41152432ebcf Mon Sep 17 00:00:00 2001 From: Mustafa Date: Fri, 30 May 2025 16:44:17 +0000 Subject: [PATCH 07/13] update the readme file Signed-off-by: Mustafa --- 
 .../docker_compose/intel/hpu/gaudi/README.md  | 103 +++++++++---------
 1 file changed, 52 insertions(+), 51 deletions(-)

diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md
index a7726d0e70..592185475b 100644
--- a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md
+++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md
@@ -19,15 +19,17 @@ This guide focuses on running the pre-configured Finance Agent service using Doc
 - Git installed (for cloning repository).
 - Hugging Face Hub API Token (for downloading models).
 - Access to the internet (or a private model cache).
+- Finnhub API Key. Go to https://finnhub.io/ to get your free API key.
+- Financial Datasets API Key. Go to https://docs.financialdatasets.ai/ to get your free API key.

 Clone the GenAIExamples repository:

 ```shell
-  mkdir /path/to/your/workspace/
-  export WORKDIR=/path/to/your/workspace/
-  cd $WORKDIR
-  git clone https://github.com/opea-project/GenAIExamples.git
-  cd GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi
+mkdir /path/to/your/workspace/
+export WORKDIR=/path/to/your/workspace/
+cd $WORKDIR
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi
 ```

 ## Start Deployment
@@ -36,25 +38,23 @@ This uses the default vLLM-based deployment profile (vllm-gaudi-server).
 Set required environment variables in your shell:

 ```shell
-  # Replace with your Hugging Face Hub API token
-  export HF_TOKEN="your_huggingface_token"
-  # Path to your model cache
-  export HF_CACHE_DIR="./data"
-  # Go to https://finnhub.io/ to get your free api key
-  export FINNHUB_API_KEY=
-  # Go to https://docs.financialdatasets.ai/ to get your free api key
-  export FINANCIAL_DATASETS_API_KEY=
-
-  # Optional: Configure HOST_IP if needed
-  # Replace with your host's external IP address (do not use localhost or 127.0.0.1).
-  # export HOST_IP=$(hostname -I | awk '{print $1}')
-
-  # Optional: Configure proxy if needed
-  # export HTTP_PROXY="${http_proxy}"
-  # export HTTPS_PROXY="${https_proxy}"
-  # export NO_PROXY="${NO_PROXY},${HOST_IP}"
-
-  source ../../set_env.sh
+# Path to your model cache
+export HF_CACHE_DIR="./data"
+# Some models from Hugging Face require approval beforehand. Ensure you have the necessary permissions to access them.
+export HF_TOKEN="your_huggingface_token"
+export FINNHUB_API_KEY="your-finnhub-api-key"
+export FINANCIAL_DATASETS_API_KEY="your-financial-datasets-api-key"
+
+# Optional: Configure HOST_IP if needed
+# Replace with your host's external IP address (do not use localhost or 127.0.0.1).
+# export HOST_IP=$(hostname -I | awk '{print $1}')
+
+# Optional: Configure proxy if needed
+# export HTTP_PROXY="${http_proxy}"
+# export HTTPS_PROXY="${https_proxy}"
+# export NO_PROXY="${NO_PROXY},${HOST_IP}"
+
+source ../../set_env.sh
 ```

 Note: The compose file might read additional variables from set_env.sh. Ensure all required variables like ports (LLM_SERVICE_PORT, TEI_EMBEDDER_PORT, etc.) are set if not using defaults from the compose file. For instance, edit the set_env.sh to change the LLM model:

@@ -75,7 +75,7 @@ Below is the command to launch services

 ```shell
-  docker compose -f compose.yaml up -d
+docker compose -f compose.yaml up -d
 ```

 #### [Optional] Build docker images

 This is only needed if the Docker image is unavailable or the pull operation fails.
 ```bash
-  cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build
-  # get GenAIComps repo
-  git clone https://github.com/opea-project/GenAIComps.git
-  # build the images
-  docker compose -f build.yaml build --no-cache
+cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build
+# get GenAIComps repo
+git clone https://github.com/opea-project/GenAIComps.git
+# build the images
+docker compose -f build.yaml build --no-cache
 ```

 If deploy on Gaudi, also need to build vllm image.

 ```bash
-  cd $WORKDIR
-  git clone https://github.com/HabanaAI/vllm-fork.git
-  # get the latest release tag of vllm gaudi
-  cd vllm-fork
-  VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)")
-  echo "Check out vLLM tag ${VLLM_VER}"
-  git checkout ${VLLM_VER}
-  docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
+cd $WORKDIR
+git clone https://github.com/HabanaAI/vllm-fork.git
+# get the latest release tag of vllm gaudi
+cd vllm-fork
+VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)")
+echo "Check out vLLM tag ${VLLM_VER}"
+git checkout ${VLLM_VER}
+docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
 ```

@@ -108,9 +108,10 @@ If deploy on Gaudi, also need to build vllm image.
 Wait several minutes for models to download and services to initialize (Gaudi initialization can take time). Check container logs (docker compose logs -f <service_name>, especially vllm-gaudi-server).

 ```bash
-  docker logs --tail 2000 -f vllm-gaudi-server
+docker logs --tail 2000 -f vllm-gaudi-server
 ```
-> Expected output of the `vllm-gaudi-server` service is
+
+> Below is the expected output of the `vllm-gaudi-server` service.
 ```
@@ -124,8 +125,8 @@ Wait several minutes for models to download and services to initialize (Gaudi in
 ### Validate Data Services
 Ingest data and retrieval from database

 ```bash
-  python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest
-  python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get
+python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest
+python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get
 ```

 ### Validate Agents

 FinQA Agent:

 ```bash
-  export agent_port="9095"
-  prompt="What is Gap's revenue in 2024?"
-  python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port
+export agent_port="9095"
+prompt="What is Gap's revenue in 2024?"
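+# test.py (next line) sends this prompt to the FinQA worker agent on $agent_port;
+# port 9095 corresponds to WORKER_FINQA_AGENT_URL in set_env.sh (an assumption worth
+# checking against your local configuration).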
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port ``` Research Agent: ```bash - export agent_port="9096" - prompt="generate NVDA financial research report" - python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" +export agent_port="9096" +prompt="generate NVDA financial research report" +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" ``` Supervisor Agent single turns: ```bash - export agent_port="9090" - python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream +export agent_port="9090" +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream ``` Supervisor Agent multi turn: ```bash - python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream ``` ## Accessing the User Interface (UI) From a13f84e4d03c6d1738ddbc24600d3f6ad70cc253 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Fri, 30 May 2025 16:44:57 +0000 Subject: [PATCH 08/13] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- FinanceAgent/README.md | 17 ++++---- .../docker_compose/intel/hpu/gaudi/README.md | 39 ++++++++++++------- .../intel/hpu/gaudi/compose.yaml | 4 +- FinanceAgent/docker_compose/intel/set_env.sh | 5 +-- FinanceAgent/tests/test_compose_on_gaudi.sh | 2 +- 5 files changed, 36 insertions(+), 31 deletions(-) diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md index ed28b1df65..640f7113d0 100644 --- a/FinanceAgent/README.md +++ b/FinanceAgent/README.md @@ -1,4 +1,4 @@ -# Finance Agent Example +# Finance Agent Example ## Table of Contents @@ -10,7 +10,6 @@ - [Deployment Options](#deployment-options) - [Contribution](#contribution) - ## Overview The Finance Agent exemplifies a hierarchical multi-agent system designed to streamline financial document processing and analysis for users. It offers three core functionalities: summarizing lengthy financial documents, answering queries related to these documents, and conducting research to generate investment reports on public companies. @@ -19,8 +18,8 @@ Navigating and analyzing extensive financial documents can be both challenging a Users interact with the system through a graphical user interface (UI), where a supervisor agent manages requests by delegating tasks to worker agents or the summarization microservice. The system also supports document uploads via the UI for processing. - ## Architecture + ### High-Level Diagram The architecture of this Finance Agent example is shown in the figure below. The agent is a hierarchical multi-agent system and has 3 main functions: @@ -33,7 +32,8 @@ The user interacts with the supervisor agent through the graphical UI. 
The super ![Finance Agent Architecture](assets/finance_agent_arch.png) -### OPEA Microservices Diagram for Data Handling +### OPEA Microservices Diagram for Data Handling + The architectural diagram of the `dataprep` microservice is shown below. We use [docling](https://github.com/docling-project/docling) to extract text from PDFs and URLs into markdown format. Both the full document content and tables are extracted. We then use an LLM to extract metadata from the document, including the company name, year, quarter, document type, and document title. The full document markdown then gets chunked, and LLM is used to summarize each chunk, and the summaries are embedded and saved to a vector database. Each table is also summarized by LLM and the summaries are embedded and saved to the vector database. The chunks and tables are also saved into a KV store. The pipeline is designed as such to improve retrieval accuracy of the `search_knowledge_base` tool used by the Question Answering worker agent. ![dataprep architecture](assets/fin_agent_dataprep.png) @@ -52,17 +52,16 @@ The Question Answering worker agent uses `search_knowledge_base` tool to get rel ![finqa search tool arch](assets/finqa_tool.png) - ## Deployment Options + This Finance Agent example can be deployed manually on Docker Compose. -| Hardware | Deployment Mode | Guide Link | -| :-------------- | :------------------- | :----------------------------------------------------------------------- | +| Hardware | Deployment Mode | Guide Link | +| :----------------------------- | :------------------- | :----------------------------------------------------------------------- | | Intel® Gaudi® AI Accelerator | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md) | _Note: Building custom microservice images can be done using the resources in [GenAIComps](https://github.com/opea-project/GenAIComps)._ - ## Contribution -We welcome contributions to the OPEA project. Please refer to the [contribution guidelines](https://github.com/opea-project/docs/blob/main/community/CONTRIBUTING.md) for more information. +We welcome contributions to the OPEA project. Please refer to the [contribution guidelines](https://github.com/opea-project/docs/blob/main/community/CONTRIBUTING.md) for more information. diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md index 592185475b..79f0a9dec9 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md @@ -1,4 +1,5 @@ # Deploy Finance Agent on Intel® Gaudi® AI Accelerator with Docker Compose + This README provides instructions for deploying the Finance Agent application using Docker Compose on systems equipped with Intel® Gaudi® AI Accelerators. ## Table of Contents @@ -14,6 +15,7 @@ This README provides instructions for deploying the Finance Agent application us This guide focuses on running the pre-configured Finance Agent service using Docker Compose on Intel® Gaudi® AI Accelerators. It leverages containers optimized for Gaudi for the LLM serving component, along with CPU-based containers for other microservices like embedding, retrieval, data preparation and the UI. ## Prerequisites + - Docker and Docker Compose installed. - Intel® Gaudi® AI Accelerator(s) with the necessary drivers and software stack installed on the host system. (Refer to Intel Gaudi Documentation). - Git installed (for cloning repository). 
@@ -33,8 +35,11 @@ cd GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi
 ```

 ## Start Deployment
+
 This uses the default vLLM-based deployment profile (vllm-gaudi-server).
+
 ### Configure Environment
+
 Set required environment variables in your shell:

 ```shell
@@ -42,11 +47,11 @@ Set required environment variables in your shell:
 export HF_CACHE_DIR="./data"
 # Some models from Hugging Face require approval beforehand. Ensure you have the necessary permissions to access them.
 export HF_TOKEN="your_huggingface_token"
-export FINNHUB_API_KEY="your-finnhub-api-key"
-export FINANCIAL_DATASETS_API_KEY="your-financial-datasets-api-key"
+export FINNHUB_API_KEY="your-finnhub-api-key"
+export FINANCIAL_DATASETS_API_KEY="your-financial-datasets-api-key"

 # Optional: Configure HOST_IP if needed
-# Replace with your host's external IP address (do not use localhost or 127.0.0.1).
+# Replace with your host's external IP address (do not use localhost or 127.0.0.1).
 # export HOST_IP=$(hostname -I | awk '{print $1}')

 # Optional: Configure proxy if needed
@@ -60,19 +65,21 @@ source ../../set_env.sh
 Note: The compose file might read additional variables from set_env.sh. Ensure all required variables like ports (LLM_SERVICE_PORT, TEI_EMBEDDER_PORT, etc.) are set if not using defaults from the compose file. For instance, edit the set_env.sh to change the LLM model:

 ### Start Services
+
 #### Deploy with Docker Compose
+
 Below is the command to launch services

-  - vllm-gaudi-server
-  - tei-embedding-serving
-  - redis-vector-db
-  - redis-kv-store
-  - dataprep-redis-server-finance
-  - finqa-agent-endpoint
-  - research-agent-endpoint
-  - docsum-vllm-gaudi
-  - supervisor-agent-endpoint
-  - agent-ui
+- vllm-gaudi-server
+- tei-embedding-serving
+- redis-vector-db
+- redis-kv-store
+- dataprep-redis-server-finance
+- finqa-agent-endpoint
+- research-agent-endpoint
+- docsum-vllm-gaudi
+- supervisor-agent-endpoint
+- agent-ui

 ```shell
 docker compose -f compose.yaml up -d
@@ -103,15 +110,16 @@ git checkout ${VLLM_VER}
 docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
 ```

-## Validate Services
+## Validate Services
+
 Wait several minutes for models to download and services to initialize (Gaudi initialization can take time). Check container logs (docker compose logs -f <service_name>, especially vllm-gaudi-server).

 ```bash
 docker logs --tail 2000 -f vllm-gaudi-server
-```
+```

 > Below is the expected output of the `vllm-gaudi-server` service.
+
 ```
 INFO: Started server process [1]
 INFO: Waiting for application startup.
@@ -122,6 +130,7 @@ docker logs --tail 2000 -f vllm-gaudi-server ``` ### Validate Data Services + Ingest data and retrieval from database ```bash diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index 1ecaecf77c..e788c5899a 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -118,7 +118,7 @@ services: REDIS_URL_KV: ${REDIS_URL_KV} TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} LLM_ENDPOINT: ${LLM_ENDPOINT} - LLM_MODEL: ${LLM_MODEL_ID} + LLM_MODEL: ${LLM_MODEL_ID} HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} HF_TOKEN: ${HF_TOKEN} LOGFLAG: true @@ -215,5 +215,3 @@ services: ports: - "5175:8080" ipc: host - - diff --git a/FinanceAgent/docker_compose/intel/set_env.sh b/FinanceAgent/docker_compose/intel/set_env.sh index 571efec0c2..16893f3ab5 100644 --- a/FinanceAgent/docker_compose/intel/set_env.sh +++ b/FinanceAgent/docker_compose/intel/set_env.sh @@ -23,7 +23,7 @@ check_var "HF_TOKEN" check_var "HOST_IP" # VLLM configuration -export VLLM_PORT="${VLLM_PORT:-8086}" +export VLLM_PORT="${VLLM_PORT:-8086}" export VLLM_VOLUME="${VLLM_VOLUME:-/data2/huggingface}" export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" @@ -70,7 +70,7 @@ export FINANCIAL_DATASETS_API_KEY="${FINANCIAL_DATASETS_API_KEY}" # Toolset and prompt paths -if check_var "WORKDIR"; then +if check_var "WORKDIR"; then export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ @@ -87,4 +87,3 @@ if check_var "WORKDIR"; then fi done fi - diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index cd7ed1a84a..2320767de2 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -17,7 +17,7 @@ export HTTPS_PROXY="${https_proxy}" # VLLM configuration MODEL=meta-llama/Llama-3.3-70B-Instruct -export VLLM_PORT="${VLLM_PORT:-8086}" +export VLLM_PORT="${VLLM_PORT:-8086}" # export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" export HF_CACHE_DIR=${model_cache:-"./data2/huggingface"} From 4b91d18a0f6311f18b5cfb30e42077a776d3fbc9 Mon Sep 17 00:00:00 2001 From: Mustafa Date: Sun, 1 Jun 2025 07:29:55 +0000 Subject: [PATCH 09/13] update test file Signed-off-by: Mustafa --- FinanceAgent/tests/test_compose_on_gaudi.sh | 66 +++++++-------------- 1 file changed, 21 insertions(+), 45 deletions(-) diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 2320767de2..efe719bfb4 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -103,14 +103,16 @@ function build_vllm_docker_image() { fi } -function start_vllm_service_70B() { - echo "token is ${HF_TOKEN}" - echo "start vllm gaudi service" - echo "**************MODEL is $LLM_MODEL_ID**************" - docker run -d --runtime=habana --rm --name "vllm-gaudi-server" -e HABANA_VISIBLE_DEVICES=all -p $VLLM_PORT:8000 -v $VLLM_VOLUME:/data -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host $VLLM_IMAGE --model ${MODEL} --max-seq-len-to-capture 16384 --tensor-parallel-size 4 - 
sleep 10s - echo "Waiting vllm gaudi ready" - n=0 +function stop_llm(){ + cid=$(docker ps -aq --filter "name=vllm-gaudi-server") + echo "Stopping container $cid" + if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi + +} + +function start_all_services(){ + docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d + until [[ "$n" -ge 200 ]] || [[ $ready == true ]]; do docker logs vllm-gaudi-server &> ${LOG_PATH}/vllm-gaudi-service.log n=$((n+1)) @@ -127,28 +129,6 @@ function start_vllm_service_70B() { echo "Service started successfully" } -function stop_llm(){ - cid=$(docker ps -aq --filter "name=vllm-gaudi-server") - echo "Stopping container $cid" - if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi - -} - -function start_dataprep_and_agent(){ - docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d \ - tei-embedding-serving \ - redis-vector-db \ - redis-kv-store \ - dataprep-redis-finance \ - worker-finqa-agent \ - worker-research-agent \ - docsum-vllm-gaudi \ - supervisor-react-agent \ - agent-ui - - sleep 1m -} - function validate() { local CONTENT="$1" local EXPECTED_RESULT="$2" @@ -196,7 +176,7 @@ function stop_dataprep() { } function validate_agent_service() { - # # test worker finqa agent + # test worker finqa agent echo "======================Testing worker finqa agent======================" export agent_port="9095" prompt="What is Gap's revenue in 2024?" @@ -210,7 +190,7 @@ function validate_agent_service() { exit 1 fi - # # test worker research agent + # test worker research agent echo "======================Testing worker research agent======================" export agent_port="9096" prompt="Johnson & Johnson" @@ -277,27 +257,23 @@ build_dataprep_agent_images echo "=================== #1 Building docker images completed====================" -echo "=================== #2 Start vllm endpoint====================" -start_vllm_service_70B -echo "=================== #2 vllm endpoint started====================" - -echo "=================== #3 Start data and agent services ====================" -start_dataprep_and_agent -echo "=================== #3 data and agent endpoint started====================" +echo "=================== #2 Start services ====================" +start_all_services +echo "=================== #2 Endpoints for services started====================" -echo "=================== #4 Validate ingest_validate_dataprep ====================" +echo "=================== #3 Validate ingest_validate_dataprep ====================" ingest_validate_dataprep -echo "=================== #4 Data ingestion and validation completed====================" +echo "=================== #3 Data ingestion and validation completed====================" -echo "=================== #5 Start agents ====================" +echo "=================== #4 Start agents ====================" validate_agent_service -echo "=================== #5 Agent test passed ====================" +echo "=================== #4 Agent test passed ====================" -echo "=================== #6 Stop microservices ====================" +echo "=================== #5 Stop microservices ====================" stop_agent_docker stop_dataprep stop_llm -echo "=================== #6 Microservices stopped====================" +echo "=================== #5 Microservices stopped====================" echo y | docker system prune From 6a5e7adadaa228fb45a70aba883dc00d2cfa998a Mon Sep 17 00:00:00 2001 From: Mustafa Date: Mon, 2 Jun 2025 18:41:59 +0000 Subject: [PATCH 10/13] 
update test file Signed-off-by: Mustafa --- FinanceAgent/tests/test_compose_on_gaudi.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index efe719bfb4..56e149f43a 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -80,7 +80,7 @@ function build_dataprep_agent_images() { function build_agent_image_local(){ cd $WORKDIR/GenAIComps/ - docker build -t opea/agent:latest -f comps/agent/src/Dockerfile . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy + docker build -t opea/agent:latest -f comps/agent/src/Dockerfile . --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY } function build_vllm_docker_image() { @@ -94,7 +94,7 @@ function build_vllm_docker_image() { VLLM_FORK_VER=v0.6.6.post1+Gaudi-1.20.0 git checkout ${VLLM_FORK_VER} &> /dev/null - docker build --no-cache -f Dockerfile.hpu -t $VLLM_IMAGE --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy + docker build --no-cache -f Dockerfile.hpu -t $VLLM_IMAGE --shm-size=128g . --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY if [ $? -ne 0 ]; then echo "$VLLM_IMAGE failed" exit 1 From 791e2c557db21ca43fb475a3a2d12d8f8aae6a97 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Mon, 2 Jun 2025 18:42:44 +0000 Subject: [PATCH 11/13] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- FinanceAgent/tests/test_compose_on_gaudi.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 56e149f43a..802712d210 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -111,7 +111,7 @@ function stop_llm(){ } function start_all_services(){ - docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d + docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d until [[ "$n" -ge 200 ]] || [[ $ready == true ]]; do docker logs vllm-gaudi-server &> ${LOG_PATH}/vllm-gaudi-service.log From ea0f0b652ce313a080c90f3fa9af8ae0682a70b0 Mon Sep 17 00:00:00 2001 From: Mustafa Date: Mon, 2 Jun 2025 19:52:10 +0000 Subject: [PATCH 12/13] update test file Signed-off-by: Mustafa --- FinanceAgent/tests/test_compose_on_gaudi.sh | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 802712d210..3fe2860600 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -15,6 +15,10 @@ export NO_PROXY="${NO_PROXY},${HOST_IP}" export HTTP_PROXY="${http_proxy}" export HTTPS_PROXY="${https_proxy}" +export no_proxy="${no_proxy},${HOST_IP}" +export http_proxy="${http_proxy}" +export https_proxy="${https_proxy}" + # VLLM configuration MODEL=meta-llama/Llama-3.3-70B-Instruct export VLLM_PORT="${VLLM_PORT:-8086}" @@ -227,7 +231,6 @@ function validate_agent_service() { docker logs supervisor-agent-endpoint exit 1 fi - } function stop_agent_docker() { @@ -254,7 +257,6 @@ build_dataprep_agent_images # ## for local test # # build_agent_image_local - echo "=================== #1 Building docker images completed====================" echo "=================== #2 Start 
services ====================" From e5a59e5f4f7060af8dda3149162cce43cef4dd69 Mon Sep 17 00:00:00 2001 From: Mustafa Date: Mon, 2 Jun 2025 21:14:36 +0000 Subject: [PATCH 13/13] update test file Signed-off-by: Mustafa --- FinanceAgent/tests/test_compose_on_gaudi.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 3fe2860600..d534ffa122 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -252,7 +252,7 @@ stop_dataprep cd $WORKPATH/tests echo "=================== #1 Building docker images====================" -build_vllm_docker_image +# build_vllm_docker_image build_dataprep_agent_images # ## for local test
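
For local validation of this series, the readiness wait in `start_all_services` can also be expressed as a direct probe of the vLLM health route rather than grepping container logs. Below is a minimal sketch, assuming the `set_env.sh` defaults (`VLLM_PORT=8086`) and the `/health` route shown in the expected `vllm-gaudi-server` output earlier; the `HOST_IP` fallback to localhost and the 200-attempt retry budget are illustrative choices mirroring the test script, not part of the patches.

```bash
#!/usr/bin/env bash
# Sketch: poll the vLLM endpoint until it reports healthy (assumes set_env.sh defaults).
HOST_IP="${HOST_IP:-localhost}"
VLLM_PORT="${VLLM_PORT:-8086}"

n=0
until [[ "$n" -ge 200 ]]; do
    # The server answers GET /health with 200 OK once startup completes.
    if curl -sf "http://${HOST_IP}:${VLLM_PORT}/health" > /dev/null; then
        echo "vLLM endpoint is ready"
        exit 0
    fi
    n=$((n+1))
    sleep 5s
done
echo "vLLM endpoint did not become ready in time" >&2
exit 1
```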