[ENH]: Load HNSW index without disk intermediary #5159
Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving:
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
Enable In-Memory Loading of HNSW Index from S3 (No Disk Intermediary)

This PR refactors the HNSW index loading pipeline to support direct in-memory loading of index data from S3, rather than writing and reading intermediate files to/from disk. It updates interface contracts, adds a new code path for loading buffers directly, introduces relevant changes throughout the index provider and index types, and bumps the hnswlib dependency to a specific commit supporting this feature.

Key Changes
- Added `PersistentIndex::load_from_hnsw_data` API (with trait and implementations) to allow direct in-memory index loading.

Affected Areas
- rust/index/src/hnsw_provider.rs

This summary was automatically generated by @propel-code-bot
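The summary above names a new `PersistentIndex::load_from_hnsw_data` entry point. As a rough sketch of what such a contract could look like — note that `HnswData`, `IndexConfig`, and `HnswIndex` below are simplified stand-ins, and the real hnswlib types and signature may differ:

```rust
// Stand-in for the raw index buffers hnswlib exposes after an S3 fetch.
struct HnswData {
    header_buffer: Vec<u8>,
    data_level0_buffer: Vec<u8>,
    length_buffer: Vec<u8>,
    link_list_buffer: Vec<u8>,
}

struct IndexConfig {
    dimensionality: usize,
}

struct HnswIndex {
    dimensionality: usize,
    ef_search: usize,
}

trait PersistentIndex: Sized {
    /// Build an index directly from in-memory buffers, skipping disk I/O.
    fn load_from_hnsw_data(
        data: HnswData,
        config: &IndexConfig,
        ef_search: usize,
    ) -> Result<Self, String>;
}

impl PersistentIndex for HnswIndex {
    fn load_from_hnsw_data(
        data: HnswData,
        config: &IndexConfig,
        ef_search: usize,
    ) -> Result<Self, String> {
        if data.header_buffer.is_empty() {
            return Err("empty header buffer".to_string());
        }
        // A real implementation would deserialize all four buffers here.
        Ok(HnswIndex {
            dimensionality: config.dimensionality,
            ef_search,
        })
    }
}
```

The key design point is that the loader consumes buffers rather than a filesystem path, which is what makes the disk intermediary removable.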
8 Jobs Failed:
- PR checks / Lint
- PR checks / Rust tests / test-benches (blacksmith-16vcpu-ubuntu-2204, --bench get)
- PR checks / Python tests / test-cluster-rust-frontend (3.9, chromadb/test/property/test_embeddings.py)
- PR checks / Rust tests / Integration test ci_k8s_integration 2
- PR checks / Rust tests / test (blacksmith-8vcpu-ubuntu-2204)
- PR checks / Rust tests / test-long
- PR checks / all-required-pr-checks-passed

1 job failed running on non-Blacksmith runners. Summary: 1 successful workflow, 1 failed workflow.
Last updated: 2025-07-29 15:10:29 UTC
```rust
None => match HnswIndex::load_from_hnsw_data(
    self.fetch_hnsw_segment(&new_id, prefix_path)
        .await
        .map_err(|e| Box::new(HnswIndexProviderForkError::FileError(*e)))?,
    &index_config,
    ef_search,
    new_id,
) {
```
[CriticalError]

The logic for loading the index within the `fork` method appears to be incorrect. It attempts to fetch the segment from remote storage using `new_id`, but the index for `new_id` doesn't exist in storage yet; the files for `source_id` have just been copied to a local directory. The previous implementation using `HnswIndex::load(storage_path_str, ...)` correctly loaded the index from this new local directory. Since `fork` is intended to create a mutable, file-backed copy of an index, it seems the original approach of loading from the local path should be restored.
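The reviewer's point can be illustrated with a minimal, self-contained sketch: `fork` first materializes the source index's files into a fresh local directory for the new copy, so the subsequent load must target that local path, not remote storage. `copy_index_files` and `load_from_local_path` below are hypothetical stand-ins, not the repository's actual helpers:

```rust
use std::fs;
use std::path::Path;

// Step 1 of a fork: copy every file of the source index into a fresh
// local directory that will back the new (mutable) index copy.
fn copy_index_files(source_dir: &Path, target_dir: &Path) -> std::io::Result<()> {
    fs::create_dir_all(target_dir)?;
    for entry in fs::read_dir(source_dir)? {
        let entry = entry?;
        fs::copy(entry.path(), target_dir.join(entry.file_name()))?;
    }
    Ok(())
}

// Step 2: load from that local directory (stand-in for
// `HnswIndex::load(storage_path_str, ...)`). Here we just verify the
// expected component files are all present and return how many we found.
fn load_from_local_path(dir: &Path, files: &[&str]) -> Result<usize, String> {
    let mut found = 0;
    for f in files {
        if dir.join(f).exists() {
            found += 1;
        } else {
            return Err(format!("missing {f}"));
        }
    }
    Ok(found)
}
```

Fetching by the new ID from remote storage at step 2 would fail (or race), because nothing has been uploaded under that ID yet.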
```rust
let hnsw_data = self.fetch_hnsw_segment(source_id, prefix_path).await?;
let getters = [
    |hnsw_data: &hnswlib::HnswData| Arc::new(Vec::from(hnsw_data.header_buffer())),
    |hnsw_data: &hnswlib::HnswData| Arc::new(Vec::from(hnsw_data.data_level0_buffer())),
    |hnsw_data: &hnswlib::HnswData| Arc::new(Vec::from(hnsw_data.length_buffer())),
    |hnsw_data: &hnswlib::HnswData| Arc::new(Vec::from(hnsw_data.link_list_buffer())),
];

for (file, getter) in FILES.iter().zip(getters) {
    let file_path = index_storage_path.join(file);
    self.copy_bytes_to_local_file(&file_path, getter(&hnsw_data))
        .await?;
}
Ok(())
}
```
[PerformanceOptimization]

This function now fetches the entire HNSW segment into an in-memory `HnswData` object before writing the individual files to disk. The previous implementation streamed each file directly. For large indexes, this change could significantly increase memory usage during the `fork` operation. Was this change intentional? If `fork` still needs to write to disk, perhaps restoring the previous file-by-file download logic for this function would be more memory-efficient.
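The file-by-file alternative the comment alludes to can be sketched as follows: instead of materializing the whole segment as one in-memory object, stream each component file from its source reader straight to its local destination. The file names and `stream_to_local` helper below are illustrative assumptions, not the repository's actual code:

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Illustrative component-file names for one HNSW segment.
const FILES: [&str; 4] = ["header.bin", "data_level0.bin", "length.bin", "link_lists.bin"];

// Stream one component from any reader (e.g. a per-file storage download)
// to a local file. io::copy uses a fixed-size internal buffer, so peak
// memory stays constant regardless of the file's size.
fn stream_to_local(mut source: impl Read, dest_path: &Path) -> io::Result<u64> {
    let mut dest = File::create(dest_path)?;
    io::copy(&mut source, &mut dest)
}
```

Looping `stream_to_local` over `FILES` keeps `fork`'s memory footprint bounded, whereas buffering the whole `HnswData` first costs memory proportional to the index size.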
```rust
None => match HnswIndex::load_from_hnsw_data(
    self.fetch_hnsw_segment(id, prefix_path)
        .await
        .map_err(|e| Box::new(HnswIndexProviderOpenError::FileError(*e)))?,
    &index_config,
    ef_search,
    *id,
) {
```
[PerformanceOptimization]

This change successfully loads the index from memory, which aligns with the PR's goal. However, the `open` function still contains calls that write the index to a temporary directory on disk (`create_dir_all`, `load_hnsw_segment_into_directory`) before this memory-based loading occurs. These disk operations now seem redundant. To fully load without a disk intermediary and improve efficiency, you could remove the calls to `create_dir_all`, `load_hnsw_segment_into_directory`, and `purge_one_id` from this function.
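The control flow the comment suggests for `open` reduces to: check the cache, otherwise fetch the segment buffers and build the index purely in memory, with no directory setup or cleanup in between. A minimal stand-in sketch (the `Cache` type and `open` function below are hypothetical, not the provider's real API):

```rust
use std::collections::HashMap;

// Stand-in cache mapping an index id to an already-opened index
// (represented here as a String for simplicity).
struct Cache {
    entries: HashMap<u64, String>,
}

fn open(cache: &mut Cache, id: u64) -> String {
    match cache.entries.get(&id) {
        Some(index) => index.clone(),
        None => {
            // Previously: create_dir_all + load_hnsw_segment_into_directory
            // (+ purge_one_id) would run here. With in-memory loading the
            // fetched buffers feed the loader directly, so those disk
            // steps can be dropped.
            let index = format!("index-{id}"); // stands in for load_from_hnsw_data
            cache.entries.insert(id, index.clone());
            index
        }
    }
}
```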
Description of changes

(WIP)

Test plan

How are these changes tested?
- `pytest` for Python
- `yarn test` for JS
- `cargo test` for Rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?