Settings.nodeParser assignment is invalid. #2224

@wangmaoshu

Describe the bug
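The Settings instance imported by application code (node_modules/llamaindex/dist/cjs/index.cjs) is not the same object as the Settings instance used by the index classes (node_modules/llamaindex/indices/dist/index.cjs). As a result, a node parser assigned to Settings.nodeParser in application code is not the one that VectorStoreIndex.fromDocuments uses.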
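
To illustrate the failure mode in isolation, here is a minimal sketch that does not use llamaindex at all. `loadSettingsModule`, `SettingsLike`, and `NodeParserLike` are hypothetical names: each call to `loadSettingsModule()` stands in for one evaluated copy of a module file that exports a singleton, which is what appears to happen when the same code is loaded from two different paths.

```typescript
// Hypothetical stand-ins; none of these names come from llamaindex.
interface NodeParserLike {
  id: string;
}

interface SettingsLike {
  nodeParser: NodeParserLike;
}

let nextId = 0;

// Models evaluating one copy of a module that holds module-level state.
function loadSettingsModule(): { Settings: SettingsLike } {
  // Created once per evaluated copy, so two copies mean two objects.
  const Settings: SettingsLike = { nodeParser: { id: `default-${nextId++}` } };
  return { Settings };
}

// Copy A: the one application code imports.
const appCopy = loadSettingsModule();
// Copy B: the one the library's internal code imports from another file.
const internalCopy = loadSettingsModule();

// Assigning on copy A does not touch copy B.
appCopy.Settings.nodeParser = { id: "custom" };

console.log(appCopy.Settings.nodeParser.id); // "custom"
console.log(internalCopy.Settings.nodeParser.id); // "default-1"
```

The two IDs differ for the same reason the IDs in the reproduction differ: the assignment and the read go to different objects.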

To Reproduce
Code to reproduce the behavior:

import fs from "fs";
import path from "path";
import dotenv from "dotenv";
import { TextFileReader } from "@llamaindex/readers/text";
import { SentenceSplitter, Settings, StorageContext, storageContextFromDefaults, VectorStoreIndex } from "llamaindex";
import { PGVectorStore } from "@llamaindex/postgres";
import { openai, OpenAIEmbedding } from "@llamaindex/openai";

const env = dotenv.config();

const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
  apiKey: env.parsed?.API_KEY,
  baseURL: env.parsed?.BASE_URL,
});

const llm = openai({
  model: "qwen-max",
  apiKey: env.parsed?.API_KEY,
  baseURL: env.parsed?.BASE_URL,
});

const nodeParser = new SentenceSplitter({
  chunkSize: 100,
  chunkOverlap: 20,
});


Settings.nodeParser = nodeParser;
Settings.embedModel = embedModel;
Settings.llm = llm;

console.log(Settings.nodeParser.id); // 39afa973-afae-4bc5-b48c-766aec2fea88

const connectToPGVector = async (): Promise<{ pgVectorStore: PGVectorStore, ctx: StorageContext }> => {
  const pgVectorStore = new PGVectorStore({
    clientConfig: {
      connectionString: env.parsed?.PG_CONNECTION_STRING,
    },
    dimensions: 1536
  });

  const ctx = await storageContextFromDefaults({
    vectorStore: pgVectorStore
  });

  return {
    pgVectorStore,
    ctx,
  }
};

const getDocument = async (filePath: string) => {
  const reader = new TextFileReader();
  const fileContent = fs.readFileSync(filePath, 'utf-8');
  const documents = await reader.loadDataAsContent(new TextEncoder().encode(fileContent));
  return documents;
};


const main = async () => {
  const pathFile = path.resolve(__dirname, '../data/demo/demo.txt');
  const documents = await getDocument(pathFile);

  const { pgVectorStore, ctx } = await connectToPGVector();

  await pgVectorStore.clearCollection();

  await pgVectorStore.setCollection(pathFile);

  console.log(Settings.nodeParser.id); // 39afa973-afae-4bc5-b48c-766aec2fea88

  // Inside fromDocuments, Settings.nodeParser.id is 65f91690-6a13-4ab7-8374-57a5edd9c974 (a different instance)
  await VectorStoreIndex.fromDocuments(documents, {
    storageContext: ctx
  });
  console.log(Settings.nodeParser.id); // 39afa973-afae-4bc5-b48c-766aec2fea88

  const index = await VectorStoreIndex.fromVectorStore(pgVectorStore);

  const queryEngine = index.asQueryEngine();
  const result = await queryEngine.query({
    query: "什么是 langchain?",
  });
  console.log(result?.message.content);
};

main();

Expected behavior
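Settings should be a single, globally shared instance, so that values assigned in application code are the same ones read inside the index classes.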
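
One common way to get a process-wide singleton even when a module file is evaluated more than once is to store the instance on `globalThis` under a `Symbol.for()` key. The sketch below shows that general technique only; it is not a description of how llamaindex implements Settings, and `SharedSettings`, `getGlobalSettings`, and the key `"example.settings"` are hypothetical names.

```typescript
// Hypothetical types for illustration; not part of llamaindex.
interface ParserLike {
  id: string;
}

interface SharedSettings {
  nodeParser: ParserLike | undefined;
}

// Symbol.for() looks the key up in the global symbol registry, so every
// copy of this code in the same process gets the same symbol back.
const SETTINGS_KEY = Symbol.for("example.settings");

function getGlobalSettings(): SharedSettings {
  const registry = globalThis as unknown as Record<symbol, SharedSettings | undefined>;
  let settings = registry[SETTINGS_KEY];
  if (settings === undefined) {
    // First caller creates the instance; later callers reuse it.
    settings = { nodeParser: undefined };
    registry[SETTINGS_KEY] = settings;
  }
  return settings;
}

// In a duplicated package, each copy would run its own getGlobalSettings(),
// but both would resolve to the same slot on globalThis.
const fromAppCopy = getGlobalSettings();
const fromInternalCopy = getGlobalSettings();

fromAppCopy.nodeParser = { id: "custom" };

console.log(fromAppCopy === fromInternalCopy); // true
console.log(fromInternalCopy.nodeParser?.id); // "custom"
```

The trade-off is that the state becomes visible to anything in the process that knows the key, so the key should be namespaced to the package.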

Labels: bug
