How to insert the vector embeddings values into a table? #651

@saeedesmaili

After reading your blog post on using sqlite-vec to work with embeddings in SQLite, I tried to do something similar, but I'm stuck on an earlier step: I can't insert the embedding values into the table.

Here is what I have tried:

import numpy as np
from sqlite_utils import Database
from sqlite_vec import serialize_float32

db_test = Database("test.db")
items = [
    {"id": 1, "text": "hello world"},
    {"id": 2, "text": "hello globe"},
    {"id": 3, "text": "hi world"}
]
db_test["table"].insert_all(items, pk="id")
db_test["table"].add_column("embedding", "BLOB")

# Stand-in for embeddings obtained from OpenAI or a local embedding model
embedding = [0.1, 0.2, 0.3, 0.4]
# This is the call that fails with the error below
db_test["table"].update({"id": 1, "embedding": np.array(embedding).tobytes()}, alter=True)

This raises the following error:

InterfaceError                            Traceback (most recent call last)
Cell In[27], line 1
----> 1 db_test["table"].update({"id": 1, "embedding": np.array(item_embedding).tobytes()}, alter=True)

File ~/.pyenv/versions/workbench/lib/python3.10/site-packages/sqlite_utils/db.py:2798, in Table.update(self, pk_values, updates, alter, conversions)
   2796     pk_values = [pk_values]
   2797 # Soundness check that the record exists (raises error if not):
-> 2798 self.get(pk_values)
   2799 if not updates:
   2800     return self

File ~/.pyenv/versions/workbench/lib/python3.10/site-packages/sqlite_utils/db.py:1555, in Table.get(self, pk_values)
   1553 rows = self.rows_where(" and ".join(wheres), pk_values)
   1554 try:
-> 1555     row = list(rows)[0]
   1556     self.last_pk = last_pk
   1557     return row

File ~/.pyenv/versions/workbench/lib/python3.10/site-packages/sqlite_utils/db.py:1364, in Queryable.rows_where(self, where, where_args, order_by, select, limit, offset)
   1362 if offset is not None:
   1363     sql += " offset {}".format(offset)
-> 1364 cursor = self.db.execute(sql, where_args or [])
   1365 columns = [c[0] for c in cursor.description]
   1366 for row in cursor:
...
--> 533     return self.conn.execute(sql, parameters)
    534 else:
    535     return self.conn.execute(sql)

InterfaceError: Error binding parameter 0 - probably unsupported type.
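
A note on what this message usually means: Python's `sqlite3` module raises it when a bound parameter has a type it cannot convert, such as a dict or a list. The traceback shows the failure inside `self.get(pk_values)`, so the value being bound is whatever was passed as the first argument to `update`, which here is the whole dictionary. A minimal sketch using only the standard library (the `try_bind` helper is hypothetical and only for illustration; the exact exception class and message vary by Python version):

```python
import sqlite3


def try_bind(value):
    """Return None if sqlite3 accepts `value` as a query parameter, else the error raised."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("select ?", [value]).fetchone()
        return None
    except sqlite3.Error as exc:
        # InterfaceError on older Pythons, ProgrammingError on newer ones
        return exc
    finally:
        conn.close()


# A dict cannot be bound as a SQL parameter, whatever it contains
dict_error = try_bind({"id": 1, "embedding": b"\x00"})
# Raw bytes are a supported type and are stored as a BLOB
bytes_error = try_bind(b"\x00\x01")

print(type(dict_error).__name__, dict_error)
print(bytes_error)
```

The bytes value binds without complaint, which suggests the BLOB payload itself is not what SQLite is rejecting.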

Other things I tried instead of np.array(embedding).tobytes():

  • db_test["table"].update({"id": 1, "embedding": serialize_float32(embedding)}, alter=True) : same error
  • inserting the embeddings as a string instead of a blob: same error
