So following up on 12 Jan 2025: I ain’t afraid of nothin’, and I was ready to take the plunge into a vector database and all the complexity that entails (using a CDC pattern to sync data between my Postgres and the new vector database, etc.).

However, when I spoke to Ammar, he gave me two suggestions:

  1. First, we could try shortening the vectors. While checking out the OpenAI API, I had noticed this was possible, but I did not seriously consider it until he mentioned it (see the sketch after this list).
  2. At query time, load the vectors and build an in-memory HNSW vector index + search. Conveniently, Ammar has written a (perf-optimized?) library for this in Golang.
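For context, suggestion 1 corresponds to the `dimensions` parameter on OpenAI’s embeddings endpoint (supported by the text-embedding-3 models). Here is a minimal sketch in Go against the raw HTTP API; the 256-dimension target is a hypothetical example of mine, not a recommendation:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Ask the API for a shortened embedding directly.
	// 256 is a hypothetical target; pick whatever your benchmarks support.
	body, _ := json.Marshal(map[string]any{
		"model":      "text-embedding-3-small",
		"input":      "the quick brown fox",
		"dimensions": 256, // native output is 1536-d; the API shortens it for you
	})
	req, _ := http.NewRequest("POST", "https://api.openai.com/v1/embeddings", bytes.NewReader(body))
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // JSON with a 256-d embedding under data[0].embedding
}
```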

Shortening vectors

The TLDR is this worked like a charm. While reading up on this (for both the theoretical basis + the accepted best practice), I could not find very much except for:

To be honest, this wasn’t very much to go on, and I also did not fully understand the theory behind shortening vectors. I was sceptical: even if this successfully sped up the search, would it scale? What if the database grows even larger?
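For what it’s worth, my understanding of the mechanism (as OpenAI describes it) is that the text-embedding-3 models are trained Matryoshka-style, so that a prefix of the vector is itself a usable embedding; shortening then just means keeping the first k dimensions and re-normalizing. A sketch of that post-hoc shortening, assuming float32 vectors:

```go
package main

import (
	"fmt"
	"math"
)

// shorten keeps the first k dimensions of v and re-normalizes to unit length.
// This should mirror what the API's `dimensions` parameter does server-side.
func shorten(v []float32, k int) []float32 {
	out := make([]float32, k)
	copy(out, v[:k])

	var norm float64
	for _, x := range out {
		norm += float64(x) * float64(x)
	}
	norm = math.Sqrt(norm)
	if norm == 0 {
		return out
	}
	for i := range out {
		out[i] = float32(float64(out[i]) / norm)
	}
	return out
}

func main() {
	v := []float32{0.1, 0.2, 0.3, 0.4} // stand-in for a 1536-d embedding
	fmt.Println(shorten(v, 2))         // first 2 dims, re-normalized
}
```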

But I knew the only way to know for sure was to try, so I tested the same queries across:

At least for my use case, this was unreasonably effective:

While I am not afraid of toil and hard work, this was truly an example of the 80/20 rule. Setting up a new database with a shortened vector column was pretty easy and the results were really good. In fact, it took much longer to sync the embeddings (and the sync exposed a few flaws in my syncing process). While I was waiting for the sync to complete, I decided to try the in-memory vector index approach.
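For the curious, the setup really is a one-liner of DDL. I’m assuming pgvector here (the extension isn’t named above), and the table and column names are made up; a sketch using the pgx driver:

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/jackc/pgx/v5"
)

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	// Hypothetical schema: the same rows as before, but with a 256-d
	// vector column instead of the full 1536-d one.
	_, err = conn.Exec(ctx, `
		CREATE EXTENSION IF NOT EXISTS vector;
		CREATE TABLE IF NOT EXISTS documents_short (
			id        bigint PRIMARY KEY,
			content   text NOT NULL,
			embedding vector(256) NOT NULL
		);
	`)
	if err != nil {
		log.Fatal(err)
	}
}
```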

Foray into Golang

The idea is as enticing as it is crazy. But the napkin math works out:
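To make that napkin math concrete (with hypothetical numbers, not my real row counts): memory is roughly N vectors × d dimensions × 4 bytes per float32, so 100k vectors at 256 dims is only ~100 MB, which fits comfortably in RAM. And here is a brute-force stand-in to show the load-then-search shape; the real version would use Ammar’s HNSW library for sub-linear search instead of the linear scan below:

```go
package main

import (
	"fmt"
	"sort"
)

type doc struct {
	ID  int64
	Vec []float32 // assumed unit-normalized, so dot product == cosine similarity
}

func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// search is a brute-force stand-in: an HNSW index answers the same question
// in sub-linear time, but the in-memory shape is identical — load all
// vectors once, then query against them per request.
func search(docs []doc, query []float32, k int) []doc {
	sort.Slice(docs, func(i, j int) bool {
		return dot(docs[i].Vec, query) > dot(docs[j].Vec, query)
	})
	if k > len(docs) {
		k = len(docs)
	}
	return docs[:k]
}

func main() {
	docs := []doc{ // in reality, loaded from Postgres at query time
		{ID: 1, Vec: []float32{1, 0}},
		{ID: 2, Vec: []float32{0, 1}},
	}
	for _, d := range search(docs, []float32{0.9, 0.1}, 1) {
		fmt.Println("best match:", d.ID)
	}
}
```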