
It all depends on your stack. With my usual stack, for example, I'd generate embeddings with OpenAI's text-embedding-3-small model and store them in Elasticsearch 9+.
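A rough sketch of that setup, assuming the `openai` and `elasticsearch` Python packages; the index name and helper names here are illustrative, not from this thread:

```python
# Sketch: embed text with OpenAI text-embedding-3-small and store it
# in an Elasticsearch dense_vector field for kNN search.

# text-embedding-3-small produces 1536-dimensional vectors by default.
EMBEDDING_DIM = 1536

def es_mapping(dim: int = EMBEDDING_DIM) -> dict:
    """Index mapping with a dense_vector field for kNN search."""
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dim,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }

def index_chunks(es_client, openai_client, index: str, chunks: list[str]) -> None:
    """Embed each text chunk and index it alongside its source text."""
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small", input=chunks
    )
    for chunk, item in zip(chunks, resp.data):
        es_client.index(
            index=index,
            document={"content": chunk, "embedding": item.embedding},
        )
```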

If you need to continuously re-embed data, OpenAI might become cost-prohibitive, so I would look at open models and self-hosting. For storage, it again depends on your stack: pgvector, Qdrant, etc.
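For the self-hosted pgvector route, a minimal sketch assuming Postgres with the pgvector extension, a 1536-dimensional embedding model, and the `psycopg` package; the table and column names are illustrative:

```python
# Schema for storing embeddings in Postgres with pgvector.
DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1536)
);
"""

# Nearest-neighbour query (`<=>` is pgvector's cosine distance operator).
NN_QUERY = """
SELECT content
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT 5;
"""

def search(conn, query_vector: list[float]) -> list[str]:
    """Return the five documents closest to `query_vector` by cosine distance."""
    with conn.cursor() as cur:
        cur.execute(NN_QUERY, (str(query_vector),))
        return [row[0] for row in cur.fetchall()]
```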

Thanks for the response!
I find turbopuffer to be a great vector store. I'm using gemini-embedding-001 at the moment, and it's working great. The only issue I've had to solve was maintaining an embeddings cache of the documents, with last-updated timestamps and a SHA-256 of the content to prevent duplicates. Storing the SHA-256 of the vectors is apparently also useful, as some people have suggested.
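The cache described above could look something like this minimal sketch; the class and method names are made up for illustration, and a real setup would persist the entries rather than keep them in memory:

```python
import hashlib
import time

class EmbeddingCache:
    """Track a SHA-256 of each document's content plus a last-updated
    timestamp, so unchanged documents are not re-embedded."""

    def __init__(self):
        # doc_id -> (content_sha256, last_updated, vector)
        self._entries: dict[str, tuple[str, float, list[float]]] = {}

    @staticmethod
    def content_hash(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def needs_embedding(self, doc_id: str, text: str) -> bool:
        """True if the document is new or its content has changed."""
        entry = self._entries.get(doc_id)
        return entry is None or entry[0] != self.content_hash(text)

    def store(self, doc_id: str, text: str, vector: list[float]) -> None:
        """Record the content hash, timestamp, and vector after embedding."""
        self._entries[doc_id] = (self.content_hash(text), time.time(), vector)
```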
