
Why would you want to run it yourself?

Unless you have a VERY good reason, I'd use Groq as it's fast and affordable. In the future you can probably run it on device, once browsers and operating systems have built-in support for LLMs.

BTW, I suggest implementing it in such a way that you're not tied to any specific API provider. Many providers and open source inference servers expose OpenAI-compatible API endpoints (Groq included), so if you code against that interface it will be relatively easy to switch providers in the future.
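One way to sketch that provider-agnostic setup: keep every provider-specific detail (base URL, key, model) in a single config, so the rest of your code only ever builds a generic OpenAI-style request. The provider names, URLs, and env var names below are my assumptions for illustration, not anything from the thread.

```python
# Minimal sketch of provider-agnostic OpenAI-style requests.
# Base URLs, model names, and env var names are illustrative assumptions.
import os

PROVIDERS = {
    # Groq exposes its OpenAI-compatible API under /openai/v1
    "groq": {
        "base_url": "https://api.groq.com/openai/v1",
        "api_key_env": "GROQ_API_KEY",
        "model": "llama-3.1-8b-instant",
    },
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "api_key_env": "OPENAI_API_KEY",
        "model": "gpt-4o-mini",
    },
}

def chat_request(provider: str, messages: list[dict]) -> tuple[str, dict, dict]:
    """Build the URL, headers, and JSON body for an OpenAI-style
    /chat/completions call against the chosen provider."""
    cfg = PROVIDERS[provider]
    url = f"{cfg['base_url']}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get(cfg['api_key_env'], '')}",
        "Content-Type": "application/json",
    }
    body = {"model": cfg["model"], "messages": messages}
    return url, headers, body
```

Send `body` with any HTTP client, or point the official OpenAI SDK's `base_url` at the chosen provider; either way, swapping providers becomes a config change rather than a rewrite.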

@smitmartijn pointed this out on X just now: I also highly recommend looking at Cloudflare AI Gateway, which gives you automatic fallbacks between providers, real-time logs, response caching, and a bunch of other useful things that will improve your DX, save you money, or both.
