It worked again when I checked stream response. I had unchecked it and it stopped working. But it is still slow compared to the terminal.
Did you use Ollama, or how are you running it?
Yes, I used Ollama with llama3 8b. The speed feels slow compared to using it directly from the terminal.
I see. BoltAI currently uses the OpenAI-compatible server from Ollama. Maybe that's why it's slower than querying the model directly.
I will do more benchmarking and maybe switch to direct connection in the future.
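For context, the two connection paths mentioned above can be sketched as follows. This is an illustrative sketch only, not BoltAI's actual request code; it assumes Ollama's default local port 11434 and uses the endpoint paths from Ollama's documentation (`/v1/chat/completions` for the OpenAI-compatible server, `/api/chat` for the native API).

```python
import json

# Hypothetical sketch of the two ways a client can talk to a local
# Ollama instance (default port 11434). Payloads are illustrative.
OLLAMA = "http://localhost:11434"

# 1) OpenAI-compatible endpoint (the path BoltAI reportedly uses).
#    Streaming must be requested explicitly with "stream": true,
#    which matches the "stream response" checkbox behavior above.
openai_style = {
    "url": f"{OLLAMA}/v1/chat/completions",
    "payload": {
        "model": "llama3:8b",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
}

# 2) Native Ollama endpoint, which the `ollama run` terminal CLI
#    uses under the hood. The native API streams by default.
native_style = {
    "url": f"{OLLAMA}/api/chat",
    "payload": {
        "model": "llama3:8b",
        "messages": [{"role": "user", "content": "Hello"}],
    },
}

print(json.dumps(openai_style, indent=2))
print(json.dumps(native_style, indent=2))
```

The translation layer between the OpenAI-compatible format and the native one is one plausible source of extra latency, which is what the benchmarking mentioned above would confirm.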