Todo
#thecompaniesapi quantize Phi-3.5 to 4-bit to use it in our inference server; roughly the same on-disk size as the current model, but a 128k context window instead of 4k, so I can now process huge chunks of text without relying on batching
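A possible workflow for the step above, assuming the llama.cpp toolchain and the `microsoft/Phi-3.5-mini-instruct` weights (the exact model, paths, and quant type here are my assumptions, not confirmed by the post):

```shell
# Sketch: convert HF weights to GGUF, then quantize to 4-bit (Q4_K_M).
# Assumes llama.cpp is cloned/built and the model was downloaded locally.
python convert_hf_to_gguf.py ./Phi-3.5-mini-instruct --outfile phi35-f16.gguf
./llama-quantize phi35-f16.gguf phi35-q4_k_m.gguf Q4_K_M
```

Q4_K_M is a common 4-bit choice that keeps quality close to f16 while cutting the file to roughly a quarter of its size; the long context still costs KV-cache memory at inference time, so serving 128k tokens needs headroom beyond the weights themselves.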