orca-mini-3b is good at fast summarization, but it hallucinates a lot, so YMMV.
this post was submitted on 17 Nov 2023
LocalLLaMA
You can try dropping down to a 7B model; it will be somewhat faster.
Please be specific: what counts as "quite slow," and what counts as "extremely quickly"? Use numbers with units that include time (e.g. tokens per second).
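To make the point concrete, here's a minimal sketch of how to turn a vague "it's slow" into a number with a time unit. The token count and timestamps are hypothetical placeholders, not output from any particular runtime:

```python
import time

def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Generation speed in tokens per second."""
    return n_tokens / elapsed_s

# Hypothetical measurement: wrap your generate call with a timer.
start = time.perf_counter()
# ... run generation here, e.g. 42 tokens produced ...
elapsed = time.perf_counter() - start

# With a made-up 6-second run producing 42 tokens:
print(f"{tokens_per_second(42, 6.0):.1f} tok/s")  # 7.0 tok/s
```

A figure like "7 tok/s on CPU" is something others can actually compare against their own hardware.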
What hardware are you running on? Without changing hardware, your best bet is a smaller model (fewer parameters), a smaller quantization of a 13B model, or both.
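A rough back-of-the-envelope for why smaller quantizations help: weight memory scales with parameter count times bits per weight. This sketch ignores KV cache and runtime overhead, so treat the numbers as lower bounds:

```python
def model_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory: params * bits / 8 bytes, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 13B model at 4-bit quantization vs. fp16:
print(f"{model_size_gb(13, 4):.1f} GB")   # 6.5 GB
print(f"{model_size_gb(13, 16):.1f} GB")  # 26.0 GB
# A 7B model at 4-bit:
print(f"{model_size_gb(7, 4):.1f} GB")    # 3.5 GB
```

If a quantized 13B still doesn't fit comfortably in your RAM/VRAM, that's usually the bottleneck, and a 7B quant is the pragmatic next step.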