this post was submitted on 28 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
You'll need a load balancer of some sort, but an A6000 would be a good start. Expect roughly 15-20 tokens per second (tps) as a single user.
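For what "a load balancer of some sort" could look like at its simplest, here is a rough round-robin sketch. It only picks which backend gets the next request; the class name and the node addresses are made up for illustration, and a real setup would more likely use an off-the-shelf reverse proxy with health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend addresses in a fixed rotation (hypothetical helper)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def next_backend(self):
        # Return the next backend in the rotation, wrapping around at the end.
        return next(self._rotation)


# Hypothetical addresses: one inference server per GPU box.
balancer = RoundRobinBalancer([
    "http://gpu-node-1:8000",
    "http://gpu-node-2:8000",
])

# Four incoming requests get spread evenly across the two nodes.
picks = [balancer.next_backend() for _ in range(4)]
```

Round-robin ignores how busy each node is, so it's only a starting point; long generations can pile up on one box, and something that tracks in-flight requests would behave better under uneven load.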
In vanilla form, Llama 2 may do silly stuff. Instruct variants, fine-tuning, etc. will decrease the likelihood.
If you are taking something to prod, I'd advise picking up a consultant to work with you.