WizardLM 13B
Given that you have a V100 GPU at your disposal, I'm curious what different folks here would use for inference of Llama-based 7B and 13B models. Also, would you use FastChat along with vLLM for the conversation template?
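For context, here is a minimal sketch of the setup I had in mind: building the prompt with a FastChat conversation template and then running generation through vLLM. The checkpoint name (WizardLM/WizardLM-13B-V1.2) and the template name (vicuna_v1.1) are assumptions on my part, so you'd want to swap in whatever matches the model you actually run.

```python
# Minimal sketch: FastChat conversation template + vLLM generation on a V100.
# Model path and template name are assumptions, not verified against a specific checkpoint.
from fastchat.conversation import get_conv_template
from vllm import LLM, SamplingParams

# Build the prompt using FastChat's conversation template.
conv = get_conv_template("vicuna_v1.1")  # assumed template; WizardLM variants may differ
conv.append_message(conv.roles[0], "Explain the difference between 7B and 13B Llama models.")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

# Load the model with vLLM. float16 since V100 (Volta) has no bfloat16 support.
# A 13B fp16 checkpoint is roughly 26 GB of weights, so this assumes a 32 GB V100.
llm = LLM(model="WizardLM/WizardLM-13B-V1.2", dtype="float16")

sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```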