Posted to LocalLLaMA on 21 Nov 2023
Hello everyone,

I've fine-tuned Llama 2 with LoRA on my own dataset and now I'm looking to deploy it. The adapter weights are uploaded to Hugging Face, and the base model I'm using is h2oai/h2ogpt-4096-llama2-13b-chat.

I've been exploring the vLLM project and found it quite useful at first. However, I've run into a snag with my LoRA fine-tuned model: vLLM looks for a config.json, but since I only uploaded the LoRA adapters, there is no config.json in my repo.

Am I overlooking something in my approach, or does vLLM not support LoRA fine-tuned models? Any insights or guidance would be greatly appreciated.
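
For reference, here's roughly how I load the adapter locally with PEFT, which works fine; the adapter repo name below is a placeholder for my actual Hugging Face repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "h2oai/h2ogpt-4096-llama2-13b-chat"
ADAPTER = "my-hf-username/my-llama2-lora"  # placeholder for my actual adapter repo

# Load the full base model, then attach the LoRA adapter on top of it.
# This works locally, but vLLM can't load the adapter repo directly.
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(BASE)
```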

InterestingBasil@alien.top · 10 months ago

This is my setup and I have no issues. Quantization, however, is not well supported by vLLM.
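
A minimal sketch of what such a setup might look like (not necessarily the commenter's exact configuration): merge the LoRA adapter into the base model so vLLM sees a full model directory, including the config.json it expects, then serve the merged weights. The adapter repo name and output path below are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from vllm import LLM, SamplingParams

BASE = "h2oai/h2ogpt-4096-llama2-13b-chat"
ADAPTER = "my-hf-username/my-llama2-lora"   # placeholder adapter repo
MERGED_DIR = "./llama2-13b-merged"          # placeholder output path

# 1) Merge the LoRA weights into the base model. Saving the merged
#    model produces a full model directory, including the config.json
#    that vLLM looks for.
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base_model, ADAPTER).merge_and_unload()
merged.save_pretrained(MERGED_DIR)
AutoTokenizer.from_pretrained(BASE).save_pretrained(MERGED_DIR)

# 2) Point vLLM at the merged directory and generate as usual.
llm = LLM(model=MERGED_DIR, dtype="float16")
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Hello, who are you?"], params)
print(outputs[0].outputs[0].text)
```

The same merged directory can also be served through vLLM's OpenAI-compatible server, e.g. `python -m vllm.entrypoints.openai.api_server --model ./llama2-13b-merged`.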