LocalLLaMA


Is anyone experimenting with non-instruction tuned models? (alien.top)

submitted 2 years ago by wojcech@alien.top to c/localllama@poweruser.forum


My main use case for LLMs is literally autocomplete, mainly for coding, so I was wondering whether anyone has played with, or had any luck using, a base model for tasks that are close to plain autocompletion. I could imagine instruction tuning adding a sycophancy bias in those areas.


[–] wojcech@alien.top 1 points 2 years ago (1 children)

Just to be clear, you aren't doing fine-tuning here, as in gradient updates; you are using the base model + ICL (in-context learning)?

[–] phree_radical@alien.top 1 points 2 years ago

Yep, basically like taking a few samples from a dataset and turning them into a short text "document" with an obvious pattern so the LLM will complete it
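To make that concrete, here is a minimal sketch of turning a few samples into a pattern "document". The function name, the "Input:"/"Output:" labels, and the sample data are all made up for illustration; any consistent, obvious pattern should work, and this is not necessarily the exact format used above.

```python
def build_few_shot_prompt(examples, query, input_label="Input", output_label="Output"):
    """Turn (input, output) pairs into one text document that ends in an open slot.

    The labels are hypothetical; what matters is that the pattern is consistent,
    so a base model is likely to continue it for the final, unfinished entry.
    """
    blocks = [f"{input_label}: {x}\n{output_label}: {y}" for x, y in examples]
    # The last block has no answer, which leaves the completion to the model.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# A few samples taken from a (made-up) sentiment dataset.
examples = [
    ("I loved this movie", "positive"),
    ("Total waste of time", "negative"),
]
prompt = build_few_shot_prompt(examples, "Pretty good overall")
print(prompt)
```

The resulting string is what gets sent to the base model as a plain completion request, with generation stopped at the next blank line or newline.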

Few-shot vs fine-tuning comparison:

Pros:

- converges on the desired behavior with far fewer examples
- dynamic: changes to the "dataset" apply immediately, without modifying model weights
- no need to worry about whether important information is lost
- can do things like average the logits of single-token classification problems across multiple inferences (a workaround for context length limits)

Cons:

- needs context length, so you can't provide too many examples, or examples that are too large
- sometimes needs "adversarial" examples to discourage the model from repeating text from other examples
- models that are too small are worse at ICL
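A rough sketch of the logit-averaging idea from the pros list: split the examples into groups that each fit in context, run one inference per group, read the logit of each candidate label token at the answer position, and average. The helper names and the logit values below are hypothetical; in practice the numbers would come from whatever inference backend you use.

```python
def chunk(examples, size):
    """Split the example pool into groups small enough to fit in one context window."""
    return [examples[i:i + size] for i in range(0, len(examples), size)]


def average_logits(per_run_logits):
    """Average the per-label logits across several inferences.

    Each item maps a single-token label to its logit at the answer position.
    """
    labels = per_run_logits[0].keys()
    n = len(per_run_logits)
    return {label: sum(run[label] for run in per_run_logits) / n for label in labels}


def classify(per_run_logits):
    """Pick the label with the highest averaged logit."""
    avg = average_logits(per_run_logits)
    return max(avg, key=avg.get)


# Hypothetical logits from three inferences, each prompted with a different
# group of few-shot examples but the same final query.
runs = [
    {"positive": 2.0, "negative": 1.0},
    {"positive": 0.5, "negative": 1.5},
    {"positive": 3.0, "negative": 0.5},
]
print(classify(runs))
```

One run on its own (the second) would have picked "negative", while the average favors "positive", which is the point of combining several prompts instead of trusting a single one.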