this post was submitted on 23 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.

founded 1 year ago

The new chat model released by Intel is now at the top of the OpenLLM leaderboard (among the 7B models).

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

top 14 comments
[–] georgejrjrjr@alien.top 1 points 11 months ago (1 children)

The model seems cool and all, but the paper is better.

Intel eliminated the preference data from direct preference optimization. Preference data is expensive and collecting it is a hassle, so this is a big deal. Best of all, it looks like their no-preference DPO actually performs better.

The trick is sampling rejects from a small model. Let’s say you have a dataset of GPT-4 completions. You mark those as good (“preferred”). You prompt Llama 2 13B and mark its responses as rejects.

Tl;dr This could boost the performance of nearly every model with a minimal increase in complexity (though obviously it’s non-zero compute).
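A minimal sketch of the idea described above: "chosen" responses come from an existing strong-model dataset (e.g. GPT-4 completions) and "rejected" responses are sampled from a weaker model, so no human preference labels are needed. `sample_weak_model` is a hypothetical stand-in for an actual generation call (e.g. a Llama 2 13B pipeline); the pair format matches what common DPO trainers expect, but check your trainer's docs for the exact schema.

```python
def sample_weak_model(prompt: str) -> str:
    # Placeholder: in practice, generate a completion from a weaker model
    # such as Llama 2 13B via your inference stack of choice.
    return f"(weak-model completion for: {prompt})"

def build_dpo_pairs(dataset):
    """dataset: list of {"prompt": ..., "chosen": ...} strong-model examples."""
    pairs = []
    for ex in dataset:
        pairs.append({
            "prompt": ex["prompt"],
            "chosen": ex["chosen"],  # strong-model answer, marked preferred
            "rejected": sample_weak_model(ex["prompt"]),  # weak-model answer, marked rejected
        })
    return pairs

data = [{"prompt": "What is DPO?", "chosen": "Direct Preference Optimization is ..."}]
pairs = build_dpo_pairs(data)
print(pairs[0]["rejected"])
```

The point is that the only extra cost over vanilla DPO is one round of sampling from the small model, which is the "non-zero compute" mentioned above.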

[–] cztomsik@alien.top 1 points 11 months ago

Thank you for the summary, that is actually a very cool idea!

[–] No_waln@alien.top 1 points 11 months ago (1 children)

Are they releasing the weights?

[–] aminedjeghri@alien.top 1 points 11 months ago (1 children)
[–] timtulloch11@alien.top 1 points 11 months ago (1 children)

Thanks! Any idea what the difference is between Intel/neural-chat-7b-v3-1 and Intel/neural-chat-7b-v3? They have slightly different scores, but so far I can't see any difference.

[–] aminedjeghri@alien.top 1 points 11 months ago (1 children)
[–] timtulloch11@alien.top 1 points 11 months ago

Lol, ty for at least replying to this.

[–] justynasty@alien.top 1 points 11 months ago

The CausalLM (14B, llamafied) model is experimenting with something similar. These are the kinds of stories they can create: in my roleplay session, neural-chat put me in a community of gypsy people and described their culture and customs the way they are in real life. This is an impressive model from Intel.

[–] durden111111@alien.top 1 points 11 months ago (1 children)

I found it to be worse than OpenHermes 2.5. It just gives shorter, more robotic responses.

[–] julylu@alien.top 1 points 11 months ago (1 children)

Same, I found it tends to give short responses.

[–] yahma@alien.top 1 points 11 months ago (1 children)

But are the short responses more correct?

[–] Shoddy_Vegetable_115@alien.top 1 points 11 months ago (2 children)

Exactly. It didn't hallucinate even once in my tests. I used RAG and it gave me perfect, to-the-point answers. I know most people want more verbose outputs; it's just that it's good for factual retrieval use cases.
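For context, a toy sketch of the RAG setup described above: retrieve the most relevant document, then prompt the model to answer concisely from that context only. The keyword-overlap retriever here is a hypothetical stand-in (real setups use embeddings and a vector store), and the prompt wording is illustrative, not the poster's actual pipeline.

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Hypothetical retriever: rank documents by naive keyword overlap with the query.
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    # Ground the model in retrieved context and ask for a short, factual answer.
    context = "\n".join(retrieve(query, docs))
    return (
        "Answer using only the context below. Be concise.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

docs = [
    "Neural-chat-7b was fine-tuned by Intel using DPO.",
    "Llamas are domesticated South American camelids.",
]
print(build_rag_prompt("Who fine-tuned neural-chat-7b?", docs))
```

A terse model can be an advantage here, since the answer is constrained to the retrieved context anyway.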

[–] julylu@alien.top 1 points 11 months ago

Maybe for RAG, shorter answers are less prone to hallucination? I will test more, thanks.

[–] Intel@alien.top 1 points 11 months ago

This is a fine-tuned/instruction-tuned model. Explicit system prompts or instructions like “generate a long, detailed answer” can make the model generate longer responses. 🙂

--Kaokao, AI SW Engineer @ Intel
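A small sketch of the suggestion above: put an explicit length instruction in the system prompt. The `### System: / ### User: / ### Assistant:` template below is the format published on the neural-chat Hugging Face model card as far as I can tell — verify it against the card before relying on it.

```python
def build_prompt(user_msg: str, system_msg: str) -> str:
    # Assumed neural-chat prompt template (check the model card for the exact format).
    return f"### System:\n{system_msg}\n### User:\n{user_msg}\n### Assistant:\n"

prompt = build_prompt(
    "Explain how DPO works.",
    "You are a helpful assistant. Generate a long, detailed answer.",
)
print(prompt)
```

The resulting string is what you would pass to the tokenizer/generation call; the model then continues after `### Assistant:`.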