this post was submitted on 10 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

Hello!

By popular demand, I am planning a fine-tune along the lines of https://huggingface.co/dreamgen/opus-v0-7b, but on top of Yi-34B, and I am wondering whether to use the 200K variant as the base.

The regular Yi-34B seems slightly better than Yi-34B-200K on standard benchmarks, but I wonder how the two "feel" in practice, and whether the loss of short-context performance is worth it, given that the regular version can already be used up to 32K tokens.

(Yi-34B vs Yi-34B-200K)

Has anyone run an analysis of these two models at various sequence lengths (<4K, <8K, <16K, etc.)?
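For anyone wanting to run this comparison themselves, here is a minimal sketch of the bookkeeping: split a held-out corpus into non-overlapping windows at each context length, then compute perplexity from per-token losses. The actual model scoring call is omitted (plug in e.g. a `transformers` forward pass), and all names here are my own, not from any existing eval harness:

```python
import math

def split_windows(tokens, length):
    """Split a token-id sequence into non-overlapping windows of `length`,
    dropping any trailing remainder shorter than `length`."""
    return [tokens[i:i + length] for i in range(0, len(tokens) - length + 1, length)]

def perplexity(nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(nlls) / len(nlls))

# Intended comparison loop, with the model call stubbed out:
corpus = list(range(100_000))              # stand-in for real token ids
for ctx in (4_096, 8_192, 16_384):
    wins = split_windows(corpus, ctx)
    # nlls = per_token_nll(model, wins)    # hypothetical: real forward pass here
    # print(ctx, perplexity(nlls))         # compare Yi-34B vs Yi-34B-200K per ctx
```

A per-context-length perplexity curve on the same held-out text would make the short-context regression (if any) directly visible.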

[–] m98789@alien.top 1 points 2 years ago (1 children)

Yi can't be trusted on standard benchmarks, because they are easy to game by including them in the training data, and the LKF gang who built it is under heavy pressure to justify their one-billion-dollar valuation and keep milking investors.

The only way to really evaluate this is on some hidden benchmark never seen before and / or rigorous qualitative experiments.

Until then, I’m not holding my breath.

[–] wind_dude@alien.top 1 points 2 years ago

I believe they said they're going to release the training data. We'll see. That's about the only easy way to verify what made it into the model.
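If the training data does come out, a crude first-pass check for benchmark contamination is n-gram overlap between benchmark items and the corpus. A rough sketch, assuming whitespace tokenization and an arbitrary n (both my own choices, not any established contamination protocol):

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as a set of tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(train_docs, bench_items, n=8):
    """Fraction of benchmark items sharing at least one n-gram with the
    training corpus. A coarse signal only: long n-grams miss paraphrases,
    short ones flag benign overlap."""
    corpus = set()
    for doc in train_docs:
        corpus |= ngrams(doc.split(), n)
    hits = sum(1 for item in bench_items if ngrams(item.split(), n) & corpus)
    return hits / len(bench_items)
```

In practice you'd run this per benchmark (MMLU, GSM8K, etc.) against the released corpus and compare the overlap rate to a clean reference corpus.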