Public leaderboards mean nothing because 99% of the fine-tuned models are overfitted to hell; it's like nobody ever did a Kaggle comp before
this post was submitted on 18 Nov 2023
Machine Learning
I think a big obstacle is that the model is so big that hardly anyone is trying to fine-tune it.
Well, the model was trained on RefinedWeb, about 3.5T tokens, so a little below Chinchilla-optimal for 180B. Also, the models in the Falcon series seem to feel progressively more undertrained:
- The 1B model was good, and is still good several newer generations later
- the 7B was capable pre-Llama 2
- the 40B and 180B were never as good
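The "a little below Chinchilla-optimal" claim above can be checked with quick arithmetic. A rough sketch, assuming the usual rule of thumb from the Chinchilla paper of roughly 20 training tokens per model parameter (the exact compute-optimal ratio varies with the scaling-law fit):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rule-of-thumb compute-optimal training token count:
    ~20 tokens per parameter (Chinchilla heuristic, not an exact figure)."""
    return n_params * tokens_per_param

# Falcon-180B: 180e9 parameters -> ~3.6T tokens would be Chinchilla-optimal.
optimal = chinchilla_optimal_tokens(180e9)  # 3.6e12
trained = 3.5e12                            # RefinedWeb token count cited above
print(f"optimal ~{optimal:.1e} tokens, trained {trained:.1e}, "
      f"ratio {trained / optimal:.2f}")
```

So at 3.5T tokens the model sits at about 97% of the rule-of-thumb optimum, consistent with "a little below" rather than severely undertrained on this metric alone.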
These leaderboards are dick-measuring contests for small dicks. Imagine the dynamics of that.
Falcon-180B is not good