Any details on what max context sizes are usable?
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
I think I need to remind people of the benchmarks used, MT-Bench and AlpacaEval are terrible benchmarks.
As a fan of the character, I approve
Oh wow, this seems almost too good to be true
Woooooooow!
This smells like leftovers...
We've been having "pretraining on the test set" for weeks and I'm craving something else.
I think "The Bloke" takes requests for GGUF conversions. Might want to check Hugging Face.
Training Data: We've amalgamated multiple public datasets to ensure a comprehensive and diverse training base. This approach equips Rocket-3B with a wide-ranging understanding and response capability.
We've amalgamated multiple public benchmark answers to ensure a contaminated and diverse training base.
Looking forward to trying this when some GGUFs are available.
Seems this model has a problem and isn't loading.
It was recently fixed then.
Finally, I can integrate AI into my Arduino project and build my own version of BB-8
Tried the GGUF format of this model from Hugging Face and it just won't load.
I tried both GGUF models currently on HF. Same result.
Curious to try this out when it's working!
Same, even the model from The Bloke that was released hours ago wouldn't work :-(
The latest version of KoboldCpp v1.50.1 now loads this model properly.
Chat format: Rocket-3B follows the ChatML format.
From the README and the tokenizer.json it looks like it's using a textual representation of ChatML on top of StableLM's format. Just in case this trips anyone up.
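For anyone building prompts by hand: a minimal sketch of what a textual ChatML prompt looks like. The `<|im_start|>`/`<|im_end|>` markers are the standard ChatML delimiters; the helper function and role names here are illustrative, so check the model's actual tokenizer_config/README before relying on the exact strings.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts as a ChatML prompt.

    Illustrative sketch of the ChatML convention, not Rocket-3B's official
    tooling; verify special tokens against the model's tokenizer files.
    """
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

If a loader or frontend (KoboldCpp, etc.) lets you pick a chat template, selecting ChatML should produce this same shape.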