this post was submitted on 16 Nov 2023
1 points (100.0% liked)

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


Hi, can you guys suggest some 7B models that are good at reading comprehension and instruction following?

top 9 comments
[–] gentlecucumber@alien.top 1 points 2 years ago

Probably Airoboros, either the llama2 or Mistral version; you'd have to evaluate which one handled the fine-tuning better. I suspect llama2.

[–] synw_@alien.top 1 points 2 years ago (1 children)

For me, Mistral 7B Instruct is now the best one for following instructions closely. However, I still have to try some recent ones that seem to have a good reputation, like OpenChat, Zephyr, or Synthia, for instruction following and specialized tasks.

[–] sergeant113@alien.top 1 points 2 years ago

I agree. My use case involves generating JSON outputs, and no other Mistral finetune has come close to matching the outputs of Mistral-Instruct. Instruct even comes very close to the performance of gpt-3.5-turbo.
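A minimal sketch of what that kind of JSON setup could look like with llama-cpp-python; the GGUF filename, the extraction schema, and the sampling settings are placeholders, not the commenter's actual configuration:

```python
# Hedged sketch: prompting a quantized Mistral-7B-Instruct GGUF for JSON-only output.
# The model path and the 'name'/'date' schema below are illustrative placeholders.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-instruct-v0.1.Q4_K_M.gguf", n_ctx=4096)

prompt = (
    "[INST] Extract the fields below as JSON with keys 'name' and 'date'. "
    "Respond with JSON only, no commentary.\n\n"
    "Text: The meeting with Alice is scheduled for 2023-11-20. [/INST]"
)

# Low temperature keeps the structure stable; parse the output to verify it really is JSON.
out = llm(prompt, max_tokens=256, temperature=0.1)
print(out["choices"][0]["text"].strip())
```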

[–] daHaus@alien.top 1 points 2 years ago

IMO Mistral-7B-OpenOrca. Jazzing it up by telling it it's a subject matter expert and instructing it not to guess seems to help too.

https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca
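As a rough illustration of that framing, here's what a "subject matter expert, don't guess" system prompt could look like with Mistral-7B-OpenOrca, which (per its model card) uses ChatML-style prompts. The exact wording, model path, and runtime are assumptions:

```python
# Sketch only: a ChatML prompt that front-loads the "expert, don't guess" framing.
# Model filename and system-prompt wording are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-openorca.Q4_K_M.gguf", n_ctx=4096)

system = (
    "You are a subject matter expert in the topic at hand. "
    "If you are not certain of an answer, say so instead of guessing."
)
question = "What does the -j flag do when running GNU make?"

prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{question}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

out = llm(prompt, max_tokens=256, stop=["<|im_end|>"])
print(out["choices"][0]["text"].strip())
```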

[–] Paulonemillionand3@alien.top 1 points 2 years ago
[–] altoidsjedi@alien.top 1 points 2 years ago (1 children)

I've had good experiences with Dolphin 2.2.1, OpenOrca, Zephyr, and OpenHermes 2.5 — all at int-4 quantization.

The truth is, there is no objective best — just a bunch of really good finetunes for each foundation model / parameter size, all of which will vary in performance depending on your use case and prompts.

The best thing you can do is create some kind of test template of questions / instructions that you run each of your candidate models against prior to adopting it (a rough harness along these lines is sketched at the end of this comment).

For me, I usually do a few things for any new model I'm test-driving:

  1. Give it a passage of writing (technical, literary, or prose), and have it do some question-answering / chatting on the basis of the given passage.

  2. Ask it to write some python code involving numpy operations.

  3. Have it break down and explain a complex topic -- my go-to is the 'AdS/CFT Correspondence' in physics.

  4. Assess how well it's following my system instructions of responding in "an insightful, organized, clear, and orderly manner, with aesthetically pleasing formatting using markdown syntax for text, and KaTeX syntax for math."

(Markdown and KaTeX, because they are rendered correctly in the 'Chatbox' desktop application I use for interacting with my LLMs on my Mac. It's available on all OSes; I highly recommend it if you like the ChatGPT style of UI but want a desktop app.)

It's also a good idea to start with low temperatures when testing / assessing models, which makes their outputs more deterministic. That helps in understanding what their most likely base "impulses" are. Then feel free to crank the temperature up to .8 or higher to get a sense of the model's "creativity."
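To make the above concrete, here is a rough harness in the spirit of that test template: the same fixed prompts run against each candidate model at low temperature so the outputs can be compared side by side. The model filenames, test prompts, and ChatML template are placeholders; adjust the prompt format to whatever each finetune actually expects:

```python
# Hedged sketch of a "test template" runner using llama-cpp-python.
# Everything in CANDIDATES and TESTS is illustrative, not a recommendation.
from llama_cpp import Llama

CANDIDATES = [
    "dolphin-2.2.1-mistral-7b.Q4_K_M.gguf",
    "openhermes-2.5-mistral-7b.Q4_K_M.gguf",
    "zephyr-7b-beta.Q4_K_M.gguf",
]

TESTS = [
    "Read the passage below and answer the questions: ...",       # comprehension (fill in a passage)
    "Write a Python function using numpy to normalize an array.", # coding
    "Break down and explain the AdS/CFT correspondence.",         # complex-topic explanation
]

SYSTEM = ("Respond in an insightful, organized, clear, and orderly manner, "
          "using Markdown for text and KaTeX syntax for math.")

for path in CANDIDATES:
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    for test in TESTS:
        # ChatML shown here; swap in each model's own prompt template as needed.
        prompt = (f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
                  f"<|im_start|>user\n{test}<|im_end|>\n"
                  f"<|im_start|>assistant\n")
        # Low temperature first, per the note above, to see the model's "base impulses".
        out = llm(prompt, max_tokens=512, temperature=0.1, stop=["<|im_end|>"])
        print(f"--- {path} ---\n{out['choices'][0]['text'].strip()}\n")
```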

[–] waxbolt@alien.top 1 points 2 years ago

Treated like an interview for a new employee, it's amazing. What a clear indication it would be that these systems are artificial intelligences and entities in their own independent right.

[–] Arkonias@alien.top 1 points 2 years ago

OpenHermes 2.5 Mistral has been quite good for me. It follows the CYOA prompt that I use well.

[–] mdutAi@alien.top 1 points 2 years ago

In my opinion, the Zephyr 7B Beta model is the best among the 7B models in terms of human-like behavior and reading comprehension. Its weak points are coding and mathematics. Here is the information screenshot:

https://cdn-uploads.huggingface.co/production/uploads/6200d0a443eb0913fa2df7cc/raxvt5ma16d7T23my34WC.png

Also, if you would like to examine the model:

https://huggingface.co/HuggingFaceH4/zephyr-7b-beta

I can run it with LangChain on a laptop with an RTX 2060 (6 GB VRAM), an 11th-gen i7, and 32 GB of RAM.

The secret is INT4 quantization.
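For reference, a minimal sketch of that kind of 4-bit setup: zephyr-7b-beta loaded with bitsandbytes 4-bit quantization (so the weights fit in roughly 4 GB of VRAM) and wrapped for LangChain. The import path, generation settings, and memory figures are assumptions, not the commenter's exact configuration:

```python
# Hedged sketch: 4-bit zephyr-7b-beta on a small GPU, wrapped as a LangChain LLM.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, pipeline)
# Import path varies by LangChain version (older releases use langchain.llms).
from langchain_community.llms import HuggingFacePipeline

model_id = "HuggingFaceH4/zephyr-7b-beta"
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer,
                max_new_tokens=256)
llm = HuggingFacePipeline(pipeline=pipe)

# Zephyr's own chat template: <|system|> / <|user|> / <|assistant|> turns ended with </s>.
prompt = ("<|system|>\nYou are a helpful assistant.</s>\n"
          "<|user|>\nSummarize the plot of Hamlet in two sentences.</s>\n"
          "<|assistant|>\n")
print(llm.invoke(prompt))
```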