LocalLLaMA

Community for discussing Llama, the family of large language models created by Meta AI.

What is the best 7B right now? (alien.top)

submitted 2 years ago by GasBond@alien.top to c/localllama@poweruser.forum

for coding
for generating stories, writing email, poems etc.
good overall
etc.

[–] Illustrious-Lake2603@alien.top 1 points 2 years ago (3 children)

For coding, DeepSeek Coder 6.7B is exceptional.

[–] davew111@alien.top 1 points 2 years ago (1 children)

Is it exceptional in any language other than Python?

[–] Dry-Vermicelli-682@alien.top 1 points 2 years ago

I'd like to know how it does in Java, Go, Rust, and Zig, and whether it can handle SQL well.

[–] ModsAndAdminsEatAss@alien.top 1 points 2 years ago (2 children)

I haven't had a chance to get hands on with DeepSeek yet. How does it compare to Code Llama?

[–] danigoncalves@alien.top 1 points 2 years ago

I was actually comparing both today (Code Llama 7B), and man, Code Llama just gave crap, while DeepSeek was very accurate.

[–] Illustrious-Lake2603@alien.top 1 points 2 years ago (1 children)

In my opinion it's amazing; it's close to GPT-4.

[–] Dry-Vermicelli-682@alien.top 1 points 2 years ago

What hardware are you running it on? CPU/GPU, RAM, etc.? I'm trying to figure out what I need. My old gen 1 16-core Threadripper with 64 GB RAM doesn't seem to work very well: it takes multiple minutes for a simple hello response. There's no GPU in it yet, but I'm looking to put in a 6700 XT. Not sure whether that GPU will help a lot.

[–] Sufficient-Math3178@alien.top 1 points 2 years ago (3 children)

Models requiring remote code without any explanation are shady imo

[–] Knaledge@alien.top 1 points 2 years ago (2 children)

elaborate

[–] Sufficient-Math3178@alien.top 1 points 2 years ago

AFAIK, models used to be just plain code. When you loaded one, for example, it would do so by calling a method pickled inside the model file. The uploader could set up this method to do practically anything they wanted, and it wouldn't need to be obviously malicious, since the code runs just like a normal Python script. For example, it could simply load/render a WebP image designed to exploit the recent libwebp vulnerability.

They changed this a while back, so now you need to pass an argument when loading the model to allow this behavior, and this model requires it.
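
To make the pickle point concrete, here is a minimal sketch using only Python's standard library. It assumes nothing about how DeepSeek Coder's files are actually packaged; `FakeCheckpoint`, `NoGlobalsUnpickler`, and `safe_loads` are hypothetical names made up for illustration, and the payload is a harmless arithmetic expression. It shows that unpickling calls whatever callable the file's author stored, and that an unpickler which refuses to resolve any global blocks that call.

```python
import io
import pickle


class FakeCheckpoint:
    """Hypothetical stand-in for an object saved inside a model file."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # The payload here is harmless; an attacker could pick any callable.
        return (eval, ("6 * 7",))


blob = pickle.dumps(FakeCheckpoint())

# Loading does not rebuild a FakeCheckpoint; it runs eval("6 * 7").
loaded = pickle.loads(blob)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no stored callable can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data):
    """Unpickle plain containers and primitives only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    safe_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

With this sketch, `loaded` ends up as the result of the evaluated expression rather than a checkpoint object, and `blocked` is `True`. Plain data such as `safe_loads(pickle.dumps({"a": [1, 2]}))` still loads, because dicts, lists, and ints need no global lookup. The load-time argument described above is presumably `trust_remote_code=True` in Hugging Face transformers; if so, that flag governs custom Python files shipped in the model repository rather than pickle itself, but the trust question is the same.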

[–] Illustrious-Lake2603@alien.top 1 points 2 years ago

I for one just don't trust these Chinese models at all. I'm not saying there's anything wrong with this one, but it's clear it's aligned with the Chinese agenda when I try to ask it anything about Taiwan. For coding, though, it works well, and you can run it offline.

[–] Illustrious-Lake2603@alien.top 1 points 2 years ago

Shady maybe, but it can code decently without depending on the internet. So there's that.

[–] valdev@alien.top 1 points 2 years ago

I'm a little new here. Does DeepSeek Coder 6.7B somehow phone home?