Combinatorilliance

joined 1 year ago
[–] Combinatorilliance@alien.top 1 points 11 months ago

I'm very interested in learning more as well.

Do you know how these edge TPUs compare to the Coral TPU? Some people here on LocalLLaMA have tried it.

[–] Combinatorilliance@alien.top 1 points 11 months ago

There are all sorts of approaches:

  • Microsoft Guidance
  • llama.cpp grammar constraints (see the sketch below)
  • someone recently made their own approach and posted it here; search for CAPPr
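
As a minimal sketch of the llama.cpp route, using the llama-cpp-python bindings (the model path is a placeholder; any local GGUF file works), a GBNF grammar can force the output into a fixed shape:

```python
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar that only accepts the literal strings "yes" or "no"
GRAMMAR = LlamaGrammar.from_string('root ::= "yes" | "no"')

# model_path is a placeholder; point it at any local GGUF file
llm = Llama(model_path="./model.gguf", n_ctx=2048)

out = llm(
    "Is the sky blue? Answer yes or no:",
    grammar=GRAMMAR,  # restricts sampling to tokens the grammar accepts
    max_tokens=8,
)
print(out["choices"][0]["text"])
```

The nice part of the grammar approach is that it's enforced at the sampler level, so the model literally cannot emit a token outside the grammar.
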
[–] Combinatorilliance@alien.top 1 points 11 months ago

It's not going to be just chat. The LLMs are going to be integrated into everything in the OS.

Suggesting email replies, finding appointments in emails (I believe Apple already has something like this? In any case it will be private, local, and more reliable), improved search, a much better personal assistant, APIs to access the model from any app. Lots of stuff...

[–] Combinatorilliance@alien.top 1 points 1 year ago (3 children)

I believe these are TheBloke's GGUF quants if anyone's interested: https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
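
For anyone who wants to try them, here's a minimal sketch of loading one of those quants with llama-cpp-python. The filename and the USER:/ASSISTANT: prompt format are assumptions; check the repo's file list and model card:

```python
from llama_cpp import Llama

# Assumed quant filename from the repo above; verify against the actual file list
llm = Llama(
    model_path="./nous-capybara-34b.Q4_K_M.gguf",
    n_ctx=4096,       # adjust context size to your RAM/VRAM
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

# Prompt format assumed from the model card (USER:/ASSISTANT: style)
out = llm("USER: Summarize llama.cpp in one sentence.\nASSISTANT:", max_tokens=64)
print(out["choices"][0]["text"])
```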

[–] Combinatorilliance@alien.top 1 points 1 year ago (1 children)

Please break down in front of someone. You need humans here, not robots.

I'm not very familiar with Phind. Do you mean this model that's competitive with GPT-4 is available for download right here?

https://huggingface.co/Phind/Phind-CodeLlama-34B-v2

That's insane

I think it's plausible. GPT-3.5 isn't ultra smart. It's very good most of the time, but it has clear limitations.

Seeing what Mistral achieved with 7B, I'm sure we can get something similar to GPT-3.5 at 20B given state-of-the-art training and data. I'm sure OpenAI is also using some tricks that haven't been released to the public.