Combinatorilliance

joined 1 year ago
[–] Combinatorilliance@alien.top 1 points 11 months ago

I'm very interested in learning more as well.

Do you know how these edge TPUs compare to the Coral TPU? Some people here on LocalLLaMA have tried it.

[–] Combinatorilliance@alien.top 1 points 11 months ago

There are all sorts of approaches:

  • Microsoft Guidance
  • llama.cpp grammar constraints (see the sketch below)
  • someone recently made their own approach and posted it here; search for CAPPr
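
As a minimal sketch of the llama.cpp route, using the llama-cpp-python bindings (the model path is a placeholder; any local GGUF file works), a GBNF grammar can force the output into a fixed shape:

```python
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar that only accepts the literal strings "yes" or "no"
GRAMMAR = LlamaGrammar.from_string('root ::= "yes" | "no"')

# model_path is a placeholder; point it at any local GGUF file
llm = Llama(model_path="./model.gguf", n_ctx=2048)

out = llm(
    "Is the sky blue? Answer yes or no:",
    grammar=GRAMMAR,  # restricts sampling to tokens the grammar accepts
    max_tokens=8,
)
print(out["choices"][0]["text"])
```

The nice part of the grammar approach is that it's enforced at the sampler level, so the model literally cannot emit a token outside the grammar.
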
[–] Combinatorilliance@alien.top 1 points 11 months ago

It's not going to be just chat. The LLMs are going to be integrated into everything in the OS.

Suggesting email replies, finding appointments in emails (I believe Apple already has something like this? In any case it will be private, local, and more reliable), improved search, a much better personal assistant, APIs to access the model from any app. Lots of stuff...

[–] Combinatorilliance@alien.top 1 points 1 year ago (3 children)

I believe these are TheBloke's GGUF quants if anyone's interested: https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
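
For anyone who wants to try them, here's a minimal sketch of loading one of those quants with llama-cpp-python. The filename and the USER:/ASSISTANT: prompt format are assumptions; check the repo's file list and model card:

```python
from llama_cpp import Llama

# Assumed quant filename from the repo above; verify against the actual file list
llm = Llama(
    model_path="./nous-capybara-34b.Q4_K_M.gguf",
    n_ctx=4096,       # adjust context size to your RAM/VRAM
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

# Prompt format assumed from the model card (USER:/ASSISTANT: style)
out = llm("USER: Summarize llama.cpp in one sentence.\nASSISTANT:", max_tokens=64)
print(out["choices"][0]["text"])
```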

[–] Combinatorilliance@alien.top 1 points 1 year ago (1 children)

Please break down in front of someone. You need humans here, not robots.

I'm not very familiar with Phind. Do you mean this model that's competitive with GPT-4 is available for download right here?

https://huggingface.co/Phind/Phind-CodeLlama-34B-v2

That's insane

I think it's plausible. GPT-3.5 isn't ultra smart. It's very good most of the time, but it has clear limitations.

Seeing what Mistral achieved with 7B, I'm sure we can get something similar to GPT-3.5 at 20B given state-of-the-art training and data. I'm sure OpenAI is also using some tricks that haven't been released to the public.