Specialist_Ice_5715

joined 10 months ago
[–] Specialist_Ice_5715@alien.top 1 points 9 months ago

No, I didn't even know RoPE was a thing; I'm reading about it now... if you have a tl;dr please post it, this stuff seems pretty complicated.

I was loading the model with a plain llama.cpp invocation and didn't know about RoPE. What would change if I left the default values in place?
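The tl;dr as I understand it: RoPE encodes a token's position by rotating each pair of embedding dimensions by a position-dependent angle, and "linear RoPE scaling" just stretches the position axis so the model can run past its trained context length. A toy sketch of the idea (my own naming, not llama.cpp's actual implementation):

```python
import math

def rope_frequencies(head_dim, base=10000.0):
    # Each dimension pair (2i, 2i+1) rotates at angle theta_i = base^(-2i/d):
    # low pairs spin fast (fine-grained position), high pairs spin slowly.
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def rotate_pair(x, y, pos, theta, scale=1.0):
    # Linear rope scaling stretches positions: pos' = pos * scale, so a
    # longer context is squeezed back into the angle range seen in training.
    angle = (pos * scale) * theta
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)
```

So leaving the defaults should mean scale 1.0, i.e. no stretching at all, which is presumably fine as long as you stay within the model's trained context window.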

 

How come llama2 70B is so much worse than many of the code-llama 34B variants?

I'm not talking specifically about coding questions, but the 70B seems utterly stupid: it repeats nonsense patterns, starts talking about unrelated stuff, and sometimes gets stuck in a loop repeating the same word. It seems like utter garbage, and I downloaded the official model from the Meta HF repo.

Has anyone experienced the same? Am I doing something wrong with the 70B model?

[–] Specialist_Ice_5715@alien.top 1 points 10 months ago

Interesting... is BLIP commercially usable? I read that it is, but does that hold for the weights in their entirety?

[–] Specialist_Ice_5715@alien.top 1 points 10 months ago (2 children)

You'll have to go multi-modal. The best right now is Fuyu, but that's not commercially usable.

 

I've been meaning to set up a 32B local code-llama2 model to help me mostly with coding questions, sort of a personal KB (phind-33B; if you have better suggestions, please let me know).
I was thinking of langchain + code-llama2 + chromadb. I've read around that this can be a good setup.

A question though: I mostly have long markdown documents in the form of Q&A pairs that I want to RAG over later, e.g.:

````
question: how do you write a recursive fibonacci function in C++?
answer: here's how:

```cpp
blabla
```
````

-> would it make sense to use chromadb for that? I guess I could do better than just automatically splitting on paragraphs and instead mark where one Q&A pair starts and where it ends; I'm not sure whether chromadb or another DB supports some sort of 'schema' for that. Thanks.