flossraptor

joined 1 year ago
[–] flossraptor@alien.top 1 points 11 months ago

With a dedicated 3090 (and another card for the OS), a 34B model at 5 bpw just fits and runs very fast, around 10-20 t/s. The quality is good for my application, but I'm not coding.
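A rough back-of-the-envelope check of why a 34B model at 5 bits per weight "just fits" on a 24 GB card. This is only a sketch: it counts the quantized weights and nothing else, and the helper name `weight_gib` is made up for illustration. KV cache, activations, and framework overhead all have to come out of whatever is left.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: the card exposes roughly 24 GiB and nothing else is using it.
card_gib = 24.0
weights = weight_gib(34, 5.0)
headroom = card_gib - weights

print(f"weights: {weights:.1f} GiB, left for KV cache and overhead: {headroom:.1f} GiB")
```

The weights come out a little under 20 GiB, which leaves only a few GiB for context. That is consistent with needing a second card for the OS so the 3090 stays fully dedicated.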

[–] flossraptor@alien.top 1 points 11 months ago (1 children)

CPUs don't run LLMs.

[–] flossraptor@alien.top 1 points 11 months ago (1 children)

Even if you get a 3090 with 24gb of vram, you're going to load the biggest model you can and realize it is useless for most tasks. Less than that and I don't even know what you would use it for.

[–] flossraptor@alien.top 1 points 11 months ago (1 children)

If you guys really want to make people care about using crypto for payments, implement Dai Hard on ETH.

https://medium.com/@coinop.logan/daihard-game-theory-21a456ef224e

Basically, it creates a scenario where people can trust each other for anonymous transactions because both parties are highly incentivized to act fairly. This is accomplished by having both parties make a deposit that can be burned if things go south.
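The incentive argument can be sketched as a toy payoff comparison. This is not the DAIHard contract logic, just an illustration under loud assumptions: the function name and numbers are hypothetical, the cheated party is assumed to always burn the deposit, and an honest party simply gets its deposit back.

```python
def cheating_pays(gain_from_cheating: float, deposit: float) -> bool:
    """Toy model: a cheater pockets the gain but loses the burned deposit."""
    honest_payoff = 0.0  # deposit is returned, nothing extra gained
    cheat_payoff = gain_from_cheating - deposit  # deposit is burned
    return cheat_payoff > honest_payoff


# Hypothetical numbers: cheating would net 100 units of value.
print(cheating_pays(gain_from_cheating=100, deposit=50))   # deposit too small
print(cheating_pays(gain_from_cheating=100, deposit=150))  # deposit large enough
```

In this simplified picture, cheating stops being rational once the burnable deposit exceeds what a party could gain by defecting.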

[–] flossraptor@alien.top 1 points 11 months ago

Even if you just make an alias in pfSense and block its traffic, you're fine.

 

I'm using vLLM because it's a drop-in replacement for the ChatGPT API. If anything else is compatible with the ChatGPT API, let me know.

Problem 1: I cannot get anything over a 7B to run in vLLM. I'm sure my parameters are wrong, but I cannot find any documentation.

python3 -m vllm.entrypoints.openai.api_server --model /home/h/Mistral-7B-finetuned-orca-dpo-v2-AWQ --quantization awq --dtype auto --max-model-len 5000

Problem 2: Mistral-7B-finetuned-orca-dpo-v2-AWQ is the only one I got up and running with responses that make sense. However, there is a prompt being appended to everything I send to it:

### Human: Got any creative ideas for a 10 year old’s birthday?
### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party: ... [It goes on quite a bit.]

Either because of that or for other reasons, it is not answering very basic questions. There are several threads about this on GitHub, but I was able to identify zero actionable information.
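One thing that may be worth trying, assuming the extra text comes from a chat template applied on the server side: skip the chat endpoint and send a fully formatted prompt to the plain completions endpoint instead. The sketch below is not a confirmed fix. The port is vLLM's default for its OpenAI-compatible server, the template is only a placeholder borrowed from the output above (the correct format should come from the model card), and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: vLLM's OpenAI-compatible server running on its default port.
BASE_URL = "http://localhost:8000/v1"

# Assumption: placeholder template; replace it with the model card's format.
TEMPLATE = "### Human: {question}\n### Assistant:"


def build_completion_request(model: str, question: str, template: str = TEMPLATE) -> dict:
    """Build a /v1/completions payload with a prompt we format ourselves."""
    return {
        "model": model,
        "prompt": template.format(question=question),
        "max_tokens": 256,
        "temperature": 0.2,
        # Stop before the model starts writing the next user turn.
        "stop": ["### Human:"],
    }


def send(payload: dict) -> str:
    """POST the payload and return the generated text (needs a running server)."""
    req = urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


payload = build_completion_request(
    "/home/h/Mistral-7B-finetuned-orca-dpo-v2-AWQ", "What is the capital of France?"
)
print(json.dumps(payload, indent=2))
# With the server running: print(send(payload))
```

Because the prompt is built client-side, whatever reaches the model is exactly what is printed, which at least makes it easier to tell whether bad answers come from the template or from the model itself.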

Problem 3: CodeLlama-13B-Python-AWQ just blasted a bunch of hashtags and gobbledygook back at me. Same problem with the prompt, too.

I am running this on an Ubuntu Server VM (16 cores/48gb RAM) right now so I don't take up any VRAM, but I can switch to Windows if necessary.

[–] flossraptor@alien.top 1 points 11 months ago

12.8? That's nothing.

[–] flossraptor@alien.top 1 points 11 months ago

Very few people on the planet face a threat model where memorizing a seed phrase is worthwhile. Leaving a seed phrase accessible long enough to memorize it considerably increases the likelihood of it being compromised. Locking it up and hiding it immediately greatly reduces this possibility.

You are right about it not being difficult to memorize a seed phrase. It is not hard to do many things that are unnecessary and unlikely to be useful. Only an idiot would fail to realize that such an exercise is essentially pointless, and then proceed to brag about his ability to memorize a few words, while simultaneously demonstrating his inability to reason about basic things.

[–] flossraptor@alien.top 1 points 11 months ago

The only thing Ethereum accomplished is becoming superior money. I wish we could say we had another significant use-case.

[–] flossraptor@alien.top 1 points 11 months ago (1 children)

For some people "uncensored" means it hasn't been lobotomized, but for others it means it can write porn.

[–] flossraptor@alien.top 1 points 11 months ago

Nvidia is the only game in town right now. I decided on a 3090 for the time being, with the option of adding another one later. I think in two years we will have 100x better options specifically tailored for AI.

[–] flossraptor@alien.top 1 points 1 year ago

It's notepad with extra steps.

[–] flossraptor@alien.top 1 points 1 year ago

I think it's very overkill and you're going to be sitting at like 10-15% CPU when it's done. But for projects like this, saving a few bucks is not worth it. You need room to grow.
