Chaosdrifer

joined 11 months ago
[–] Chaosdrifer@alien.top 1 points 9 months ago

You can try something like Claude.ai, which has a long context window and is free to use.

You can use a Python script to load the model, split the text into chunks, and ask the model to translate each chunk; that way you don't need a model with a 64K context window (which takes up a lot of memory and isn't that common).
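
For example, here's a minimal sketch of that chunking approach using Hugging Face transformers. The model name and chunk size are placeholders, and naive character-based chunking can cut sentences in half, so splitting on sentence or paragraph boundaries works better in practice:

```python
from transformers import pipeline

# Placeholder model -- swap in whatever model you actually run locally.
translator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

def translate(text, chunk_size=2000):
    # Naive fixed-size chunking; splitting on sentence boundaries is better.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    out = []
    for chunk in chunks:
        prompt = f"Translate the following text to English:\n\n{chunk}\n\nTranslation:"
        result = translator(prompt, max_new_tokens=1024)[0]["generated_text"]
        out.append(result[len(prompt):])  # keep only the newly generated part
    return "".join(out)
```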

It also depends on the language you are trying to translate. It's best to find a model that has been trained on the original language: most models have a large English corpus, and many are fine-tuned with Chinese data, but there are specialty models for German/Arabic/Japanese. Try a Google search or look on Hugging Face.

[–] Chaosdrifer@alien.top 1 points 10 months ago

In the case of Petals, where any client can drop off at any time, each client would need to hold multiple layers for redundancy, maybe not the full weights but at least 20-30%, so that if someone drops off, another client can take over instantly.
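
A toy sketch of that idea (this is an illustration, not how Petals actually schedules layers): each client serves its own contiguous slice of layers plus the first ~25% of the next client's slice, so a dropped client's early layers are already loaded somewhere else.

```python
def assign_layers(n_layers, n_clients, overlap=0.25):
    """Assign each client its own slice plus ~25% of the next client's slice."""
    per_client = n_layers // n_clients
    extra = max(1, int(per_client * overlap))
    assignments = {}
    for c in range(n_clients):
        start = c * per_client
        # Own slice plus the start of the next client's slice (wrapping around).
        assignments[c] = [(start + i) % n_layers for i in range(per_client + extra)]
    return assignments

print(assign_layers(32, 4))  # each of 4 clients holds 10 of 32 layers
```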

[–] Chaosdrifer@alien.top 1 points 10 months ago (2 children)

Yes, you are right. Although I guess it could work in Petals as well if each person has the full model downloaded; then the GPU can be instructed to load the next weights locally when it is done with the current ones?
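
Something like this hypothetical PyTorch loop, assuming the model was saved one layer per file (the file naming is made up for illustration):

```python
import torch

def run_pipeline(x, layer_paths, device="cuda"):
    """Run a model layer by layer, loading each layer's weights only when needed."""
    for path in layer_paths:  # e.g. ["layer_0.pt", "layer_1.pt", ...]
        layer = torch.load(path, map_location=device)  # pull next layer into VRAM
        with torch.no_grad():
            x = layer(x)
        del layer                    # free VRAM before loading the next layer
        torch.cuda.empty_cache()
    return x
```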

[–] Chaosdrifer@alien.top 1 points 10 months ago (4 children)

Isn’t that how things like petals.dev work ?

[–] Chaosdrifer@alien.top 1 points 10 months ago

https://continue.dev. It supports many LLMs.

[–] Chaosdrifer@alien.top 1 points 10 months ago

You might want to look into LlamaIndex's SEC Insights repo: https://github.com/run-llama/sec-insights. They do a lot of parsing of financial documents.

[–] Chaosdrifer@alien.top 1 points 10 months ago (1 children)

Cost is really the main issue. You can train a local LLM, or you can train ChatGPT as well. I wouldn't be surprised if someone is already making a custom GPT for helping with Unity or Unreal Engine projects.

For privacy, companies with money will use a private instance from Azure. It is like 2-3 times the cost, but your data is safe, as you have a contract with MS to keep it safe and private, with large financial penalties if it isn't.

Also, running an LLM locally isn't zero cost, depending on the electricity prices in your area. GPUs consume a LOT of power; the 4090 alone draws around 450 watts.
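
Back-of-the-envelope math (the daily hours and price per kWh below are just illustrative assumptions):

```python
gpu_watts = 450        # RTX 4090 board power
hours_per_day = 8      # assumed usage
price_per_kwh = 0.30   # USD; varies a lot by region

kwh_per_day = gpu_watts / 1000 * hours_per_day
print(f"~${kwh_per_day * price_per_kwh:.2f}/day, "
      f"~${kwh_per_day * price_per_kwh * 30:.2f}/month")
```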

[–] Chaosdrifer@alien.top 1 points 10 months ago (1 children)

Please look up fine-tuning and LoRA; those are the methods to “evolve” a model after it is born.
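
For example, a minimal LoRA setup with Hugging Face's peft library (the model name and hyperparameters here are placeholders, not recommendations):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small LoRA weights are trainable
```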

[–] Chaosdrifer@alien.top 1 points 10 months ago

It does exist, but it really only works when you have very high-speed, low-latency connections between the machines, like InfiniBand.

[–] Chaosdrifer@alien.top 1 points 10 months ago

If you just want to try it out, install privateGPT on your local PC/Mac, no GPU required.