Brad12d3

I'm still new to this, and I thought 128 GB of CPU RAM would be enough to run a 70B model? I also have an RTX 4090. However, every time I try to run lzlv_Q4_K_M.gguf in Text Generation UI, I get "connection errored out". Could there be a setting I should tinker with?
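
In case it's clearer outside the UI, here's roughly the equivalent of what I'm attempting, sketched with llama-cpp-python (the backend Text Generation UI uses for GGUF files). The file path and the n_gpu_layers value are placeholders I'd expect to need tuning, not settings I've verified:

```python
# Sketch: load a 70B Q4_K_M GGUF with partial GPU offload via llama-cpp-python.
# Assumes a CUDA-enabled build (llama-cpp-python installed with GPU support).
from llama_cpp import Llama

llm = Llama(
    model_path="models/lzlv_Q4_K_M.gguf",  # placeholder path to the model file
    n_gpu_layers=40,  # offload what fits in the 4090's 24 GB; remaining layers run from system RAM
    n_ctx=4096,       # context window
)

out = llm("Hello, are you working?", max_tokens=64)
print(out["choices"][0]["text"])
```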


I found a post from several months ago asking about this, and this was recommended: lmsys/longchat-13b-16k · Hugging Face

but I wanted to check and see if there are any other recommendations. I want an LLM I can run locally that can search long transcriptions of interviews, brainstorming sessions, etc. and organize them into outlines without leaving out important info.

I have an RTX 4090 (24 GB) and 128 GB of DDR5 RAM.
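
To make the use case concrete, the workflow I'm picturing looks something like this sketch with llama-cpp-python; the model file name, chunk sizes, and prompts are all placeholder assumptions on my part, not recommendations:

```python
# Sketch: chunk a long transcript, outline each chunk with a local model,
# then merge the partial outlines in a final pass (map-reduce style).
from llama_cpp import Llama

# Placeholder model file; any long-context GGUF would slot in here.
llm = Llama(
    model_path="models/longchat-13b-16k.Q4_K_M.gguf",
    n_gpu_layers=-1,  # a 13B Q4 quant should fit entirely in 24 GB VRAM
    n_ctx=16384,      # match the model's 16k context window
)

def outline(text: str) -> str:
    prompt = (
        "Outline the key points of this transcript excerpt "
        "without omitting important details:\n\n"
        f"{text}\n\nOutline:"
    )
    return llm(prompt, max_tokens=512)["choices"][0]["text"]

with open("interview_transcript.txt") as f:
    transcript = f.read()

# Naive fixed-size chunking with overlap so points aren't split in half.
chunk_size, overlap = 8000, 500
chunks = [transcript[i:i + chunk_size]
          for i in range(0, len(transcript), chunk_size - overlap)]

partial_outlines = [outline(c) for c in chunks]
print(outline("\n\n".join(partial_outlines)))  # merge pass
```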


I'm curious how using local LLMs has helped your industry and why you'd opt for a local LLM over something like the ChatGPT API. I've been working quite a bit with Stable Diffusion and am interested in branching out into other AI like LLMs. I use ChatGPT for various small tasks, but I'm trying to wrap my mind around the scope of how LLMs can help businesses, particularly locally run ones. Also, is the draw of local LLMs mainly privacy, training, etc.?