this post was submitted on 18 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


I found a post from several months ago asking about this, and this model was recommended: lmsys/longchat-13b-16k on Hugging Face.

But I wanted to check whether there are any newer recommendations. I want an LLM I can run locally that can search long transcriptions of interviews, brainstorming sessions, etc., and organize them into outlines without leaving out important information.

I have an RTX 4090 (24 GB VRAM) and 128 GB of DDR5 RAM.

top 3 comments
[–] laca_komputilulo@alien.top 1 points 11 months ago

Are we talking high stakes vs creative summarization here?

[–] FullOf_Bad_Ideas@alien.top 1 points 11 months ago (1 children)

Check out the Yi-34B 200K fine-tunes. You can load up to about 43K tokens on an RTX 4090 if you use a quantized version, 4.0bpw ExLlamaV2 I believe.
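To see why ~43K tokens is roughly the ceiling on a 24 GB card, here is a back-of-envelope VRAM estimate. The layer count, KV-head count, and head dimension are assumptions taken from Yi-34B's published config, and the 1-byte cache entry assumes ExLlamaV2's 8-bit KV-cache option; treat this as a sketch, not a measurement (real loads add activation and framework overhead).

```python
# Rough VRAM budget for Yi-34B 200K at 4.0bpw with a 43K-token context.
# Architecture numbers below are assumptions from Yi-34B's config:
# 60 layers, 8 grouped KV heads, head dim 128. Adjust for your fine-tune.

PARAMS = 34.4e9          # approximate parameter count
BPW = 4.0                # quantization, bits per weight
LAYERS = 60
KV_HEADS = 8             # grouped-query attention
HEAD_DIM = 128
CACHE_BYTES = 1          # assuming an 8-bit KV cache; use 2 for FP16
CONTEXT = 43_000         # tokens

# Weight memory: parameters * bits-per-weight, converted to GB.
weights_gb = PARAMS * BPW / 8 / 1e9

# KV cache per token: K and V tensors across all layers and KV heads.
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * CACHE_BYTES
cache_gb = CONTEXT * kv_per_token / 1e9

total_gb = weights_gb + cache_gb
print(f"weights ~{weights_gb:.1f} GB, KV cache ~{cache_gb:.1f} GB, "
      f"total ~{total_gb:.1f} GB")
```

Under these assumptions the weights take ~17.2 GB and the cache ~5.3 GB, which lands just under the 4090's 24 GB and matches the ~43K figure; an FP16 cache would roughly double the cache term and push it over budget.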

[–] Wooden-Potential2226@alien.top 1 points 11 months ago

Yi-34B-200K is trained for summarization and does it really well.