this post was submitted on 17 Nov 2023
LocalLLaMA
Yi-34B-200K is the base model I'm using. Specifically, the Capybara/Tess tunes.
I can squeeze 63K of context out of it at 3.5bpw. It's actually surprisingly good at continuing a full-context story, referencing details from throughout it and such.
Anyway, I'm on Linux, so there's no GPU memory swapping to system RAM like on Windows. I am indeed using it in a chat/novel-style session, so the context does scroll and get cached in ooba.
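For anyone wondering how 63K fits: here's a rough back-of-the-envelope sketch. The model shape numbers (60 layers, GQA with 8 KV heads of head dim 128, ~34.4B params) come from Yi-34B's published config; the 8-bit KV cache is an assumption about the setup, not something confirmed above.

```python
# Rough VRAM estimate for Yi-34B-200K at 3.5bpw with 63K context.
# Shape numbers are from the published Yi-34B config; the 8-bit
# KV cache is an assumed setting, not confirmed in the comment.

GIB = 2**30

def weight_bytes(params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return params * bpw / 8

def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_elem: int) -> float:
    """K and V caches across all layers for `tokens` of context."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens

weights = weight_bytes(34.4e9, 3.5)                    # ~14 GiB
cache_fp16 = kv_cache_bytes(63 * 1024, 60, 8, 128, 2)  # ~14.8 GiB
cache_fp8 = kv_cache_bytes(63 * 1024, 60, 8, 128, 1)   # ~7.4 GiB

print(f"weights:     {weights / GIB:.1f} GiB")
print(f"fp16 cache:  {cache_fp16 / GIB:.1f} GiB")
print(f"8-bit cache: {cache_fp8 / GIB:.1f} GiB")
```

With an fp16 cache the total (~29 GiB) would blow past a 24 GB card, which is why GQA plus a reduced-precision cache is what makes a 63K window plausible at this quant.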