this post was submitted on 24 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
Interesting timing. I don't know if this exists yet or not, but I was just thinking about a feature that would use a range for the context size.
The idea would be that you specify a min and a max context, say 6k and 8k. When the context breaches the 8k max, instead of just cutting it off there, it would cut it back to 6k, then build on that context until it once again reached 8k, repeating the process from there. This way, instead of reprocessing the entire context every time, it would only need to do so when the max was exceeded. I'm a programmer by trade, so I'm tempted to look into building this, but I haven't yet looked into what it would require or whether the feature already exists out there somewhere.
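The trimming policy described above can be sketched in a few lines. This is a hypothetical illustration, not code from any existing project: `trim_context`, `min_ctx`, and `max_ctx` are made-up names, and plain integers stand in for real tokens.

```python
# Hypothetical sketch of the min/max context idea: let the context grow
# until it exceeds max_ctx tokens, then trim the oldest tokens back to
# min_ctx in one step, so the kept prefix stays stable (and its KV cache
# reusable) until the next overflow.

def trim_context(tokens, min_ctx, max_ctx):
    """Trim tokens back to min_ctx only once max_ctx is exceeded."""
    if len(tokens) <= max_ctx:
        return tokens          # under the cap: nothing to reprocess
    return tokens[-min_ctx:]   # overflow: cut back to the min in one step

# Simulate a growing conversation with min=6, max=8 (stand-ins for 6k/8k).
ctx = []
for tok in range(20):
    ctx = trim_context(ctx + [tok], min_ctx=6, max_ctx=8)
```

With these numbers, the context is only cut three times over twenty tokens; between cuts it simply grows, which is the point: full reprocessing happens only on overflow, not on every new message.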
That would be amazing. I think something like that could even be included into ooba's official extension repo.
I think KoboldCpp already does this, unless I'm misunderstanding; have a look at this: