Based on the 200K Context Yi 34B.
If it's based on Yi, shouldn't it have the Yi license instead of MIT?
Can't wait to see the benchmarks on these things.
Dang, after that 34B drought it's like stumbling onto the Great Lakes right now.
200K context!!
Precisely 47K tokens of context fit in 24GB of VRAM at 4bpw.
I haven't tried 3.5bpw, but I suspect it could fit much more.
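Rough math behind that, as a sketch: quantized weights plus an FP16 KV cache, using Yi-34B's published shape (60 layers, 8 KV heads, head dim 128). The 34.4B parameter count and the 1.5 GiB overhead figure are assumptions, and real loaders differ.

```python
# Back-of-the-envelope VRAM budget for context length; a sketch only.
# Assumed figures: 34.4B params, 60 layers, 8 KV heads, head dim 128
# (Yi-34B's published config), FP16 KV cache, ~1.5 GiB loader overhead.

GiB = 1024**3

params = 34.4e9                      # assumed parameter count
bpw = 4.0
weights_bytes = params * bpw / 8     # quantized weight size in bytes

n_layers, n_kv_heads, head_dim = 60, 8, 128
# K and V per layer per token, 2 bytes each at FP16
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * 2

vram = 24 * GiB
overhead = 1.5 * GiB                 # activations/buffers: a guess
free_for_cache = vram - weights_bytes - overhead

print(f"weights:  {weights_bytes / GiB:.1f} GiB")
print(f"KV cache: {kv_bytes_per_token / 1024:.0f} KiB/token")
print(f"fits:     ~{free_for_cache / kv_bytes_per_token / 1000:.0f}K tokens (FP16 cache)")
```

That comes out around 28K tokens with an FP16 cache; an 8-bit cache roughly doubles it, so a 47K figure is plausible depending on cache precision and overhead.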
I believe these are TheBloke's GGUF quants if anyone's interested: https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
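If you want to pull just one quant from that repo, a sketch with huggingface_hub; the Q4_K_M filename is an assumption based on TheBloke's usual lowercase naming, so check the repo's file list first:

```python
# A sketch for downloading a single GGUF quant; the filename is assumed.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/Nous-Capybara-34B-GGUF",
    filename="nous-capybara-34b.Q4_K_M.gguf",  # assumed filename
)
print(path)
```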
Also note this important issue that affects this and all other Yi-based models:
So we can just skip the BOS token on all these models?
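If you're on llama-cpp-python, one way to see what the BOS does is to tokenize the same text with and without it; a sketch, with the model path as a placeholder:

```python
# A sketch comparing tokenization with and without the BOS token
# using llama-cpp-python; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="nous-capybara-34b.Q4_K_M.gguf", n_ctx=4096)

text = b"Hello"
with_bos = llm.tokenize(text, add_bos=True)
without_bos = llm.tokenize(text, add_bos=False)
print("with BOS:   ", with_bos)
print("without BOS:", without_bos)
```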
I ran `gguf-py/scripts/gguf-set-metadata.py some-yi-model.gguf tokenizer.ggml.bos_token_id 144`
and it changed the outputs a lot compared to yesterday.
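For anyone wanting to verify the change took effect, a sketch that reads the field back with the gguf-py package from the llama.cpp repo (pip install gguf); the filename is a placeholder:

```python
# A sketch that reads tokenizer.ggml.bos_token_id back out of a GGUF file
# to confirm the metadata edit; uses the gguf-py package (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("some-yi-model.gguf")  # placeholder filename
field = reader.get_field("tokenizer.ggml.bos_token_id")
# Scalar fields store their value at the part indexed by the field's data offset.
print("bos_token_id:", field.parts[field.data[0]][0])
```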