Hmm, will have to check this stuff with the people on the rwkv discord server.
V5 is stable at context usage, and V6 is trying to get better at using the context, so we might see improvement on this
Hmm, will have to check this stuff with the people on the rwkv discord server.
V5 is stable at context usage, and V6 is trying to get better at using the context, so we might see improvement on this
Um The dataset is opensource, its all public HF datasets
Thats the point of rwkv, you could have a 10 mil contx len and it would be the same as 100 ctx len
Its trained on 100+ languages, the focus is multilingual
Also AWQ has entire engines for efficieny, look into aphrodite engine, supposably the fastest for awq
"Do I need to learn llama.cpp or C++ to deploy models using llama-cpp-python library?" No its pure python
OpenHermes 2.5 is amazing from what I've seen. it can call functions, summarize text, is extremely competitive, all the works
There are plenty of datasets, Just take the ones meant for stable diff training, rip out the prompt text, profit
Heres some high quality captions used for dalle3, etc:
https://huggingface.co/datasets/laion/dalle-3-dataset https://huggingface.co/datasets/laion/gpt4v-dataset https://huggingface.co/datasets/laion/wuerstchen-dataset https://huggingface.co/datasets/laion/220k-GPT4Vision-captions-from-LIVIS https://huggingface.co/datasets/laion/gpt4v-emotion-dataset
RWKV v5 7b, its only half trained rn, but the model surpasses Mistral on all multilingual benchmarks, cause the is meant to be multilingual.
OpenHermes 2.5 is the latest version, but the openHermes series has a history in ai models of being good, and I used it for some function calling, its really good
No its Victorian era frankenstein obvs