What do these test results mean for the LLMs? There are a lot of numbers, and in most cases Qwen beats GPT-4, but in some it's worse, sometimes much worse.
Secret_Joke_2262
What everyone really wants to know now is how much better it is than 70B LLaMA.
70B, storytelling, q5_k_m
A friend told me that for a 70B model at q4, performance drops by about 10%. The larger the model, the less it suffers from weight quantization.
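The size savings behind that trade-off are easy to estimate with back-of-the-envelope arithmetic: file size is roughly parameter count times bits per weight. The bits-per-weight figures below are approximate values for llama.cpp-style k-quants, not exact numbers, and metadata overhead is ignored.

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model file size in GB: params * bits / 8, ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e9

PARAMS_70B = 70e9
# Approximate effective bits per weight (assumed, not measured):
for name, bpw in [("fp16", 16.0), ("q5_k_m", 5.7), ("q4_k_m", 4.9)]:
    print(f"{name}: ~{quantized_size_gb(PARAMS_70B, bpw):.0f} GB")
```

So dropping from q5_k_m to q4_k_m saves only a handful of gigabytes on a 70B model, which is why people accept a small quality hit to fit the model at all.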
120 thousand rubles.
I made a mistake when building my PC: for some reason I focused on the processor, and the video card is fairly weak. After a while, though, I realized it was for the better. I can run the 70B model at 1 token per second. Maybe in the future I'll buy another video card so I can offload more layers and speed up processing.
RTX 3060 12 GB & 13600K
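A rough sketch of why only part of the model fits on that card, assuming round numbers throughout: ~43 GB for a 70B model at a 4-bit quant, 80 transformer layers (the LLaMA-2 70B layer count), and a guessed ~2 GB of VRAM reserved for context cache and CUDA buffers. None of these figures come from the thread; they are illustrative assumptions.

```python
MODEL_GB = 43.0      # ~70B model at a 4-bit quant (assumed)
N_LAYERS = 80        # LLaMA-2 70B layer count
VRAM_GB = 12.0       # RTX 3060
OVERHEAD_GB = 2.0    # context cache + CUDA buffers (guess)

per_layer_gb = MODEL_GB / N_LAYERS
offloadable = int((VRAM_GB - OVERHEAD_GB) / per_layer_gb)
print(f"~{offloadable} of {N_LAYERS} layers fit in VRAM")
```

Under these assumptions only around a quarter of the layers fit on the GPU, with the rest running on the CPU, which is consistent with speeds of about 1 token per second.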
What benchmarks did you test this on?
I'm very interested in storytelling and RP