Really nice! I had a dream that we'd find a way to iterate over base models so every finetune gets closer to SOTA :D
kpodkanowicz
Great work as always! Regarding EXL2: it's sensitive to the calibration dataset, and the one used here probably isn't related to your tests. For example, you can get higher HumanEval scores even at 3 bits than you would get with transformers in 8-bit. I hope this standard gets more popular and finetuners start producing their own measurement file/quants using their own dataset. I've never seen a Q2 GGUF do better than EXL2 unless I mixed up the RoPE config.
Edit: for anything higher than 4.25 bits I usually use an 8-bit head.
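For anyone who wants to try this, a minimal sketch of the "use your own dataset for calibration" step: exporting a finetune's training samples to a parquet file that can then be fed to the EXL2 conversion script as its calibration set. The column name and file layout here are assumptions for illustration, not ExLlamaV2's documented format.

```python
# Sketch: build a calibration parquet from the finetune's own training data.
# Column name ("text") and overall layout are assumptions; check the
# ExLlamaV2 conversion docs for the exact expected format.
import pandas as pd

samples = [
    "def quicksort(arr):\n    ...",            # include code-heavy rows if HumanEval matters to you
    "Explain the difference between a list and a tuple in Python.",
]

pd.DataFrame({"text": samples}).to_parquet("calibration.parquet")
# The resulting file would then be passed to the ExLlamaV2 convert script as
# the calibration dataset when producing the measurement file and quants.
```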
Amazing!!! I bet this approach (and optionally LoRA routers) will be our only shot to beat GPT-4 and beyond.
Hmm, one of the really interesting details here: a normal LoRA at rank 8 tested better than at rank 128. Genuine question: how is that possible? Mediocre data used for the LoRA? I have done a few finetunes recently and see a similar situation between rank 128 and 256.
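For context, the rank being compared is the `r` parameter of a PEFT `LoraConfig`; a minimal sketch of the two configurations (target modules shown are typical for Llama-style models and are an assumption here, not taken from the post):

```python
# Sketch: the only intended difference between the two runs is the rank `r`
# (with lora_alpha usually scaled alongside it).
from peft import LoraConfig

lora_r8 = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

lora_r128 = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
# Higher rank means more trainable parameters, so on a small or noisy dataset
# the rank-128 adapter has more room to overfit, which is one common
# explanation for rank 8 evaluating better than rank 128 on the same data.
```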