this post was submitted on 17 Nov 2023

LocalLLaMA

We propose Tied-LoRA, a simple paradigm that utilizes weight tying and selective training to further increase the parameter efficiency of the Low-rank adaptation (LoRA) method. Our investigations include all feasible combinations of parameter training/freezing in conjunction with weight tying to identify the optimal balance between performance and the number of trainable parameters. Through experiments covering a variety of tasks and two base language models, we provide analysis revealing trade-offs between efficiency and performance. Our experiments uncovered a particular Tied-LoRA configuration that stands out by demonstrating comparable performance across several tasks while employing only 13% of the parameters utilized by the standard LoRA method.
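
To get a feel for why weight tying cuts the trainable parameter count, here is a rough back-of-the-envelope sketch. The abstract does not spell out the exact formulation, so this assumes one possible scheme: a single pair of low-rank matrices shared across all layers, plus small per-layer scaling vectors. The function names, the scaling-vector layout, and the example dimensions are all hypothetical, and the numbers are not meant to reproduce the 13% figure.

```python
def lora_params(d_in, d_out, rank, n_layers):
    """Standard LoRA: each layer trains its own A (rank x d_in) and B (d_out x rank)."""
    return n_layers * rank * (d_in + d_out)


def tied_lora_params(d_in, d_out, rank, n_layers):
    """Assumed tied variant: one A and one B shared by every layer,
    plus per-layer scaling vectors of size rank and d_out.
    This layout is an illustration, not taken from the abstract."""
    shared = rank * (d_in + d_out)          # the tied low-rank pair, counted once
    per_layer = n_layers * (rank + d_out)   # cheap per-layer scaling vectors
    return shared + per_layer


# Hypothetical model shape: 32 layers, 4096 x 4096 projections, rank 8.
standard = lora_params(4096, 4096, 8, 32)
tied = tied_lora_params(4096, 4096, 8, 32)
print(f"standard LoRA: {standard} trainable parameters")
print(f"tied sketch:   {tied} trainable parameters ({tied / standard:.1%} of standard)")
```

The point of the sketch is only that the shared matrices are paid for once, so the per-layer cost drops from `rank * (d_in + d_out)` to a couple of vectors. Freezing some of those pieces (the "selective training" part) would shrink the count further.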

[–] kpodkanowicz@alien.top 1 points 1 year ago (1 children)

Hmm, one of the really interesting details here: normal LoRA at rank 8 tested better than at rank 128. Genuine question: how is that possible? Mediocre data used for the LoRA? I have done a few finetunes recently and see a similar situation between rank 128 and 256.

[–] WitchSayo@alien.top 1 points 1 year ago

There are tests in the original LoRA paper where the boost is very small once the rank is greater than 8.

https://preview.redd.it/ii53qcx8031c1.png?width=1080&format=png&auto=webp&s=821bac1232255bf791120afde7d9e9f3506a89f5
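
For context on what is being traded off in the rank discussion above, here is a quick sketch of how the size of a single LoRA adapter grows with rank. It assumes a 4096 x 4096 weight matrix, which is a made-up but typical shape, and the helper name is hypothetical. It says nothing about quality, only about how many parameters each rank has to fit to the data.

```python
def lora_adapter_params(d_in, d_out, rank):
    """Trainable parameters for one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Assumed projection shape; swap in the real dimensions of your model.
d_in, d_out = 4096, 4096
for rank in (8, 128, 256):
    print(f"rank {rank}: {lora_adapter_params(d_in, d_out, rank)} parameters per adapted matrix")
```

The count grows linearly with rank, so rank 128 trains 16 times as many adapter parameters as rank 8 on the same data. If the extra capacity barely helps, as the LoRA paper's table suggests, a higher rank is not guaranteed to score better.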