CKtalon

joined 10 months ago
[–] CKtalon@alien.top 1 points 9 months ago

Although memory bandwidth is the most important factor for inference, FLOPS still matter. APUs are just too slow, so the bottleneck would shift to computing all those matrix operations (assuming the APU is even paired with high memory bandwidth like Apple's chips, which I doubt it will be).
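A quick back-of-envelope roofline sketch of that trade-off. All hardware numbers below are made up for illustration, not measurements of any real GPU or APU:

```python
# Rough roofline estimate, assuming a 7B-parameter model in fp16 (~14 GB of weights).
PARAMS = 7e9          # model parameters (illustrative)
BYTES_PER_PARAM = 2   # fp16

def decode_tok_per_s(bandwidth_gbs: float) -> float:
    """Decoding is roughly bandwidth-bound: each new token streams all weights once."""
    return bandwidth_gbs * 1e9 / (PARAMS * BYTES_PER_PARAM)

def prefill_tok_per_s(tflops: float) -> float:
    """Prompt processing is roughly compute-bound: ~2 FLOPs per parameter per token."""
    return tflops * 1e12 / (2 * PARAMS)

# Placeholder numbers: a discrete GPU vs. an APU with fast unified memory.
for name, bw, tf in [("discrete GPU", 900, 150), ("APU w/ fast memory", 400, 10)]:
    print(f"{name}: decode ~{decode_tok_per_s(bw):.0f} tok/s, "
          f"prefill ~{prefill_tok_per_s(tf):.0f} tok/s")
```

Even with generous bandwidth, the APU's low FLOPS make prompt processing (prefill) an order of magnitude slower, which is where the bottleneck shows up.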

[–] CKtalon@alien.top 1 points 10 months ago (1 children)

You are probably talking about fine-tuning rather than (pre)training a model. There are models trained for coding, like CodeLlama and all its variants. You could probably train on the library's code, but I doubt you'd get much out of it. Perhaps the best way is to create some instruction data based on the library (either manually or synthetically) and fine-tune on that.
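For example, a minimal sketch of turning a library's public functions into instruction pairs, assuming a hypothetical package `mylib`. You'd still want to review or rewrite the outputs (ideally with a stronger LLM) before fine-tuning on them:

```python
# Build a JSON-lines instruction dataset from a library's docstrings and source.
import inspect
import json

import mylib  # hypothetical library you want the model to learn

records = []
for name, func in inspect.getmembers(mylib, inspect.isfunction):
    doc = inspect.getdoc(func)
    if not doc or name.startswith("_"):
        continue
    records.append({
        "instruction": f"Show how to use `mylib.{name}` and explain what it does.",
        "output": f"{doc}\n\nSource:\n{inspect.getsource(func)}",
    })

with open("mylib_instructions.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```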

[–] CKtalon@alien.top 1 points 10 months ago (3 children)

No one has figured out the plateau yet, since more data = longer training = more expensive. Currently it seems you can keep improving by training on more data. Companies are pretty much training on 'all of the internet' to get that LLM 'cleverness', not just Shakespeare.

As for deciding the size of the model, there is the Chinchilla scaling law, which gives the compute-optimal trade-off between model size and training tokens for a given compute budget, e.g. 2T tokens on a 7B vs 0.5T tokens on a 13B, with the former being better (made-up numbers). There is also the cost of serving the model to weigh against the training cost and the accuracy required.
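A minimal sketch of that rule of thumb, assuming the common approximations of ~6·N·D training FLOPs and ~20 tokens per parameter at the compute-optimal point (rough constants, not exact fits from the paper):

```python
# Chinchilla-style back-of-envelope: what model size / token count does a budget buy?

def training_flops(params: float, tokens: float) -> float:
    """Approximate training cost: C ~ 6 * N * D FLOPs."""
    return 6 * params * tokens

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Return (optimal params N, optimal tokens D) for a compute budget.

    With C = 6 * N * D and D = k * N, we get N = sqrt(C / (6 * k)).
    """
    n = (compute_flops / (6 * tokens_per_param)) ** 0.5
    return n, tokens_per_param * n

# The made-up options from the comment, compared on raw training compute:
print(f"7B on 2T tokens:    {training_flops(7e9, 2e12):.2e} FLOPs")
print(f"13B on 0.5T tokens: {training_flops(13e9, 0.5e12):.2e} FLOPs")

# Compute-optimal split for an example budget of 1e23 FLOPs:
n_opt, d_opt = chinchilla_optimal(1e23)
print(f"~{n_opt / 1e9:.0f}B params trained on ~{d_opt / 1e12:.2f}T tokens")
```

In practice, serving costs push many teams to train smaller models on far more tokens than the compute-optimal point, which is the trade-off the last sentence is pointing at.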

[–] CKtalon@alien.top 1 points 10 months ago (2 children)

Does your motherboard even have 8 PCIe slots? You'll need a server board. Even workstation boards typically have only 7 (and can't physically fit that many cards because of their size).

[–] CKtalon@alien.top 1 points 10 months ago

Once you run one of the significance tests, you'll start seeing that the improvement from adding parameters often isn't statistically significant relative to the increase in parameter count, but we're at a point where squeezing out accuracy matters more than that kind of efficiency.
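For example, a minimal sketch of an exact McNemar-style test on paired benchmark results, using made-up per-example correctness for two hypothetical models:

```python
# Exact McNemar-style test: only discordant pairs (one model right, the other wrong)
# carry information about which model is better on the same benchmark.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)
n_examples = 1000
model_a_correct = rng.random(n_examples) < 0.71   # e.g. a larger model, ~71% accuracy
model_b_correct = rng.random(n_examples) < 0.69   # e.g. a smaller model, ~69% accuracy

a_only = int(np.sum(model_a_correct & ~model_b_correct))
b_only = int(np.sum(~model_a_correct & model_b_correct))

result = binomtest(a_only, a_only + b_only, p=0.5)
print(f"A-only wins: {a_only}, B-only wins: {b_only}, p-value: {result.pvalue:.3f}")
```

With a couple of points of accuracy difference on a 1k-example benchmark, the p-value is usually nowhere near significant, which is the point above.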