hoppyJonas

joined 1 year ago

[R] ConvNets Match Vision Transformers at Scale in c/machinelearning@academy.garden

hoppyJonas@alien.top 1 points 11 months ago

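It's probably both. In the Chinchilla paper, the authors showed that for compute-optimal training, model size and training dataset size should be scaled in equal proportion: every doubling of the parameter count calls for a doubling of the number of training tokens.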
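
To make that concrete, here is a rough sketch of how a compute budget splits between parameters and tokens when the two scale together. It rests on two assumptions that are not in the comment itself: the common approximation that training cost is about 6 FLOPs per parameter per token, and the often-quoted rule of thumb of roughly 20 training tokens per parameter. The function name and constants are made up for illustration.

```python
import math

# Assumption: training compute C is approximately 6 * N * D FLOPs,
# where N is the parameter count and D is the number of training tokens.
FLOPS_PER_PARAM_TOKEN = 6.0

# Assumption: rule-of-thumb ratio of about 20 training tokens per parameter.
TOKENS_PER_PARAM = 20.0


def compute_optimal_split(compute_flops):
    """Return (n_params, n_tokens) for a compute budget in FLOPs.

    Solves C = 6 * N * D together with D = 20 * N, which gives
    N = sqrt(C / (6 * 20)). Both N and D therefore grow as sqrt(C).
    """
    if compute_flops <= 0:
        raise ValueError("compute_flops must be positive")
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    for budget in (1e21, 4e21, 5.88e23):
        n, d = compute_optimal_split(budget)
        print(f"C={budget:.2e} FLOPs -> N={n:.2e} params, D={d:.2e} tokens")
```

Under these assumptions, quadrupling the compute budget doubles both the model size and the token count, and a budget of about 5.88e23 FLOPs lands near 70B parameters and 1.4T tokens, which is roughly the configuration Chinchilla itself was trained with.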
