this post was submitted on 13 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
By having a separate translation module, you're making the decision for the model about which parameters should be used for translation, and which for learning about the world.
With an extremely small model (one that doesn't even have the capacity to fully learn English), this would probably be reasonable. With any larger model (100–200 million parameters and up, maybe?), it is far, far more effective to let the model decide for itself how to allocate its parameters.
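To make the trade-off concrete, here is a toy back-of-the-envelope sketch (pure Python, all layer counts and sizes are made-up assumptions, not any real model) comparing a pipelined design, where the designer fixes the split between a translation module and a "world knowledge" module up front, against a single end-to-end model with the same stack depth that learns its own allocation:

```python
# Toy illustration: parameter budgets under a designer-imposed split
# versus a single end-to-end model. All sizes are invented for the example.

def transformer_params(layers: int, d_model: int, vocab: int) -> int:
    """Rough parameter count for a toy transformer stack:
    ~12 * d_model^2 per layer (attention + MLP) plus one embedding table."""
    per_layer = 12 * d_model ** 2
    return layers * per_layer + vocab * d_model

# Pipelined: the designer decides up front that 4 layers do translation
# and 8 layers hold world knowledge. Each module needs its own embeddings.
translation_module = transformer_params(layers=4, d_model=256, vocab=32_000)
knowledge_module = transformer_params(layers=8, d_model=256, vocab=32_000)
pipelined_total = translation_module + knowledge_module

# End-to-end: the same 12 layers in one model, sharing one embedding
# table; how capacity is split between tasks is learned, not imposed.
end_to_end_total = transformer_params(layers=12, d_model=256, vocab=32_000)

print(f"pipelined:  {pipelined_total:,} params (split fixed by the designer)")
print(f"end-to-end: {end_to_end_total:,} params (split chosen by training)")
```

Even before training, the pipelined version pays for a duplicated embedding table, and the 4/8 split is frozen regardless of whether the data would have rewarded a different balance; the end-to-end model can shift that balance freely during training.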
Often, this leads to such a thorough blending of translation and world knowledge that we currently have no reliable way to determine whether a given neuron (or set of neurons) handles one task or the other. The most likely explanation, in my opinion, is that most neurons are multitask.