By having a separate translation module, you're deciding on the model's behalf which parameters should be used for translation and which for learning about the world.
With an extremely small model (one that doesn't have the capacity to even fully learn English), this would probably be reasonable. With any other size of model (100–200 million parameters and up, maybe?), it would be far, far more effective to let the model pick and choose how it allocates its parameters.
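To make the contrast concrete, here's a minimal PyTorch-style sketch of the two designs. Everything here is a hypothetical stand-in (class names, sizes, the two-layers-each split), not anything from a real system: the first model bakes the translation/knowledge split in by construction, while the second spends the same depth however gradient descent sees fit.

```python
import torch.nn as nn

# (a) Hand-partitioned: we decide up front which parameters do translation
#     and which do world knowledge. All names/sizes are illustrative.
class PartitionedModel(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Parameters reserved for translation only, by fiat.
        self.translator = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=2,
        )
        # Parameters reserved for knowledge only, by fiat.
        self.world_model = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)
        x = self.translator(x)   # translation is forced to happen here
        x = self.world_model(x)  # knowledge is forced to happen here
        return self.head(x)

# (b) Unified: same total depth, but training decides how the parameters
#     divide themselves between translation and world knowledge.
class UnifiedModel(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.body = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=4,  # no imposed split
        )
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        return self.head(self.body(self.embed(tokens)))
```

Both models have roughly the same parameter budget; the only difference is who gets to draw the boundary.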
Often, this leads to such a complete meld of translation and world knowledge that we don't currently know how to tell whether a given neuron, or set of neurons, does one task or the other. The current most likely theory (in my opinion) is that most neurons are multitask.
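For a sense of what "telling whether a neuron does one task or the other" would even look like operationally, here's a toy activation probe: compare a neuron's mean activation on translation prompts versus knowledge prompts. The activations below are random stand-ins (in practice you'd record real hidden states from a trained model); the point is the method, not the numbers.

```python
import torch

def mean_activation(acts: torch.Tensor, neuron: int) -> float:
    # acts: (num_prompts, num_neurons) hidden activations for one task
    return acts[:, neuron].mean().item()

torch.manual_seed(0)
num_neurons = 512
# Stand-ins for activations recorded on two prompt sets.
translation_acts = torch.relu(torch.randn(100, num_neurons))
knowledge_acts = torch.relu(torch.randn(100, num_neurons))

for j in range(3):
    t = mean_activation(translation_acts, j)
    k = mean_activation(knowledge_acts, j)
    # A cleanly task-specific neuron would show t >> k or k >> t;
    # the multitask story predicts most neurons land in the middle.
    print(f"neuron {j}: translation={t:.2f} knowledge={k:.2f}")
```

If most neurons come out ambiguous under probes like this, that's consistent with the multitask picture rather than a clean internal division of labor.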