LocalLLaMA · submitted 27 Nov 2023
I was going through a paper called MILAN, a pre-training method for teaching a model good visual representations, and one thing that struck me is the large number of epochs we used to train such models for (see image), even though we want the model to generalize well. So I'm curious why even base LLMs are trained with such a low epoch count.

TIA.

https://preview.redd.it/un1mdjoodx2c1.png?width=1312&format=png&auto=webp&s=2f80e328b05c3aee00a32c1e1ee8289810d8ddf0
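To make concrete what I mean by "low epoch count", here's a rough back-of-the-envelope sketch. The ImageNet-1k size is the usual published figure; the epoch and token counts are my own illustrative assumptions, not values from the MILAN paper or any specific model card:

```python
# Illustrative sketch: "effective epochs" in vision pre-training vs. LLM
# pre-training. All numbers are assumptions for illustration only.

def effective_epochs(samples_seen: int, dataset_size: int) -> float:
    """Number of full passes over the dataset implied by the training budget."""
    return samples_seen / dataset_size

# Vision-style pre-training: recipes in this family often quote hundreds
# to ~1600 epochs on ImageNet-1k (~1.28M training images).
imagenet_images = 1_281_167          # ImageNet-1k training set size
vision_epochs = 1600                 # assumed epoch count for illustration
print(effective_epochs(vision_epochs * imagenet_images, imagenet_images))  # -> 1600.0

# LLM-style pre-training: the budget is stated in tokens, and the corpus is
# so large that most tokens are seen roughly once.
dataset_tokens = 2_000_000_000_000   # assumed ~2T-token corpus
tokens_trained = 2_000_000_000_000   # assumed ~2T-token training budget
print(effective_epochs(tokens_trained, dataset_tokens))  # -> 1.0 ("one epoch")
```

So the puzzle is why the left calculation is considered fine for vision pre-training while the right one is the norm for base LLMs.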
