this post was submitted on 31 Oct 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


If I instruction-tune an LLM on a dataset where each sample is randomly generated and fitted into a set of prompt templates, so that the dataset is effectively unbounded, and I train the model for a fixed number of steps, is that worse than training on a dataset of fixed size? I'd assume it is, because the LLM will most likely never see any instruction example more than once, so it probably can't learn patterns from the data very well. I've trained a couple of models this way for thousands of steps, and they don't seem to have learned anything that generalizes to complicated test examples.
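To make the setup concrete, here's a minimal sketch of the kind of on-the-fly generator I mean. The addition task, the TEMPLATES list, and the make_example helper are stand-ins for illustration, not my actual data:

```python
import itertools
import random

# Hypothetical templates; the real task and templates aren't shown here.
TEMPLATES = [
    "Add {a} and {b}.",
    "What is the sum of {a} and {b}?",
    "Compute {a} + {b}.",
]

def make_example(rng: random.Random) -> dict:
    """Draw random parameters and fit them into a random prompt template."""
    a, b = rng.randint(0, 10_000), rng.randint(0, 10_000)
    prompt = rng.choice(TEMPLATES).format(a=a, b=b)
    return {"instruction": prompt, "output": str(a + b)}

rng = random.Random(0)
# Effectively unbounded: almost every drawn sample is unique, so the model
# rarely (if ever) sees the same instruction example twice during training.
stream = (make_example(rng) for _ in itertools.count())
print(next(stream))
```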

detailsAtEleven@alien.top:

An infinite dataset is indistinguishable from noise.

It depends on how random it actually is. Making a dataset truly random would be very difficult; in practice, a "random" template-based dataset still carries a repeating pattern even though certain parameters change randomly. So your model will learn that pattern and become repetitive.
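As a rough illustration, reusing the toy generator sketched in the post above: mask out the random fields and only a few skeletons remain, and that skeleton distribution is what the model actually fits:

```python
import re

# The surface strings are almost all unique, but replacing the random
# digits with a placeholder collapses them to a handful of skeletons:
# the repeating pattern behind the nominally huge dataset.
examples = [make_example(rng)["instruction"] for _ in range(10_000)]
skeletons = {re.sub(r"\d+", "<num>", ex) for ex in examples}

print(len(set(examples)))  # close to 10,000 distinct strings
print(len(skeletons))      # 3: one per template
```

Adding more templates helps, but the model is still bounded by the diversity of the skeletons, not by the nominal size of the dataset.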