If your dataset is small/medium, use cross-validation instead of a single held-out test set. Just make sure your folds are split by time (train on earlier data, validate on later data) to avoid data leakage.
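Something like this is what I mean (just a sketch with scikit-learn; the file names and model choice are placeholders, not anything from your setup):

```python
# Sketch: time-ordered cross-validation so each fold only trains on data
# that comes before its validation window (no leakage from the future).
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.ensemble import RandomForestClassifier

X = np.load("features.npy")  # placeholder: rows assumed already sorted by time
y = np.load("labels.npy")    # placeholder labels

scores = []
for train_idx, val_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = RandomForestClassifier().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))

print(f"mean accuracy across folds: {np.mean(scores):.3f}")
```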
Other research papers in the related work use two years of data: one year for training and another for testing.
Have you considered emailing the authors of these papers and asking them for their datasets?
You will likely want to split whatever data you have. Probably just use a standard train/test split with cross-validation, as mentioned in another comment. Also, you said in a comment that this is in an RL context, but if that were the case you'd most likely be generating the next dataset after training on what you already have, so you'd know more data is on the way. So, are you solving a Markov decision problem here, or is this just an applied form of supervised learning?
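To be concrete about the split part (again just a sketch; the names and the 80/20 ratio are arbitrary placeholders):

```python
# Sketch: hold out a test set, cross-validate on the training portion,
# and only touch the held-out 20% once at the very end.
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression

X = np.load("features.npy")  # placeholder data
y = np.load("labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=5)
print("cross-validation accuracy:", cv_scores.mean())

final_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out test accuracy:", final_model.score(X_test, y_test))
```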
So, are you solving a Markov decision problem here?
Yes. I am thinking of just using a metric that checks whether it made the optimal decision, based on the amount of value it delivers per capita.
The main flaw of my previous metric is that it had a bias towards naive algorithms because of the way it's calculated, which made the results misleading. Skipping turns is sometimes the optimal decision, but the metric scored it as bad even though in reality it isn't.
When I dug deeper into the data, it turned out the AI was destroying the naive algorithms both on this metric and on the total results we were aiming for.
Dude, okay, are you actually doing research, or is this a troll? Or do you have a history of mental illness?
When I hear "not a lot of data," I immediately think "overfitting danger."
If it's a reinforcement learning algorithm, then maybe pretrain with synthetic data that's similar to the real data, so that you already have some rough Q-values. Then do k-fold cross-validation: train on a subset, test on another, and rotate through.
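Rough sketch of the rotation (plain k-fold with placeholder data and model; the synthetic-data pretraining would happen before this and isn't shown):

```python
# Sketch: k-fold rotation -- train on k-1 folds, evaluate on the held-out
# fold, and rotate until every fold has been the test fold once.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import GradientBoostingRegressor

X = np.load("features.npy")  # placeholder data
y = np.load("targets.npy")

fold_scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = GradientBoostingRegressor().fit(X[train_idx], y[train_idx])
    fold_scores.append(model.score(X[test_idx], y[test_idx]))

print(f"mean R^2 across folds: {np.mean(fold_scores):.3f}")
```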
It has about 18 thousand samples.
Sounds like a decently sized dataset.