The book “Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow” is something of a one-stop shop. I love that book. Lots of colorful pictures. 😍. But still deep. I think the code is also online, so you don’t have to type it out.
Have you checked on Kaggle? They have a ton of datasets.
When I hear “not a lot of data”, I immediately think: overfitting danger.
If it’s a reinforcement learning algorithm, then maybe pretrain on synthetic data that’s similar to the real data, so that you already have some rough Q-values. Then do k-fold cross-validation: train on one subset, test on another, and rotate through the folds, as sketched below.
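For the cross-validation part, here is a minimal sketch with scikit-learn. The data, the RandomForestRegressor, and the R² scoring are placeholder assumptions just to show the fold rotation; swap in your own features, targets, and model.

```python
# Minimal k-fold cross-validation sketch (assumed model and synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                     # made-up small dataset
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)  # rotate through 5 folds
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print("R^2 per fold:", scores, "mean:", scores.mean())
```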
If you need extrapolation: maybe try a symbolic regressor like gplearn. It tries out different combinations of functions, from simple to complex, using a genetic algorithm, and you can restrict the set of allowed functions. I have never tried it myself, though.
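In case it helps, a rough sketch of what the gplearn call looks like (I haven’t verified it fits your problem); the synthetic data and the hyperparameter values are assumptions:

```python
# Symbolic regression sketch with gplearn (pip install gplearn).
# Data is synthetic just to illustrate the API.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] ** 2 - X[:, 1] + 0.5               # hidden "true" formula to rediscover

est = SymbolicRegressor(
    population_size=2000,
    generations=20,
    function_set=("add", "sub", "mul", "div"),  # restrict the allowed functions
    parsimony_coefficient=0.01,                 # penalize overly complex formulas
    random_state=0,
)
est.fit(X, y)
print(est._program)                             # the evolved symbolic expression
```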
Or maybe a smoothing spline. Those can also extrapolate. Maybe LSQUnivariateSpline from scipy. There you can set the knot positions yourself, which would probably allow you to get a better fit with fewer parameters (the fewer, the better it extrapolates).
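A minimal sketch of that scipy call, assuming made-up data and knot positions:

```python
# Least-squares spline with hand-picked interior knots (scipy).
# Fewer knots means fewer parameters, which usually extrapolates more sanely.
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

x = np.linspace(0.0, 10.0, 100)                  # x must be increasing
y = np.sin(x) + np.random.default_rng(0).normal(scale=0.1, size=x.size)

knots = [2.5, 5.0, 7.5]                          # interior knots, strictly inside (x[0], x[-1])
spline = LSQUnivariateSpline(x, y, knots, k=3, ext=0)  # ext=0: extrapolate beyond the data

x_new = np.linspace(-2.0, 12.0, 50)              # includes points outside the fitted range
print(spline(x_new))
```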
You should consider yourself blessed the way your life went. Quant finance is maybe the most prestigious thing you can do in the US or any other country. The job is interesting and rewarding. In a few years you will be looking DOWN on those machine learning snobs at Google. Google wants to be seen for its AI accomplishments but doesn’t even want to hire suitable specialists like you, only the cream of the crop. What a miserable outlook.
Later, once you’ve made it into a quant trader role at a hedge fund or prop firm, you will get 20-50% of the profits your algos generate, which can be millions per year. Quant trading is a job where only your performance matters, not papers that some biased reviewers didn’t like, and that’s a good thing.
Remember: the path of least resistance is always the best. Don’t swim against the stream. Bend with the wind like a willow tree and you shall stay whole.
Sounds like a decently sized dataset.