Not an exhaustive list, but here are a few suggestions:
- Get really familiar with embedding/semantic-search/RAG.
- Fine-tune a LLaMA-2 7B model using QLoRA on an A10 EC2 instance (or whatever compute you have) to do something like document classification or sentiment analysis.
- Watch this video: https://www.youtube.com/watch?v=yj-wSRJwrrc
- Read this paper: https://arxiv.org/pdf/2311.04235.pdf
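
For the first suggestion, the retrieval step behind semantic search and RAG can be sketched roughly like this. It is a toy, standard-library-only version: `embed` here is just word counts standing in for a real embedding model, and the `docs` list, the query, and the prompt format are all made up for illustration. In practice you would swap in learned embeddings and a vector store.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (an assumed stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, k=1):
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical mini corpus, only for illustration.
docs = [
    "QLoRA fine-tunes a quantized model with low-rank adapters.",
    "Peer review is done by panels of experts at journals.",
    "Retrieval augmented generation feeds retrieved documents to a model.",
]

question = "how do journals do peer review"
top = search(question, docs, k=1)

# The "augmented" part of RAG: put the retrieved text in front of the question.
prompt = "Context: " + top[0] + "\nQuestion: " + question
```

The shape is the same with real embeddings: embed the documents, embed the query, rank by similarity, then paste the top hits into the prompt.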
Because real research is supposed to be peer reviewed, and journals offer peer review by panels of experts. arXiv was supposed to circumvent that by allowing review by an open group of peers, but the cycle for new research is so short nowadays that in practice it means "review by Twitter".