I haven't tried Mistral yet, but RAG with a 7B might not give you accurate answers from the context you pass it; even larger models can struggle with accurate Q/A over documents, though there are things you can do to help with that (tighter retrieval, and prompting the model to answer only from the provided context).
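For example, here's a minimal sketch of the retrieval-then-prompt step. It uses a toy bag-of-words cosine similarity as a stand-in for real embeddings, and the chunk texts are made up for illustration; the point is ranking chunks against the question and then telling the model to stick to the context:

```python
from collections import Counter
import math

def cosine_sim(a, b):
    # Toy bag-of-words cosine similarity (stand-in for real embeddings)
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def retrieve(question, chunks, k=2):
    # Keep only the k chunks most similar to the question
    ranked = sorted(chunks, key=lambda c: cosine_sim(question, c), reverse=True)
    return ranked[:k]

def build_prompt(question, chunks):
    # Stuff the top chunks into the prompt and constrain the model to them
    context = "\n".join(retrieve(question, chunks))
    return (f"Answer ONLY from the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

chunks = [
    "The warranty period is 24 months from date of purchase.",
    "Returns must be initiated within 30 days.",
    "Our office is closed on public holidays.",
]
prompt = build_prompt("How long is the warranty period?", chunks)
print(prompt)
```

With a small model, trimming the context down to just the relevant chunks like this matters a lot more than it does with a big one.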
Why not just make API calls to GPT-3.5 Turbo instead of trying to barely run a 7B model at a snail's pace for sub-par results? It's fractions of a penny for thousands of tokens.
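To put rough numbers on the cost claim (the per-1K rates below are assumptions based on OpenAI's published gpt-3.5-turbo pricing at the time and may change, so check current pricing), plus a minimal stdlib-only call against the chat completions endpoint:

```python
import json
import os
import urllib.request

# Assumed gpt-3.5-turbo rates (USD per 1K tokens); verify against current pricing
INPUT_PER_1K = 0.0015   # prompt tokens
OUTPUT_PER_1K = 0.002   # completion tokens

def cost_usd(prompt_tokens, completion_tokens):
    # Linear per-token cost for one request
    return (prompt_tokens / 1000) * INPUT_PER_1K + \
           (completion_tokens / 1000) * OUTPUT_PER_1K

def ask_gpt35(question, context):
    # Bare-bones request to the OpenAI REST API, no client library needed
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    }
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# A 3,000-token RAG prompt with a 500-token answer:
print(f"${cost_usd(3000, 500):.4f}")  # → $0.0055
```

So even a fairly fat RAG prompt comes in around half a cent per question, which is hard to beat with local hardware running a 7B slowly.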
Probably Airoboros, either the llama2 or Mistral version; you'd have to evaluate which one handled the fine-tuning better. I suspect llama2 would.