Honestly, RL sounds like a decent approach here.
GANs can generate samples from the data distribution, but they don't give you an explicit estimate of its density.