Note that 108 is a very small set. Considering transfering a pre-learned net or other means.
But to answer your question: I've had good success with DiffAug. It's quite straightforward to implement as well. ADA is another popular one that is bit more complex.