The next generation of models is at the 110B-parameter mark and beyond. I would estimate what it takes to run a 250B model at FP8 and at FP16, then structure your purchases accordingly. Favour high-bandwidth memory.
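As a rough way to do that estimate, here is a minimal sketch that counts weight memory only: parameters times bytes per parameter (1 byte for FP8, 2 bytes for FP16). It ignores KV cache, activations and framework overhead, so treat the result as a floor. The function names, the `overhead` multiplier and the 80 GB per-device figure are my own assumptions for illustration, not anything specific to a given card.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP8": 1, "FP16": 2}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


def devices_needed(total_gb: float, per_device_gb: float, overhead: float = 1.0) -> int:
    """Minimum device count to hold total_gb, with an optional safety multiplier.

    overhead is a hypothetical fudge factor for KV cache and activations;
    1.0 means weights only.
    """
    return math.ceil(total_gb * overhead / per_device_gb)


if __name__ == "__main__":
    per_device_gb = 80  # assumed capacity per accelerator, adjust to your hardware
    for precision in ("FP8", "FP16"):
        gb = weight_memory_gb(250, precision)
        n = devices_needed(gb, per_device_gb)
        print(f"250B at {precision}: {gb:.0f} GB of weights, at least {n} x {per_device_gb} GB devices")
```

For 250B that works out to about 250 GB of weights at FP8 and 500 GB at FP16, before any headroom for context. Raising `overhead` above 1.0 gives a more conservative device count.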