Do you have more specifics about the trend from 2-4 replicas?
If 2, 3, and 4 replicas each take >20s with no improvement, then that might be an odd programming glitch that the math can't explain.
If we see a gradual trend, then maybe the math is not accounting for something at lower replica counts.
