Mostly because it's impractical; sometimes because the authors are lazy, or because the result simply isn't statistically significant.
If you train a very large NN, it's often too expensive to do it several times. And on very large validation sets you get statistically significant results pretty fast, so there isn't really a need for repeated runs. However, I agree that many minor permutations of the NN architecture are often just noise, and groups publish them for the sake of publishing.
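To illustrate the point about large validation sets, here's a rough sketch (the accuracies and set size are made-up numbers) using the binomial standard error of an accuracy estimate, sqrt(p(1-p)/n): even a half-percent gap between two models becomes statistically detectable once the validation set is large enough.

```python
import math

def accuracy_se(p, n):
    # Standard error of an accuracy estimate from n i.i.d. examples:
    # sqrt(p * (1 - p) / n)
    return math.sqrt(p * (1 - p) / n)

# Hypothetical accuracies for two model variants, evaluated once each
# on the same (assumed) 50k-example validation set
n = 50_000
p_a, p_b = 0.910, 0.915

# Standard error of the difference between two independent estimates
se_diff = math.sqrt(accuracy_se(p_a, n)**2 + accuracy_se(p_b, n)**2)
z = (p_b - p_a) / se_diff
print(f"SE of difference: {se_diff:.4f}, z-score: {z:.1f}")
```

With these numbers the z-score comes out close to 3, i.e. a 0.5% accuracy gap on a 50k validation set is already unlikely to be evaluation noise, which is why a single run on a big benchmark can be convincing. (This ignores seed-to-seed training variance, which is exactly what multiple runs would be needed to measure.)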