this post was submitted on 16 Nov 2023

Machine Learning


Often when I read ML papers, the authors compare their results against a benchmark (e.g. using RMSE, accuracy, ...) and say "our results improved with our new method by X%". Nobody runs a significance test to check whether the new method Y actually outperforms benchmark Z. Is there a reason why? This seems especially important to me when the results are broken down further, e.g. into an analysis of individual classes in object classification. Or am I overlooking something?

isparavanje@alien.top 1 points 1 year ago

You're right. This is likely one of the reasons why ML has a reproducibility crisis, together with other effects like data leakage. (see: https://reproducible.cs.princeton.edu/)

Sometimes, indeed, the results differ so much that the improvement is obviously statistically significant, even by eye, which is uncommon in the natural sciences. Even then, however, the researchers should state clearly that they believe this to be the case, and give some evidence for it.
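
For what it's worth, here is a rough sketch of what "some evidence" could look like for two classifiers evaluated on the same test set. It uses an exact McNemar test, which is just one reasonable choice and not something the linked page prescribes. The function name `mcnemar_exact` and the arguments `y_true`, `pred_a` and `pred_b` are made up for illustration, and the toy data is invented.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test for two models on the same examples.

    Returns a p-value for the null hypothesis that both models have the
    same error rate. Only examples where exactly one model is correct
    (the discordant pairs) carry information.
    """
    # b: model A correct, model B wrong; c: model A wrong, model B correct
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        # The models never disagree on correctness: no evidence either way.
        return 1.0
    k = min(b, c)
    # Under the null, discordant pairs split like Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data (invented): model A gets all 10 right, model B gets 2 right.
y_true = [1] * 10
pred_a = [1] * 10
pred_b = [0] * 8 + [1] * 2
print(mcnemar_exact(y_true, pred_a, pred_b))  # 0.0078125
```

This only covers paired classification results on a fixed test set. For regression metrics like RMSE, or to account for variance across training seeds, something like a paired bootstrap or repeated runs would be a better fit.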