There certainly are concerns of this kind, which can mostly be allayed by a proper training regimen. You'd have to be a bit more explicit about the training regimen you envisage before specific concerns about the contestants gaming it could be addressed. In particular: how does a contestant determine that it is on the "losing side"?

On Tue, Jun 23, 2020 at 2:59 PM Andres Valloud <ten@smallinteger.com> wrote:
Hi, suppose you're training a neural network via self-play. It looks like it's getting stronger. How do you know the versions that get promoted do not also encode, in themselves, by chance, a collaboration mechanism that helps them win?
That is, how do you know the strongest nets do not also help the winning side win when they play the losing side?
How do you know they are not implementing Thompson's compiler hack?
Andres.
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun