23 Jun
2020
12:58 p.m.
Hi, suppose you're training a neural network via self-play, and it appears to be getting stronger. How do you know that the versions that get promoted do not also encode, by chance, a collaboration mechanism that helps them win? That is, how do you know the strongest nets do not also help the winning side win when they are playing the losing side? How do you know they are not implementing something like Ken Thompson's compiler hack (from "Reflections on Trusting Trust")? Andres.