Re: [math-fun] Ken Thompson's compiler hack --- NN applicability?

23 Jun 2020

I am not "more informed", so perhaps you should discount my thoughts on the
subject.

I also don't yet understand the cheat or exploit that has you worried.
LCZero trains against itself, and that training is complete (and the
network frozen, no?) by the time it faces Stockfish. If I understand you
correctly, you are worried that LCZero is responding to a change in
Stockfish's mood, interpreting the change as a sign that Stockfish has
noticed that it is losing. LCZero then changes strategy itself to take
advantage of Stockfish's inferred weakness.

(a) What exactly is wrong with this? A strong human chess player constantly
strives to interpret their opponent's mood, both by watching the action on
the board and by watching the opponent's face and mannerisms. If Alice sees
Bob flinch, and infers that she has an advantage she had not noticed up to
that point, she is likely to press, to look harder for weaknesses. Shame on
Bob for not having a better poker-face, and likewise, shame on Stockfish
for betraying its judgement so transparently in its play.

(b) How could LCZero conceivably learn this trick? It never sees Stockfish
play until its training is over and its network locked down. If part of the
training regimen were to play millions of games against Stockfish, I would
understand the problem (but again, shame on Stockfish for playing
transparently like that).

(c) There is no "trusted compiler" in which to install the feared
Thompsonian backdoor, unless you mean the LCZero "engine" (the part that
serves as a fixed interpreter for the variable neural net). In what way
could such a cheat work? I'm not seeing enough similar pieces to justify
the analogy with Thompson's exploit. (Thompson equipped his gimmicked
compiler to specially detect when it was compiling two different programs,
the login handler and the compiler itself. On all other programs it behaved
as advertised.)
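
For readers who have not seen the exploit, the structure described in the parenthetical can be sketched as a toy model. Nothing below comes from Thompson's actual code: "compiling" is reduced to string manipulation, and the marker strings and function names are hypothetical, chosen only to show the two special cases and the honest behavior everywhere else.

```python
# Toy model of the structure of Thompson's gimmicked compiler.
# All names and markers are hypothetical; "compiling" is just string handling.

LOGIN_MARKER = "def check_password"      # assumed fingerprint of the login handler
COMPILER_MARKER = "def compile_source"   # assumed fingerprint of the compiler itself
BACKDOOR = "  # [backdoor] also accept the master password\n"
SELF_INSERT = "  # [self-insert] re-add both special cases\n"


def honest_compile(source: str) -> str:
    """Stand-in for a real compiler: the 'binary' is the source plus a tag."""
    return "BINARY:\n" + source


def gimmicked_compile(source: str) -> str:
    """Behaves like honest_compile except on the two recognized programs."""
    if LOGIN_MARKER in source:
        # Special case 1: the login handler gets a backdoor.
        source = source + BACKDOOR
    elif COMPILER_MARKER in source:
        # Special case 2: a clean compiler source gets the trick re-inserted,
        # so the backdoor survives recompiling the compiler from clean source.
        source = source + SELF_INSERT
    return honest_compile(source)
```

The point of the sketch is that both special cases need a fixed, trusted tool that recognizes specific inputs. For the analogy to hold, one would have to say what plays the role of each marker in the self-play setting.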

On Tue, Jun 23, 2020 at 3:22 PM Andres Valloud <ten@smallinteger.com> wrote:
...
Specifically, this is in regard to Leela Chess Zero.  If you look at
many games played against Stockfish, especially those of a while ago,
you get the impression that this happens:
1.  A locked position develops, both sides make no progress.  Both
evaluations are mildly in favor of lc0.
2.  Lc0 starts shuffling, i.e. making moves that do not improve its
position (and also do not make it worse).  Stockfish does the same.
3.  But, eventually, Stockfish's evaluation shows a big advantage for
lc0, that lc0 does not yet see.  Stockfish reacts.
4.  Lc0 then perceives the reaction and plays into the new weakness,
eventually wins.
So, how do you know the lc0 nets that get promoted as strong are not
encoding, in themselves, the hints their own winning sides need to
survive the training regime?  That is, is the self play training process
optimizing for wins achieved unilaterally, or collaboratively?
Of course, this example is very specific to lc0, in that I first thought
this was possible while watching those particular games.  However, I am
interested in the general concept.
An issue I see is that, without an audit trail, finding whether an NN
did this on its own is going to be very difficult.  I'd imagine a way to
ameliorate this problem is diversity (as in biological diversity, and
for the same reasons).
I'm interested to hear more informed opinions on this.  Are there any?
On 6/23/20 12:13, Allan Wechsler wrote:
...
There certainly are concerns of this kind, which can mostly be allayed by a
proper training regimen. You'd have to be a bit more explicit about the
training regimen you envisage, before specific concerns about the
contestants gaming your regimen could be addressed. In particular: how does
a contestant determine that it is on the "losing side"?
On Tue, Jun 23, 2020 at 2:59 PM Andres Valloud <ten@smallinteger.com> wrote:
...
Hi, suppose you're training a neural network via self play.  It looks
like it's getting stronger.  How do you know the versions that get
promoted do not also encode, in themselves, by chance, a collaboration
mechanism that helps them win?
That is, how do you know the strongest nets do not also help the winning
side win when they play the losing side?
How do you know they are not implementing Thompson's compiler hack?
Andres.
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun