New subject: [math-fun] Ken Thompson's compiler hack --- NN applicability?

24 Jun 2020

Before going deep into this... What do you mean by
     "looks like it's getting stronger,"
and what do you mean by
     "helps [them] win?"
Helps *which of the two* identical players to win?
Win against whom?
How could they be changing things that outsiders
couldn't see?  What could they learn to do except
continue to seem to be playing the game?

I can imagine, say with chess or go, the NN learning
to play a fragile but impressive-looking version of
the game.  Neither side realizes they are making bad
moves, they just have a sort of superstitious view
of the game that the other side keeps reinforcing.
And the game is confusing enough that
human observers can't tell it's crazy.

But I can't see how the NN could tell whether it was
confusing outside observers, and if it can't tell
whether it's succeeding, I don't see how a
collaboration strategy could be evolutionarily
maintained.  The longer time went on, the more likely
the fragile game would be noticed.  Or the player
itself would notice and suddenly go back to
learning the basics.
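
To make "win against whom?" concrete, here is a minimal sketch of
a promotion check that also scores a candidate against frozen
outside opponents. It is a toy model, not how any real self-play
system is known to work: Player, kin_bonus, expected_score and
promote are all made-up names, and the kin bonus is just an
assumed stand-in for a collaboration mechanism that only helps
against players from the same lineage.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Player:
    name: str
    skill: float              # strength that counts against any opponent
    kin_bonus: float = 0.0    # hypothetical edge that only works against kin
    lineage: str = "selfplay"


def expected_score(a: Player, b: Player) -> float:
    """Elo-style expected score of a against b (toy model)."""
    edge = a.skill - b.skill
    if a.lineage == b.lineage:
        # Assumed collusion effect: it only shows up inside the lineage.
        edge += a.kin_bonus - b.kin_bonus
    return 1.0 / (1.0 + 10 ** (-edge / 400.0))


def promote(candidate, champion, outsiders, threshold=0.55):
    """Promote only if the candidate beats the champion AND does
    better than the champion against frozen outside opponents."""
    beats_champion = expected_score(candidate, champion) >= threshold
    cand_outside = mean(expected_score(candidate, o) for o in outsiders)
    champ_outside = mean(expected_score(champion, o) for o in outsiders)
    return beats_champion and cand_outside > champ_outside


champion = Player("champion", skill=1500)
colluder = Player("colluder", skill=1450, kin_bonus=200)
honest = Player("honest", skill=1600)
outsiders = [Player("frozen", skill=1500, lineage="outside")]

# The colluder looks stronger in self-play but is weaker outside it.
print(expected_score(colluder, champion) >= 0.55)
print(promote(colluder, champion, outsiders))
print(promote(honest, champion, outsiders))
```

In this toy the colluder passes the self-play test and fails the
outside one, which is the sense in which a fragile game would
eventually get noticed by anyone who checks from outside.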

Thompson's hack works because everyone trusts the
compiler to referee its own development process.
Then Thompson hacks the referee.
After that, the hack part never learns.
So you would need a situation where people have
put an NN in charge of running its
own training process.  Even then it's harder
than Thompson's hack, but I'm going to stop
being evil now.
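
For anyone who hasn't seen the hack spelled out, here is a toy
sketch of its shape. It is not Thompson's code: "source" and
"binary" are just strings, and LOGIN_SOURCE, COMPILER_SOURCE,
BACKDOOR, HACK_TAG and the function names are all invented for
illustration. The point it shows is that the compromised binary
recognizes two fixed inputs and never adapts after that.

```python
# Toy model only: sources and binaries are plain strings, and every
# name here is hypothetical.

LOGIN_SOURCE = "check(password): return password == stored"
COMPILER_SOURCE = "compile(source): return translate(source)"  # clean source
BACKDOOR = " or password == 'letmein'"
HACK_TAG = "HACKED:"


def clean_compile(source: str) -> str:
    """Honest compiler: the binary faithfully mirrors the source."""
    return source


def hacked_compile(source: str) -> str:
    """Compromised compiler binary that miscompiles two specific inputs."""
    if source == LOGIN_SOURCE:
        # Case 1: plant a backdoor that appears nowhere in the login source.
        return source + BACKDOOR
    if source == COMPILER_SOURCE:
        # Case 2: when compiling the clean compiler source, emit a binary
        # that carries the hack forward into the next generation.
        return HACK_TAG + source
    return source


def run_compiler_binary(binary: str):
    """Map a compiler 'binary' to the behaviour it would have when run."""
    return hacked_compile if binary.startswith(HACK_TAG) else clean_compile


# Bootstrap: rebuild the compiler from clean source, several times over.
binary = hacked_compile(COMPILER_SOURCE)
for _ in range(3):
    binary = run_compiler_binary(binary)(COMPILER_SOURCE)

login_binary = run_compiler_binary(binary)(LOGIN_SOURCE)
print(binary.startswith(HACK_TAG))  # the hack survived every rebuild
print(BACKDOOR in login_binary)     # and the login binary is backdoored
```

Nothing in either source string mentions the backdoor, yet every
rebuild reproduces it, because the referee doing the rebuilding
is the thing that was hacked.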

There is a C compiler whose compiled code has been
automatically proven to match the semantics
of its source code.  I don't know whether the
prover was compiled with that very same C
compiler.

  --Steve
...
From: Andres Valloud <ten@smallinteger.com>
Date: 6/23/20, 2:58 PM
Hi, suppose you're training a neural network via self play.  
It looks like it's getting stronger.  How do you know the 
versions that get promoted do not also encode, in themselves, 
by chance, a collaboration mechanism that helps them win?
That is, how do you know the strongest nets do not also help 
the winning side win when they play the losing side?
How do you know they are not implementing Thompson's compiler hack?

Steve Witham