Re: [math-fun] Deep learning vs deep positions

26 Jul 2018

      Bill,

I sat in on a presentation today by one of the creators of Minigo, a deep
learning go program that is based on the AlphaGo Zero paper.

As a logo for Minigo, the creators chose a picture of a smiling robot
happily falling off a tall ladder.  The reason for this choice is that
while Minigo can play strong games of go, it occasionally gets suckered
into chasing a long ladder that results in a loss.

For those who don't know, a ladder in the game of go is a series of moves
that repeat and propagate across the board, as each side tries to outrun
the other:  https://en.wikipedia.org/wiki/Ladder_(Go)

To correctly predict the outcome of continuing a ladder, each player must
look many moves ahead, which makes it a "deep position" similar to the chess
positions you mentioned. Although deep, ladders are simple linear sequences
with almost no branching, so humans quickly learn how to predict the outcome.
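
To illustrate how little search a ladder needs, here is a minimal sketch of
a ladder reader. It is my own simplification, not anything from Minigo or
the AlphaGo Zero paper: the board is a dict from (row, col) to "B" or "W",
there is no ko handling, and the fleeing group is assumed never to escape by
capturing a chasing stone. The names neighbors, liberties and
ladder_captures are hypothetical.

```python
def neighbors(p, size):
    """Orthogonally adjacent points that lie on the board."""
    r, c = p
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < size and 0 <= c + dc < size]

def liberties(board, p, size):
    """Flood-fill the group containing p and return its set of liberties."""
    color, seen, libs, stack = board[p], {p}, set(), [p]
    while stack:
        for n in neighbors(stack.pop(), size):
            if n not in board:
                libs.add(n)
            elif board[n] == color and n not in seen:
                seen.add(n)
                stack.append(n)
    return libs

def ladder_captures(board, prey, size):
    """True if the prey group (in atari, prey to move) cannot escape.

    Simplifying assumptions: no ko, and the prey only ever extends;
    it never captures a chasing stone.
    """
    libs = liberties(board, prey, size)
    if len(libs) != 1:
        return len(libs) == 0        # 0: already dead; 2 or more: not in atari
    color = board[prey]
    hunter = "B" if color == "W" else "W"
    board = {**board, next(iter(libs)): color}   # forced extension
    libs = liberties(board, prey, size)
    if len(libs) <= 1:
        return True                  # still in atari: captured next move
    if len(libs) >= 3:
        return False                 # too many liberties: the prey is out
    for move in libs:                # exactly two liberties: try each atari
        trial = {**board, move: hunter}
        safe = len(liberties(trial, move, size)) >= 2   # skip self-atari
        if safe and ladder_captures(trial, prey, size):
            return True
    return False

# A white stone in atari at (2, 1) on a 7x7 board, with and without a
# white "ladder breaker" sitting on the diagonal at (4, 4).
board = {p: "B" for p in [(1, 1), (1, 2), (2, 0), (3, 1)]}
board[(2, 1)] = "W"
print(ladder_captures(board, (2, 1), 7))
print(ladder_captures({**board, (4, 4): "W"}, (2, 1), 7))
```

On the otherwise empty board the chase runs to the edge and the reader
reports a capture; with the extra white stone at (4, 4) the group reaches
three liberties and escapes. The branching factor is at most two per step
and the wrong atari fails immediately, which is why the read is easy for a
human even when it is many moves deep.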

After seeing the presentation, I read the original AlphaGo Zero paper and
found this (*emphasis* mine):

"AlphaGo Zero rapidly progressed from entirely random moves towards a
sophisticated understanding of Go concepts including fuseki (opening),
tesuji (tactics), life-and-death, ko (repeated board situations), yose
(endgame), capturing races, sente (initiative), shape, influence and
territory, all discovered from first principles. *Surprisingly, shicho
(“ladder” capture sequences that may span the whole board) – one of the
first elements of Go knowledge learned by humans – were only understood by
AlphaGo Zero much later in training.*"

This supports the argument that the AlphaGo Zero algorithm is weak at deep
positions, particularly when they haven't been thoroughly explored by the
algorithm previously.

---Gary

On Mon, Jul 16, 2018 at 10:53 AM Gary Snethen <gsnethen@gmail.com> wrote:
...
I think Alpha Zero would fare poorly in these situations unless it trained
on them directly.
The current deep learning algorithms are good at emulating intuition and
perception, but aren't good at deep analysis and executive planning, which
is what's required for board positions like these.
---Gary
On Mon, Jul 16, 2018 at 2:41 AM Bill Gosper <billgosper@gmail.com> wrote:
...
Does anybody know how Alpha Zero would fare on RKG's endgame composition in
https://en.wikipedia.org/wiki/Richard_K._Guy ?  Put another way, how would
RKG's intended solution fare against Alpha Zero?
Similarly, can Alpha Zero win those computer-generated "hundred move rule"
positions?
https://www.chess.com/forum/view/fun-with-chess/longest-mate-official---mate...
--rwg