[math-fun] Deep learning vs deep positions
Does anybody know how Alpha Zero would fare on RKG's endgame composition in https://en.wikipedia.org/wiki/Richard_K._Guy ? Put another way, how would RKG's intended solution fare against Alpha Zero? Similarly, can Alpha Zero win those computer-generated "hundred move rule" positions? https://www.chess.com/forum/view/fun-with-chess/longest-mate-official---mate-in-545

--rwg
I think Alpha Zero would fare poorly in these situations unless it trained on them directly. The current deep learning algorithms are good at emulating intuition and perception, but they aren't good at the deep analysis and executive planning that board positions like these require.

---Gary
Bill,

I sat in on a presentation today by one of the creators of Minigo, a deep-learning go program based on the AlphaGo Zero paper.

As a logo for Minigo, the creators chose a picture of a smiling robot happily falling off a tall ladder. The reason for this choice is that while Minigo can play strong games of go, it occasionally gets suckered into chasing a long ladder that results in a loss.

For those who don't know, a ladder in the game of go is a series of moves that repeat and propagate across the board, as each side tries to outrun the other: https://en.wikipedia.org/wiki/Ladder_(Go)

To correctly predict the outcome of continuing a ladder, each player must look many moves ahead, so it's a "deep position" similar to the chess positions you mentioned. Although deep, ladders are simple linear sequences with no branching, and thus humans quickly learn how to predict the outcome.

After seeing the presentation, I read the original AlphaGo Zero paper and found this (emphasis mine):

"AlphaGo Zero rapidly progressed from entirely random moves towards a sophisticated understanding of Go concepts including fuseki (opening), tesuji (tactics), life-and-death, ko (repeated board situations), yose (endgame), capturing races, sente (initiative), shape, influence and territory, all discovered from first principles. *Surprisingly, shicho (“ladder” capture sequences that may span the whole board) – one of the first elements of Go knowledge learned by humans – were only understood by AlphaGo Zero much later in training.*"

This supports the argument that the AlphaGo Zero algorithm is weak at deep positions, particularly ones it hasn't thoroughly explored before.

---Gary
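P.S. To make the "deep but branch-free" point concrete, here is a toy Python sketch. The board model is a deliberate oversimplification of mine (a real ladder reader has to track liberties and captures), but it shows the essential shape: every move in the sequence is forced, so reading a ladder out is one linear walk, not a tree search.

    # Toy model: the fleeing stone zig-zags toward a corner, one forced
    # step at a time. It escapes if the forced path reaches a friendly
    # "ladder breaker" stone; otherwise it is captured at the edge.
    def read_ladder(size, runner, direction, breakers):
        x, y = runner
        dx, dy = direction          # e.g. (1, 1) = toward lower-right
        depth = 0
        step_x = True               # the forced zig-zag alternates axes
        while 0 <= x < size and 0 <= y < size:
            if (x, y) in breakers:
                return "escapes", depth
            if step_x:
                x += dx
            else:
                y += dy
            step_x = not step_x
            depth += 1
        return "captured", depth

    # On 19x19 with no breaker the forced line runs 33 moves deep before
    # the runner hits the edge; a breaker on the diagonal lets it escape.
    print(read_ladder(19, (2, 2), (1, 1), breakers=set()))      # captured
    print(read_ladder(19, (2, 2), (1, 1), breakers={(10, 10)})) # escapes

A net that never reads that far ahead has to guess the outcome from the pattern alone, which is exactly where Minigo gets suckered.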
Remember that a ladder has a big footprint not only in time but in space. AG0 did well at _local_ deep reading very early; I think what threw it about ladders was their spread across the board, not the depth of the game tree.

That having been said -- it's impossible to figure out what a neural net is thinking. So this is all just blowing smoke.
Gary, thanks. But for a couple of minutes, I was wondering what the hell was min EYE go.

There's a huge gap yet to be bridged. How can they know Alpha Go has "a sophisticated understanding of Go concepts" unless it can articulate and communicate them? It seems to me its knowledge amounts to millions of special cases, and conceivably retains a bug or two.

--Bill
Bill,

You are touching on one of the open challenges of deep learning. Deep learning can solve a wide variety of problems through artificial intuition, but deep learning models cannot yet show their work. They don't reason through problems as a series of logical steps. Because of this, we cannot yet rely on their results in high-stakes (and highly litigious) fields like medicine, wealth management, and law enforcement, even if we believe they are likely to be right more often than an expert performing the same task.

In the case of AlphaGo Zero, the algorithm is actually a combination of a deep neural network and a Monte Carlo tree search. For each actual move, the algorithm spends several seconds exploring hundreds of potential move chains, with priorities for those moves driven by the neural network. This is similar to the way a traditional computer chess program works, except that the neural network provides the intuition for which lines of play are most important to explore. In this way, it emulates the way a grandmaster chess player thinks: intuitively selecting only those avenues that have a good probability of being fruitful and exploring those avenues in some depth.

Due to its Monte Carlo tree search design, AlphaGo Zero can be made to "explain" its moves by showing the variations it explored, how it would have handled different responses, and what value it gave the board position at each step. However, it cannot explain why it considered a particular move better than another, or why it felt one board position was somewhat better than another.

---Gary
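P.S. For the curious, here is a minimal Python sketch of the search loop in the shape the AlphaGo Zero paper describes (the PUCT rule). The game interface (state.legal_moves(), state.copy(), state.play(move)) and the policy_value() stub are stand-ins of mine, not DeepMind's code: a real network returns learned move priors and a value estimate, which I fake here so the search logic stands out.

    import math
    import random

    class Node:
        def __init__(self, prior):
            self.prior = prior      # P(s,a): the network's prior for the move
            self.visits = 0         # N(s,a)
            self.value_sum = 0.0    # W(s,a)
            self.children = {}      # move -> Node

        def q(self):                # Q(s,a): mean of backed-up values
            return self.value_sum / self.visits if self.visits else 0.0

    def policy_value(state):
        """Fake network: uniform priors over legal moves, random value."""
        moves = state.legal_moves()
        if not moves:
            return {}, 0.0
        return {m: 1.0 / len(moves) for m in moves}, random.uniform(-1, 1)

    def select_child(node, c_puct=1.5):
        """PUCT rule: maximize Q + U, where U favors high-prior,
        little-visited moves -- this is where 'intuition' steers."""
        sqrt_total = math.sqrt(node.visits + 1)
        def score(item):
            child = item[1]
            u = c_puct * child.prior * sqrt_total / (1 + child.visits)
            return child.q() + u
        return max(node.children.items(), key=score)

    def search(root_state, num_simulations=400):
        root = Node(1.0)
        root.children = {m: Node(p)
                         for m, p in policy_value(root_state)[0].items()}
        for _ in range(num_simulations):
            node, state, path = root, root_state.copy(), [root]
            while node.children:                 # walk down, guided by priors
                move, node = select_child(node)
                state.play(move)
                path.append(node)
            priors, value = policy_value(state)  # expand and evaluate the leaf
            node.children = {m: Node(p) for m, p in priors.items()}
            for n in reversed(path):             # back up, flipping sides
                n.visits += 1
                n.value_sum += value
                value = -value
        # Play the most-visited root move, as AlphaGo Zero does.
        return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

The "explanation" I mentioned falls out of the same tree: after the simulations run, following the most-visited child at each level gives the principal variation, with visit counts and Q values attached to every step.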
One of the interesting things about AI is what it tells us about ourselves. The examples you give in which we don't trust DLs, because they can't provide a chain of logical inference to their conclusions, are not entirely convincing when you reflect that the "chain of logical inference" that cops or doctors or financial planners provide is often made up after the fact, has crucial intuitive choices in it, and has gaps in the logic. We tend to trust them, over some AI, in part because we, or society, can punish them if they get things wrong...because they can suffer.

Deep Learning AIs that can explain their decisions are being developed. But the explanation is generated by a kind of auxiliary neural net within the DL whose only purpose is to generate the explanations. I think this is probably close to how human brains work...the "explanation" of something intuited is a mixture of inference and confabulation. So I'm not sure how useful these DL explanations will be.

Brent
Folklore relates that historical chess world champion Alexander Alekhine was (at least on occasion) incapable of explaining the reasoning behind his own strong moves, offering instead explanations which subsequent analysis established were simply bogus.

While this anecdote might be apocryphal, my own experience convinces me that conscious attention to a nontrivial problem plays very little part in its solution, other than to direct a far more powerful parallel processor hidden in the subconscious mind to perform the actual computational heavy lifting.

Like early programmers, we can make the decision to assemble our data and submit it to the batch processor. But once that is done, the best course of (in)action is to obey the operators and get the *&^%$#@! out of the computer room. Left to its own devices, when it is good and ready, the Delphic oracle within may disgorge the solution in a console lightning-flash and a thunder of lineprinter. Or may not, as the case may be. Whatever, any notion that continuing feeble rationalisation can have an effective impact on the proceedings is mere delusion.

Fred Lunnon
Ladder positions are one way to prove that Go is PSPACE-hard - in other words, that telling whether you have a winning strategy can be at least as hard as evaluating quantified Boolean formulas with many levels of “for alls” and “there exists”. In positions like these, there is no way to avoid a lot of logical computation. On the other hand, these “logically deep” positions seem to be very rare in typical play.

A friend of mine and I considered another game at which we think humans are much better than AlphaGo-like programs: Take turns putting Go stones on the board. The color of the stones doesn't matter. No stone can be placed within a knight's move of any other. The first player with nowhere to play loses.

While playing this optimally from an arbitrary initial position is probably hard, from an empty initial board there's a classic symmetric strategy for the first player if the height and width of the board are both odd (and for the second player if either the height or width is even). This is the kind of “concept formation” that, to my mind, AlphaGo can't do.

- Cris
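P.S. Here is a small Python rendering of that first-player strategy for an odd-by-odd board. The function names and the legality model are my assumptions, but the mirroring argument is the classic one: after taking the center, answer every reply with its point reflection through the center, which is always a legal move.

    # Offsets that count as "within a knight's move".
    KNIGHT = {(1, 2), (2, 1), (1, -2), (2, -1),
              (-1, 2), (-2, 1), (-1, -2), (-2, -1)}

    def legal(move, stones):
        x, y = move
        return move not in stones and all(
            (x - sx, y - sy) not in KNIGHT for (sx, sy) in stones)

    def play_symmetric(n, replies):
        """First-player strategy on an n x n board, n odd. `replies`
        is any sequence of legal second-player moves."""
        c = n // 2
        stones = {(c, c)}                    # move 1: take the center
        for (x, y) in replies:
            assert legal((x, y), stones)
            stones.add((x, y))
            mirror = (2 * c - x, 2 * c - y)  # point reflection
            # The mirror is always legal: every stone but the center
            # sits in a mirror pair, and mirror - reply = 2*(c-x, c-y)
            # has two even components, never a knight offset.
            assert legal(mirror, stones)
            stones.add(mirror)
        return stones

    # On 5x5: the second player tries (0, 0); we answer with (4, 4).
    print(sorted(play_symmetric(5, [(0, 0)])))

So whenever the second player has a move, the first player does too, and the second player is the one who runs out.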
participants (6)
- Allan Wechsler
- Bill Gosper
- Brent Meeker
- Cris Moore
- Fred Lunnon
- Gary Snethen