Related Work

Next: Higher order statistics in Up: Introduction Previous: Game of Y

Related Work

Abramson90mc proposed the expected-outcome heuristic for game state evaluation. The value of a game state is the expected outcome of the game given random play from that on. The value can be used to evaluate leaf nodes in a game-tree that has the current state as the root node and possible moves as children of the node. The expected-outcome heuristic applies to a large number of games with or without randomness or hidden information and with one or more players. It is best applicable to games that always end after a reasonable number of moves, such as the games of Hex and Y. Expected outcome heuristic is also known as roll-out analysis or play-out analysis.

In many planning problems and games, the order of moves or actions is not always important. This applies especially in games of Hex and Y where the order does not have any role as long as the same moves are made. In Monte Carlo playing, unimportance of the order means that the all-moves-as-first heuristic bruegmann93 works well: After a game has been played out, not only the evaluation of the first move is changed, but all the subsequent moves are handled as they were the first move. Thus, in the games of Hex and Y, instead of playing the game out, one can just fill the empty cells by pieces of random colour. Sampling colours for the empty cells then corresponds to Monte Carlo playing. Those cells that were filled by the winning player, are important.

All-moves-as-first heuristic is very good in reducing the number of random samples compared to basic expected outcome, but unfortunately it limits the lookahead into just one move in some sense. Bouzy05tree proposed a hybrid between expected-outcome and all-moves-as-first heuristics. A shallow game tree is made where each leaf is evaluated using the all-moves-as-first heuristic.

Muller99decomposition proposed decomposition search with an application to Go endgames. The idea is that the game can be decomposed into a sum of subgames which are somewhat independent of each other. Because the subgames are small, the full game tree can be constructed for each of them, including local pass moves. Then, many of the moves can be pruned for instance because they are dominated by other moves in the same subgame. The global search is thus sped up a lot, allowing perfect solution of end-games of dozens of moves. A similar approach is suggested to be used heuristically for middle-game where the division into subgames is not yet strict.

Already bruegmann93 suggested an extension of the all-moves-as-first heuristic into an higher order heuristic, where not just the value of each move is evaluated but the value of making a move after some other moves have been made. Time was not ripe for higher order considerations for computational resources. If there are moves available, one would have to collect enough statistics for combinations of moves. Now, thirteen years later, it would still be a waste of resources to try to uniformly estimate all combinations.

Inductive logic programming (ILP) Muggleton94ilp involves finding patterns of increasing size in data. A pattern in this case is a set of moves (in addition to the ones that have been already made) and the data are the random games. One can estimate the value of a move given that the pattern has already been played. New patterns are created by refinement operators, in this case by adding a move to an existing pattern. By using ILP, we can make the higher-order heuristic selective, that is, we do not have to estimate all combinations of moves but only where is the number of patterns.

Next: Higher order statistics in Up: Introduction Previous: Game of Y

Tapani Raiko 2006-09-01