
Summary of the algorithm

Given a state $ s(0)$, select a move $ m(0)$ as follows:


1: Patterns $ P=\{\{\}\}$, number of play-outs $ k=0$
2: Play-out depth $ j=0$
3: Move $ m(j)=\arg \max_{m} h_h(m \mid p)+$noise
4: Make move $ m(j)$ in state $ s(j)$ to get new $ s(j+1)$
5: Increase $ j$ by one
6: If $ s(j)$ not finished, loop to 3
7: Increase $ k$ by one
8: Save moves $ m(\cdot)$ and the result of $ s(j)$ in $ c_{k}$
9: Loop to 2 for some time
10: Add new patterns to $ P$
11: Loop to 2 for some time
12: $ m(0) = \arg \min_{m_1} \max_{m_0} h_h(m_0\mid \{m_1\})$

On line 3, the pattern $ p$ must satisfy $ p \in P$ and $ p \subset \{m(0),\dots,m(j-1)\}$, and there must be no $ q \in P$ such that $ p \subset q \subset \{m(0),\dots,m(j-1)\}$. In other words, $ p$ is a maximal matching pattern: no other pattern in $ P$ that fits the moves played so far strictly contains it. On line 10, the candidate patterns are all patterns in $ P$ extended by one move. A candidate is accepted if, on line 3, the pattern it extends has selected that particular move more than some threshold number of times.
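
The pattern rules for lines 3 and 10, and the selection on line 12, can be illustrated with a short Python sketch. It is not the implementation used in this work. The function names, the `select_count` bookkeeping, and the choice to pass $ h_h$ in as a callable `h(m, p)` are all assumptions made for illustration. The sketch also assumes that the inclusion of $ p$ in the moves played so far is non-strict (so that the empty pattern matches at depth $ j=0$), while $ p \subset q$ is strict. Line 12 is read here as choosing the move $ m_1$ whose best follow-up $ m_0$ under the one-move pattern $ \{m_1\}$ scores lowest.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, List, Sequence, Set, Tuple

Move = Hashable
Pattern = FrozenSet[Move]


def maximal_patterns(patterns: Set[Pattern], played: Iterable[Move]) -> List[Pattern]:
    """Line 3: patterns p in P contained in the moves played so far such that
    no q in P with p strictly inside q is also contained in those moves."""
    played_set = frozenset(played)
    # Assumption: non-strict inclusion here, so the empty pattern always matches.
    matching = [p for p in patterns if p <= played_set]
    # Keep only patterns not strictly contained in another matching pattern.
    return [p for p in matching if not any(p < q for q in matching)]


def extend_patterns(patterns: Set[Pattern],
                    select_count: Dict[Tuple[Pattern, Move], int],
                    threshold: int) -> Set[Pattern]:
    """Line 10: accept p extended by m if pattern p selected move m
    more than `threshold` times on line 3.

    `select_count[(p, m)]` is a hypothetical counter kept during play-outs.
    """
    accepted = {p | {m} for (p, m), n in select_count.items()
                if p in patterns and n > threshold}
    return patterns | accepted


def final_move(h: Callable[[Move, Pattern], float], moves: Sequence[Move]) -> Move:
    """Line 12: argmin over m1 of max over m0 of h(m0 | {m1}).

    `h` stands in for h_h, which is defined elsewhere; here it is any callable.
    """
    def best_follow_up(m1: Move) -> float:
        # Highest value among follow-up moves m0 under the one-move pattern {m1}.
        return max((h(m0, frozenset({m1})) for m0 in moves if m0 != m1),
                   default=float("-inf"))
    return min(moves, key=best_follow_up)
```

With `P = {frozenset(), frozenset({"a"}), frozenset({"a", "b"})}` and played moves `["a", "b"]`, only `frozenset({"a", "b"})` is returned by `maximal_patterns`, since both smaller patterns are strictly contained in it. When several maximal patterns match, the text above does not say which one to use, so the sketch returns all of them and leaves that choice to the caller.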



Tapani Raiko 2006-09-01