Investigate how Michi(-C) is stronger with equal number of playouts #2 #95
Continuation of #5
10k playouts/turn vs Michi-C single threaded: 29.1% (55 games)
Long 10k playouts CLOP self-play run:
Removed MCTS leaf expansion delay for both michi-c and matilda.
Perhaps related to using very different policies for playouts and heuristic MC RAVE?
winrate 39.7% (68) (10k playouts/turn, alternate colors, 7.5 komi)
winrate 45.3% (53) (same as above but with komi 5.5)
both using MC only instead of RAVE/criticality etc
winrate 60% (92)
with matilda with same RAVE urgency function
with original matilda RAVE equiv parameter, with heuristic mc rave: 46.5% (71)
with michi equiv parameter, with heuristic mc rave: 23.9% (88)
with original matilda RAVE equiv parameter, without heuristic mc rave: 49.5% (85)
with michi equiv parameter, without heuristic mc rave: 33.5% (182)
self play with black with Michi RAVE quality, single threaded, no expansion delay: 45.3% (75)
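For reference, a minimal sketch of what the "equiv parameter" controls, assuming the blend used is the common one where the AMAF weight decays as real visits accumulate (this is my understanding of the Michi-style urgency, not code copied from either engine). The function names and the neutral value 0.5 for unvisited moves are assumptions made for illustration.

```python
def rave_beta(visits, amaf_visits, rave_equiv):
    """Weight given to the AMAF estimate; shrinks as real visits accumulate."""
    if amaf_visits == 0:
        return 0.0
    return amaf_visits / (amaf_visits + visits + visits * amaf_visits / rave_equiv)


def rave_urgency(wins, visits, amaf_wins, amaf_visits, rave_equiv=3500):
    """Blend of the MC win rate and the AMAF win rate for one move."""
    if visits == 0 and amaf_visits == 0:
        return 0.5  # assumption: neutral value for a move with no information
    beta = rave_beta(visits, amaf_visits, rave_equiv)
    mc = wins / visits if visits else 0.0
    amaf = amaf_wins / amaf_visits if amaf_visits else 0.0
    return beta * amaf + (1.0 - beta) * mc
```

Under this formula a smaller equiv parameter makes the engine trust AMAF less for the same visit counts, so swapping the parameter between engines changes how quickly each one moves from AMAF to plain MC estimates.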
RAVE is working worse in matilda than in michi?
AMAF visits in matilda are stored leaf to root (a state is only influenced by transitions that appeared after it); michi stores visits immediately. Furthermore, matilda always replaced later visits and michi didn't.
with michi style of AMAF info and replacing:
without replacing: all terrible
self play without criticality info being saved: slightly worse
both using MC only instead of RAVE/criticality again: 54% (100)
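A rough sketch of the two AMAF bookkeeping styles compared above, as I read the description; it is not taken from either code base. `build_amaf_map` and `amaf_points_for_node` are hypothetical helpers, moves are `(point, color)` pairs with color `+1` or `-1`, and `replace_later=True` stands for the "always replace later visits" behaviour.

```python
def build_amaf_map(moves, replace_later=False):
    """Map each point to the color credited with it.

    replace_later=False keeps the first play at a point,
    replace_later=True lets a later play at the same point overwrite it.
    """
    amaf = {}
    for point, color in moves:
        if replace_later or point not in amaf:
            amaf[point] = color
    return amaf


def amaf_points_for_node(moves, depth, replace_later=False):
    """Points credited to the player to move at `depth`,
    looking only at the moves played from `depth` onward."""
    to_move = moves[depth][1]
    amaf = build_amaf_map(moves[depth:], replace_later)
    return {p for p, c in amaf.items() if c == to_move}


# A point replayed after a capture is credited differently by the two styles.
game = [("C3", 1), ("D4", -1), ("E5", 1), ("C3", -1)]
first_play = amaf_points_for_node(game, 0)
last_play = amaf_points_for_node(game, 0, replace_later=True)
```

With this toy sequence the first-play style credits C3 to the first player and the replacing style does not, which is the kind of divergence that only shows up when points are replayed after captures.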
9x9
Base single threaded without NN vs Michi, both 10k playouts, no expansion delay: 44% (722)
Baseline without RAVE: 54.7% (725)
Baseline with reduced priors+: 48.5% (526) (line2, line3, empty, line1x, line2x, line3x, corner)
Baseline on server to ensure code correctness: 42.5% (865) [1]
[1] with lineNx priors removed: 45.9% (1598) [2]
[2] without line2, line3, empty priors: 45.5% (1812)
[2] without bad play prior: 45.6% (2114)
[2] without line2, line3, empty: 46.9% (3558) [3]
[3] without near_last: 43.2% (3044)
baseline again: 44.8% (4377)
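To judge whether differences of a few points between these runs mean anything, a quick normal-approximation interval on each win rate helps. This is only a sketch: it assumes independent games with no draws, and `winrate_interval` and `intervals_overlap` are hypothetical helpers, not part of the test scripts.

```python
import math


def winrate_interval(winrate, games, z=1.96):
    """Approximate 95% interval for a win rate measured over `games` games."""
    se = math.sqrt(winrate * (1.0 - winrate) / games)
    return winrate - z * se, winrate + z * se


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


server_baseline = winrate_interval(0.425, 865)   # 42.5% (865)
later_baseline = winrate_interval(0.448, 4377)   # 44.8% (4377)
same_strength_plausible = intervals_overlap(server_baseline, later_baseline)
```

At 865 games the interval is roughly 3 points either side of the measured rate, so most of the single-prior ablations above sit within noise of each other.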
19x19 tests
baseline with conditions as [1]: ~9% (~280)
baseline with both without RAVE: ~3% (~420)
baseline without lineNx priors: ~10.3% (~340) [4]
[4] with nakade in playouts but only after captures: ~9% (~600)
13x13 to be faster
baseline with conditions as [1]: ~19% (~3025) [5]
[5] without lineNx priors:
All previous results cleared after changes to the test environment: improved Michi code, bug fixes, all matilda changes reverted; 9x9, no expansion delay, 10k playouts
baseline: ~43% (1150)
with michi using matilda RNG: ~46% (2100) [6]
[6] repeated with a few simplifications, just to make sure: ~40% (150)
[6] with matilda pattern matching: ~59% (1275) [7]
[7] with matilda pattern matching with inverted color to play in pattern matching: ~62% (700)
back with baseline, michi pattern matching, but with mtld using mogo patterns: ~39% (1600)
[6] with 20k playouts: 46.3% (9900)
[6] with 20k playouts (repeated to make sure they were using 20k): ~47% (3300)
13x13
baseline: 20k playouts, no expansion delay, using mtld rng: ~27.5% (2550) [7]
[7] without line1x, line2x, line3x: 32% (2002) [8]
lineNx priors scrapped from master
[8] without line2, line3, empty: ~30% (2400)
both without RAVE (mtld equiv param 2003, michi 3500): ~60% (~3500)
[8] without NN: ~32% (~600)
This seems to be going nowhere. I advise rewriting the MCTS priors and playouts to use the same base routines, as Pachi/Michi/Fuego/MoGo do. Perhaps the problem is that matilda uses very different strategies in these two places.