Train PCN Models for the Solver

This subsection describes how to train Proof Cost Networks (PCNs).

PCN

Before launching the solver, a PCN model must be trained for it. Training a PCN model requires three processes:

  1. zero server
  2. actor
  3. learner

Before training starts, we need a config file.

  • The config file can be generated with the following command:
./build/killallgo/killallgo_solver -gen {config file name}
  • The following settings must be changed:
# Program
program_auto_seed=true

# Actor
actor_num_simulation=400
actor_mcts_value_rescale=true
actor_dirichlet_noise_alpha=0.2
actor_resign_threshold=-2
actor_use_random_op=false

# Zero
zero_num_games_per_iteration=2000
zero_actor_ignored_command=reset_actors

# Learner
learner_num_process=6

# Network
nn_num_input_channels=18
nn_input_channel_height=7
nn_input_channel_width=7
nn_num_hidden_channels=256
nn_hidden_channel_height=7
nn_hidden_channel_width=7
nn_num_blocks=3
nn_num_action_channels=2
nn_action_size=50
nn_discrete_value_size=200
nn_board_evaluation_scalar=0

# Environment
env_killallgo_ko_rule=situational
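
Editing the generated file by hand is error-prone when many keys change. The following is a minimal sketch of scripting the overrides, assuming the config holds one `key=value` setting per line as in the listing above. `set_cfg` and `apply_pcn_overrides` are hypothetical helper names, not part of the solver, and only a subset of the settings is shown.

```shell
#!/bin/sh
# set_cfg FILE KEY VALUE (hypothetical helper)
# Replace the line "KEY=..." in FILE, or append "KEY=VALUE" if KEY is absent.
# Assumes one "key=value" setting per line; anything after the old value
# on that line (for example a trailing comment) is dropped.
set_cfg() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Rewrite through a temporary file because "sed -i" is not portable.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# apply_pcn_overrides FILE (hypothetical helper)
# Apply a subset of the PCN settings listed above.
apply_pcn_overrides() {
    cfg=$1
    set_cfg "$cfg" program_auto_seed true
    set_cfg "$cfg" actor_num_simulation 400
    set_cfg "$cfg" actor_resign_threshold -2
    set_cfg "$cfg" zero_num_games_per_iteration 2000
    set_cfg "$cfg" learner_num_process 6
    set_cfg "$cfg" env_killallgo_ko_rule situational
}
```

For example, after generating `pcn.cfg` with the `-gen` command, calling `apply_pcn_overrides pcn.cfg` would rewrite those six keys in place and leave every other line untouched.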

Now we can launch the three processes mentioned above.

  • Zero Server:
./scripts/zero-server.sh killallgo {config file} {model storage path} {iteration}
# Example:
./scripts/zero-server.sh killallgo pcn.cfg /7x7Killallgo/Training/pcn 200
  • Actor:
./scripts/zero-worker.sh killallgo {zero server host} {zero server port} sp
# Example:
./scripts/zero-worker.sh killallgo NV27 9999 sp
  • Learner:
./scripts/zero-worker.sh killallgo {zero server host} {zero server port} op
# Example:
./scripts/zero-worker.sh killallgo NV27 9999 op
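
Since the three commands share the same game name, host, and port, it can help to keep those values in one place. The sketch below only prints the command lines built from the example values above; the variable names and the `print_commands` function are illustrative, and it assumes the workers connect to a zero server that is already running.

```shell
#!/bin/sh
# Values taken from the examples above; adjust them for your own setup.
GAME=killallgo
CFG=pcn.cfg
MODEL_DIR=/7x7Killallgo/Training/pcn
ITERATIONS=200
HOST=NV27
PORT=9999

# print_commands (hypothetical helper): echo the three launch commands,
# zero server first, then the actor (sp) and the learner (op).
print_commands() {
    echo "./scripts/zero-server.sh $GAME $CFG $MODEL_DIR $ITERATIONS"
    echo "./scripts/zero-worker.sh $GAME $HOST $PORT sp"
    echo "./scripts/zero-worker.sh $GAME $HOST $PORT op"
}

print_commands
```

Each printed line can then be run in its own terminal or session.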

Gumbel PCN

To train a Gumbel PCN, change the following settings in the config file.

  • Turn on Gumbel:
actor_use_dirichlet_noise=false
actor_use_gumbel=true
actor_use_gumbel_noise=true
  • Gumbel parameters:
actor_num_simulation=32
actor_gumbel_sample_size=16
  • actor_num_simulation: MCTS simulation count
  • actor_gumbel_sample_size: Sequential halving threshold
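
Because the three switches above have to be flipped together, a quick check before training can catch a half-edited config. This is a sketch only: `get_cfg` and `check_gumbel_cfg` are hypothetical helpers, and the parsing assumes `key=value` lines, optionally followed by a `#` comment.

```shell
#!/bin/sh
# get_cfg FILE KEY (hypothetical helper)
# Print the value of KEY, without any trailing "# comment".
# If KEY appears more than once, the last occurrence is used.
get_cfg() {
    sed -n "s/^$2=\([^# ]*\).*/\1/p" "$1" | tail -n 1
}

# expect_cfg FILE KEY WANT: succeed only if KEY is set to WANT in FILE.
expect_cfg() {
    got=$(get_cfg "$1" "$2")
    if [ "$got" != "$3" ]; then
        echo "$2 should be $3 (found: '$got')" >&2
        return 1
    fi
}

# check_gumbel_cfg FILE (hypothetical helper)
# Return 0 only if all three Gumbel switches listed above are in place.
check_gumbel_cfg() {
    expect_cfg "$1" actor_use_dirichlet_noise false &&
    expect_cfg "$1" actor_use_gumbel true &&
    expect_cfg "$1" actor_use_gumbel_noise true
}
```

Running `check_gumbel_cfg pcn.cfg` before starting the zero server would report the first switch that does not match.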

Random Opening

To train a PCN with random openings, change the following settings in the config file.

  • Turn on random opening:
actor_use_random_op=true
  • Random opening parameters:
actor_random_op_max_length=25
actor_random_op_use_softmax=true
actor_random_op_softmax_temperature=2
actor_random_op_softmax_sum_limit=0.7
  • actor_random_op_max_length: Opening max length
  • actor_random_op_softmax_temperature: Random move softmax temperature
  • actor_random_op_softmax_sum_limit: Threshold on the probability sum of the random-move softmax distribution