📝 The tricks in Learn to optimize in TNCO sycamore #135

Open
Yonv1943 opened this issue May 30, 2023 · 0 comments
Labels
discussion code understanding

Comments

@Yonv1943
Collaborator

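Below, theta denotes a "solution" of the TNCO task: a tensor that encodes the contraction order of a quantum circuit. Calling theta.argsort() yields an ordered list of edge_id values, meaning the edges are contracted one at a time in that order.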
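To make the encoding concrete, here is a minimal sketch of how a theta vector maps to a contraction order. The values are made up, and NumPy stands in for the PyTorch tensor used in the repository (an assumption made only to keep the example light); `argsort` sorts ascending by default in both libraries.

```python
import numpy as np

# Hypothetical theta for a network with 5 edges (one entry per edge).
# In the issue theta is a PyTorch tensor; NumPy is a stand-in here.
theta = np.array([0.7, -1.2, 0.1, 2.3, -0.4])

# argsort returns the edge ids ordered by ascending theta value,
# which is read as "contract this edge first, then the next one, ...".
edge_order = theta.argsort()
print(edge_order.tolist())  # [1, 4, 2, 0, 3]
```

Only the ranking of the entries matters: any theta with the same ordering of values decodes to the same contraction sequence.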

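  1. Two ReplayBuffers are maintained: one stores the theta values produced by the model as it iterates, and the other stores the theta values with good scores. This keeps the good-scoring theta values from being evicted by the ReplayBuffer's FIFO rule.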

Computing keep_score:
https://github.com/AI4Finance-Foundation/ElegantRL_Solver/blob/41a58d0ecb9daeddfa635d19a3741b0a29162342/rlsolver/rlsolver_learn2opt/tensor_train/TNCO_H2O.py#L418-L419

Using keep_score to select the theta values to save into buffer0:
https://github.com/AI4Finance-Foundation/ElegantRL_Solver/blob/41a58d0ecb9daeddfa635d19a3741b0a29162342/rlsolver/rlsolver_learn2opt/tensor_train/TNCO_H2O.py#L372-L375
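The two-buffer idea can be sketched as follows. This is not the code from TNCO_H2O.py: the class name `TwoBuffers`, its parameters, and the rule "lower score is better, keep a theta when its score is below keep_score" are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


class TwoBuffers:
    """Hypothetical container; not the class used in TNCO_H2O.py."""

    def __init__(self, recent_size, good_size):
        # FIFO buffer: once full, the oldest entry is dropped on append.
        self.recent = deque(maxlen=recent_size)
        # Separate store for good solutions, kept sorted best-first.
        self.good = []
        self.good_size = good_size

    def add(self, theta, score, keep_score):
        self.recent.append((score, theta))
        # Assumption: lower score is better and keep_score is a threshold.
        if score < keep_score:
            self.good.append((score, theta))
            self.good.sort(key=lambda pair: pair[0])
            del self.good[self.good_size:]  # trim to capacity


rng = np.random.default_rng(0)
buffers = TwoBuffers(recent_size=3, good_size=2)
for score in [5.0, 1.0, 9.0, 8.0, 7.0, 6.0]:
    buffers.add(rng.normal(size=4), score, keep_score=4.0)

recent_scores = [s for s, _ in buffers.recent]
good_scores = [s for s, _ in buffers.good]
print(recent_scores)  # [8.0, 7.0, 6.0]
print(good_scores)    # [1.0]
```

The theta with score 1.0 has already been pushed out of the FIFO buffer, yet it survives in the second buffer, which is the effect described in item 1.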

  2. Start the search near better solutions

historical_theta is a theta with a good score; we add noise to it and start iterating from its neighborhood.
https://github.com/AI4Finance-Foundation/ElegantRL_Solver/blob/41a58d0ecb9daeddfa635d19a3741b0a29162342/rlsolver/rlsolver_learn2opt/tensor_train/TNCO_H2O.py#L402-L408

We also start iterating from theta values picked at random from buffer0, which stores the better theta values.
https://github.com/AI4Finance-Foundation/ElegantRL_Solver/blob/41a58d0ecb9daeddfa635d19a3741b0a29162342/rlsolver/rlsolver_learn2opt/tensor_train/TNCO_H2O.py#L410-L419
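Both ways of choosing a starting point can be sketched as below. The function names, the Gaussian noise, and the `noise_std` parameter are assumptions for illustration; the linked lines in TNCO_H2O.py may build the noise and the sampling differently, and NumPy again stands in for PyTorch.

```python
import numpy as np

rng = np.random.default_rng(0)


def start_near(historical_theta, noise_std, num_starts):
    """Return num_starts noisy copies of one good theta (hypothetical helper)."""
    noise = rng.normal(scale=noise_std, size=(num_starts, historical_theta.size))
    return historical_theta[None, :] + noise


def start_from_buffer(good_thetas, num_starts):
    """Pick num_starts rows at random, with replacement (hypothetical helper)."""
    ids = rng.integers(0, len(good_thetas), size=num_starts)
    return good_thetas[ids]


historical_theta = np.array([0.7, -1.2, 0.1, 2.3, -0.4])
# Stand-in for buffer0: two previously found good solutions.
good_thetas = np.stack([historical_theta, historical_theta[::-1]])

starts = np.concatenate([
    start_near(historical_theta, noise_std=0.1, num_starts=3),
    start_from_buffer(good_thetas, num_starts=2),
])
print(starts.shape)  # (5, 5)
```

Mixing the two sources gives starting points that exploit one known good solution while still drawing on the variety stored in the buffer.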

  3. After training the iterator, use it for inference

As can be seen

@YangletLiu YangletLiu added the discussion code understanding label May 31, 2023