ast_toolbox.mcts.MCTSdpw module
-
class ast_toolbox.mcts.MCTSdpw.DPWModel(model, getAction, getNextAction)
Bases: object
The model used in the tree search.
Parameters:
- model (ast_toolbox.mcts.MDP.TransitionModel) – The transition model.
- getAction (function) – getAction(s, tree) returns the action used in rollout.
- getNextAction (function) – getNextAction(s, tree) returns the action used in exploration.
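As a rough illustration of the two callables documented above, the sketch below defines stand-ins with the documented (s, tree) signature. `ToyModel`, `ACTIONS`, and the random policies are hypothetical; `ToyModel` only mirrors the documented constructor arguments and is not the library class.

```python
import random

ACTIONS = [0, 1, 2]  # hypothetical discrete action set


def get_action(s, tree):
    """Rollout policy: a uniformly random action (an assumption)."""
    return random.choice(ACTIONS)


def get_next_action(s, tree):
    """Exploration policy: propose an action to add to the tree (an assumption)."""
    return random.choice(ACTIONS)


class ToyModel:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(self, model, getAction, getNextAction):
        self.model = model
        self.getAction = getAction
        self.getNextAction = getNextAction


toy = ToyModel(model=None, getAction=get_action, getNextAction=get_next_action)
a = toy.getAction(None, None)
```

Keeping the rollout policy and the exploration policy as separate callables lets them differ, for example a cheap random rollout paired with a more careful proposal of new actions.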
-
class ast_toolbox.mcts.MCTSdpw.DPWParams(d, gamma, ec, n, k, alpha, clear_nodes)
Bases: object
Structure that stores the parameters for MCTS with DPW.
Parameters:
- d (int) – The maximum search depth.
- gamma (float) – The discount factor.
- ec (float) – The weight for the exploration bonus.
- n (int) – The maximum number of iterations.
- k (float) – The coefficient k in the DPW constraint |N(s,a)|<=kN(s)^alpha.
- alpha (float) – The exponent alpha in the DPW constraint |N(s,a)|<=kN(s)^alpha.
- clear_nodes (bool) – Whether to clear redundant nodes in the tree. Set it to True to save memory. Set it to False for better tree plotting.
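To make the roles of k and alpha concrete, the sketch below shows one common way to apply a progressive widening bound: a new action is tried only while the current number of actions at a state is below k·N(s)^alpha. `Params` and `may_add_action` are hypothetical names, the parameter values are arbitrary, and the strict comparison is an assumption rather than something this page specifies.

```python
from collections import namedtuple

# Hypothetical container mirroring the documented fields; not the library class.
Params = namedtuple("Params", "d gamma ec n k alpha clear_nodes")


def may_add_action(num_actions, num_visits, k, alpha):
    """Return True if the widening bound still leaves room for another action."""
    return num_actions < k * num_visits ** alpha


p = Params(d=10, gamma=1.0, ec=100.0, n=100, k=0.5, alpha=0.5, clear_nodes=True)

first = may_add_action(0, 1, p.k, p.alpha)    # bound is 0.5 * 1**0.5 = 0.5
second = may_add_action(1, 4, p.k, p.alpha)   # bound is 0.5 * 4**0.5 = 1.0
third = may_add_action(1, 16, p.k, p.alpha)   # bound is 0.5 * 16**0.5 = 2.0
```

With alpha below 1 the bound grows more slowly than the visit count, so a state must be visited increasingly often before each additional action is considered.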
-
class ast_toolbox.mcts.MCTSdpw.DPWTree(p, f)
Bases: object
The structure storing the search tree.
-
class ast_toolbox.mcts.MCTSdpw.StateActionNode
Bases: object
The structure representing the state-action node.
-
class ast_toolbox.mcts.MCTSdpw.StateActionStateNode
Bases: object
The structure storing the state-action-state transition.
-
class ast_toolbox.mcts.MCTSdpw.StateNode
Bases: object
The structure representing the state node.
-
ast_toolbox.mcts.MCTSdpw.rollout(tree, s, depth)
Rollout from the current state s.
Parameters:
- tree (ast_toolbox.mcts.MCTSdpw.DPWTree) – The search tree.
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
- depth (int) – The maximum search depth.
Returns: q (float) – The estimated return.
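The idea of a depth-limited rollout that returns an estimated return can be sketched as follows. This is a generic discounted rollout under stated assumptions, not the library's implementation: `rollout_sketch`, the `step(s, a) -> (next_state, reward, done)` interface, and the `policy` callable are all hypothetical.

```python
def rollout_sketch(step, policy, s, depth, gamma):
    """Recursively estimate the discounted return from state s.

    step(s, a) -> (next_state, reward, done) is a hypothetical transition function.
    policy(s) -> action is a hypothetical rollout policy.
    """
    if depth == 0:
        return 0.0  # depth budget exhausted
    a = policy(s)
    s_next, r, done = step(s, a)
    if done:
        return r  # terminal transition: no future reward
    return r + gamma * rollout_sketch(step, policy, s_next, depth - 1, gamma)


# Toy chain: every step yields reward 1.0 and never terminates early.
q = rollout_sketch(
    step=lambda s, a: (s + 1, 1.0, False),
    policy=lambda s: 0,
    s=0,
    depth=3,
    gamma=0.5,
)
# q = 1 + 0.5 * (1 + 0.5 * 1) = 1.75
```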
-
ast_toolbox.mcts.MCTSdpw.saveBackwardState(old_s_tree, new_s_tree, s_current)
Saves s_current as well as all its predecessors in old_s_tree into new_s_tree.
Parameters:
- old_s_tree (dict) – The old tree.
- new_s_tree (dict) – The new tree.
- s_current (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
Returns: new_s_tree (dict) – The new tree.
-
ast_toolbox.mcts.MCTSdpw.saveForwardState(old_s_tree, new_s_tree, s)
Saves s as well as all its successors in old_s_tree into new_s_tree.
Parameters:
- old_s_tree (dict) – The old tree.
- new_s_tree (dict) – The new tree.
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
Returns: new_s_tree (dict) – The new tree.
-
ast_toolbox.mcts.MCTSdpw.saveState(old_s_tree, s)
Saves s as well as all its predecessors and successors in old_s_tree into a new tree.
Parameters:
- old_s_tree (dict) – The old tree.
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
Returns: new_s_tree (dict) – The new tree.
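The three save functions above describe pruning a dict-based tree down to the nodes connected to one state. The sketch below illustrates that idea on a toy structure. It assumes each state carries a `parent` link and each tree entry is simply a list of child states; both are assumptions for illustration, and `ToyState`, `save_backward`, `save_forward`, and `save_state` are hypothetical names, not the library functions.

```python
class ToyState:
    """Hypothetical state with a link to its predecessor."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def save_backward(old_tree, new_tree, s):
    """Copy s and all of its predecessors from old_tree into new_tree."""
    while s is not None and s in old_tree:
        new_tree[s] = old_tree[s]
        s = s.parent
    return new_tree


def save_forward(old_tree, new_tree, s):
    """Copy s and all of its successors from old_tree into new_tree."""
    if s not in old_tree:
        return new_tree
    new_tree[s] = old_tree[s]
    for child in old_tree[s]:  # in this toy tree a node is a list of child states
        save_forward(old_tree, new_tree, child)
    return new_tree


def save_state(old_tree, s):
    """Build a new tree holding s, its predecessors, and its successors."""
    new_tree = {}
    save_backward(old_tree, new_tree, s)
    save_forward(old_tree, new_tree, s)
    return new_tree


root = ToyState("root")
left, right = ToyState("left", root), ToyState("right", root)
leaf = ToyState("leaf", left)
old = {root: [left, right], left: [leaf], right: [], leaf: []}

kept = save_state(old, left)  # keeps root, left, leaf; drops right
```

Pruning of this kind relates to the clear_nodes option: once an action has been committed, branches that can no longer be reached need not stay in memory.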
-
ast_toolbox.mcts.MCTSdpw.selectAction(tree, s, verbose=False)
Run MCTS to select one action for the state s.
Parameters:
- tree (ast_toolbox.mcts.MCTSdpw.DPWTree) – The search tree.
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
- verbose (bool, optional) – Whether to log the search information.
Returns: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The selected AST action.
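At a high level, selecting an action with MCTS means running a fixed budget of simulations from s and then choosing among the actions tried at s. The sketch below shows that outer loop only. This page does not state how the final action is chosen, so picking the highest estimated value is an assumption; `select_action_sketch`, `simulate_once`, and `q_values` are hypothetical names.

```python
def select_action_sketch(simulate_once, q_values, n_iterations):
    """Run n_iterations simulations, then return the action with the highest Q.

    simulate_once() is a hypothetical callable that updates q_values in place.
    q_values maps each action tried at the root to its current value estimate.
    """
    for _ in range(n_iterations):
        simulate_once()
    return max(q_values, key=q_values.get)


# Toy setup: each "simulation" nudges the estimate of action "b" upward.
q = {"a": 0.0, "b": 0.0}


def fake_simulation():
    q["b"] += 0.1


best = select_action_sketch(fake_simulation, q, n_iterations=5)
```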
-
ast_toolbox.mcts.MCTSdpw.simulate(tree, s, depth, verbose=False)
Single run of the forward MCTS search.
Parameters:
- tree (ast_toolbox.mcts.MCTSdpw.DPWTree) – The search tree.
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
- depth (int) – The maximum search depth.
- verbose (bool, optional) – Whether to log the search information.
Returns: q (float) – The estimated return.
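During a simulation, tree search has to choose among the actions already attached to a state, and the ec parameter weights an exploration bonus in that choice. This page does not give the formula. The sketch below assumes the widely used UCB1 form, with `ucb_score`, `pick_action`, and the `stats` layout as hypothetical names for illustration.

```python
import math


def ucb_score(q, n_sa, n_s, ec):
    """Value estimate plus an exploration bonus weighted by ec (UCB1 form)."""
    if n_sa == 0:
        return math.inf  # untried actions are explored first
    return q + ec * math.sqrt(math.log(n_s) / n_sa)


def pick_action(stats, n_s, ec):
    """stats maps action -> (q, n_sa); return the action with the best score."""
    return max(stats, key=lambda a: ucb_score(stats[a][0], stats[a][1], n_s, ec))


stats = {"a": (1.0, 10), "b": (0.8, 1)}
greedy = pick_action(stats, n_s=11, ec=0.0)   # no bonus: higher Q wins
explore = pick_action(stats, n_s=11, ec=1.0)  # bonus favors the rarely tried action
```

A larger ec shifts the search toward rarely visited actions, while ec = 0 reduces the choice to the current value estimates.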