ast_toolbox.mcts.MCTSdpw module

class ast_toolbox.mcts.MCTSdpw.DPWModel(model, getAction, getNextAction)[source]

Bases: object

The model used in the tree search.

Parameters:
  • model (ast_toolbox.mcts.MDP.TransitionModel) – The transition model.
  • getAction (function) – getAction(s, tree) returns the action used in rollout.
  • getNextAction (function) – getNextAction(s, tree) returns the action used in exploration.

class ast_toolbox.mcts.MCTSdpw.DPWParams(d, gamma, ec, n, k, alpha, clear_nodes)[source]

Bases: object

Structure that stores the parameters for the MCTS with DPW.

Parameters:
  • d (int) – The maximum searching depth.
  • gamma (float) – The discount factor.
  • ec (float) – The weight for the exploration bonus.
  • n (int) – The maximum number of iterations.
  • k (float) – The coefficient k in the DPW constraint |N(s,a)| <= k N(s)^alpha.
  • alpha (float) – The exponent alpha in the DPW constraint |N(s,a)| <= k N(s)^alpha.
  • clear_nodes (bool) – Whether to clear redundant nodes in the tree. Set it to True to save memory. Set it to False for better tree plotting.
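The constraint |N(s,a)| <= k N(s)^alpha limits how many distinct actions a state node may hold as its visit count grows. The following is a minimal sketch of that check, not the toolbox's own code: the function name can_add_action and the strict comparison are assumptions made for illustration.

```python
def can_add_action(n_actions: int, n_visits: int, k: float, alpha: float) -> bool:
    """Return True if one more action fits under the bound k * N(s)**alpha.

    n_actions is the number of actions already attached to the state node,
    n_visits is the visit count N(s). Both names are hypothetical.
    """
    return n_actions < k * n_visits ** alpha


def actions_after(visits: int, k: float, alpha: float) -> int:
    """Count how many actions a node would hold after the given number of visits."""
    n_actions = 0
    for n_visits in range(1, visits + 1):
        if can_add_action(n_actions, n_visits, k, alpha):
            n_actions += 1
    return n_actions


if __name__ == "__main__":
    # With k = 1 and alpha = 0.5 the action set grows roughly like sqrt(N(s)).
    for visits in (1, 4, 16):
        print(visits, actions_after(visits, k=1.0, alpha=0.5))
```

A larger k or alpha widens the tree faster. A smaller value concentrates the iteration budget n on fewer actions.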

class ast_toolbox.mcts.MCTSdpw.DPWTree(p, f)[source]

Bases: object

The structure storing the search tree.

class ast_toolbox.mcts.MCTSdpw.StateActionNode[source]

Bases: object

The structure representing the state-action node.

class ast_toolbox.mcts.MCTSdpw.StateActionStateNode[source]

Bases: object

The structure storing the transition state-action-state.

class ast_toolbox.mcts.MCTSdpw.StateNode[source]

Bases: object

The structure representing the state node.

ast_toolbox.mcts.MCTSdpw.rollout(tree, s, depth)[source]

Rollout from the current state s.

Parameters:
Returns:

q (float) – The estimated return.
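As an illustration of what a rollout estimates, the sketch below accumulates a discounted return over at most depth steps. It is a generic rollout under stated assumptions, not the toolbox's implementation: the real function takes (tree, s, depth) and obtains the model and rollout action from the tree, whereas here step, policy and gamma are passed explicitly and are hypothetical names.

```python
def rollout_return(step, policy, s, depth, gamma):
    """Estimate the discounted return from state s.

    step(s, a) -> (next_state, reward, done) and policy(s) -> action are
    hypothetical callables standing in for the transition model and the
    rollout action selector.
    """
    if depth == 0:
        return 0.0
    a = policy(s)
    s_next, r, done = step(s, a)
    if done:
        return r
    return r + gamma * rollout_return(step, policy, s_next, depth - 1, gamma)


def toy_step(s, a):
    """Toy chain: every step adds the action to the state and pays reward 1."""
    s_next = s + a
    return s_next, 1.0, s_next >= 3


def toy_policy(s):
    """Always move one step forward."""
    return 1


if __name__ == "__main__":
    print(rollout_return(toy_step, toy_policy, s=0, depth=10, gamma=0.5))
```

The depth argument bounds the recursion, so the estimate is truncated when the episode does not terminate within the remaining search depth.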

ast_toolbox.mcts.MCTSdpw.saveBackwardState(old_s_tree, new_s_tree, s_current)[source]

Save s_current and all its predecessors from old_s_tree into new_s_tree.

Parameters:
Returns:

new_s_tree (dict) – The new tree.

ast_toolbox.mcts.MCTSdpw.saveForwardState(old_s_tree, new_s_tree, s)[source]

Save s and all its successors from old_s_tree into new_s_tree.

Parameters:
Returns:

new_s_tree (dict) – The new tree.

ast_toolbox.mcts.MCTSdpw.saveState(old_s_tree, s)[source]

Save s and all its predecessors and successors from old_s_tree into a new tree, new_s_tree.

Parameters:
Returns:

new_s_tree (dict) – The new tree.
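The three save functions keep only the part of the tree that is connected to one state, which is what makes node clearing possible. The sketch below shows the idea on a simplified tree. The layout is an assumption made for illustration: a dict that maps each state to a record with a "parent" key and a "children" list. The toolbox's actual node structures are not described here and may differ.

```python
def keep_ancestors(old_tree, new_tree, s):
    """Copy s and every predecessor of s from old_tree into new_tree."""
    while s is not None and s in old_tree:
        new_tree[s] = old_tree[s]
        s = old_tree[s]["parent"]
    return new_tree


def keep_descendants(old_tree, new_tree, s):
    """Copy s and every successor of s from old_tree into new_tree."""
    stack, seen = [s], set()
    while stack:
        cur = stack.pop()
        if cur in seen or cur not in old_tree:
            continue
        seen.add(cur)
        new_tree[cur] = old_tree[cur]
        stack.extend(old_tree[cur]["children"])
    return new_tree


def keep_lineage(old_tree, s):
    """Build a new tree holding s, its predecessors and its successors."""
    new_tree = {}
    keep_ancestors(old_tree, new_tree, s)
    keep_descendants(old_tree, new_tree, s)
    return new_tree


if __name__ == "__main__":
    tree = {
        "root": {"parent": None, "children": ["a", "b"]},
        "a": {"parent": "root", "children": ["c"]},
        "b": {"parent": "root", "children": []},
        "c": {"parent": "a", "children": []},
    }
    # The sibling branch "b" is dropped; "root", "a" and "c" are kept.
    print(sorted(keep_lineage(tree, "a")))
```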

ast_toolbox.mcts.MCTSdpw.selectAction(tree, s, verbose=False)[source]

Run MCTS to select one action for the state s.

Parameters:
Returns:

action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The selected AST action.
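A common way to structure this kind of selection is to run a fixed number of simulations from the root and then return the root action with the highest estimated value. The sketch below illustrates that pattern only. The names select_action, simulate_once and root_stats are hypothetical, and the final greedy choice is an assumption rather than a statement about the toolbox's code.

```python
from itertools import cycle


def select_action(simulate_once, root_stats, n_iterations):
    """Run n_iterations simulations, then pick the action with the highest q."""
    for _ in range(n_iterations):
        simulate_once()
    return max(root_stats, key=lambda a: root_stats[a]["q"])


# Toy setup: two root actions with fixed returns, visited in turn.
rewards = {"left": 1.0, "right": 2.0}
root_stats = {a: {"n": 0, "q": 0.0} for a in rewards}
order = cycle(rewards)


def simulate_once():
    """Stand-in for one tree search pass: update the running mean of one action."""
    a = next(order)
    stats = root_stats[a]
    stats["n"] += 1
    stats["q"] += (rewards[a] - stats["q"]) / stats["n"]


best = select_action(simulate_once, root_stats, n_iterations=4)

if __name__ == "__main__":
    print(best)
```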

ast_toolbox.mcts.MCTSdpw.simulate(tree, s, depth, verbose=False)[source]

Single run of the forward MCTS search.

Parameters:
Returns:

q (float) – The estimated return.
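Two building blocks usually appear in a single forward pass of MCTS: scoring actions with an exploration bonus and backing up the sampled return as a running mean. The sketch below uses the UCB1 bonus, which is an assumption because the exact formula used by the toolbox is not given here. The names ucb_score and backup are hypothetical, and ec plays the role of the exploration weight from DPWParams.

```python
import math


def ucb_score(q, n_action, n_state, ec):
    """Value estimate plus a UCB1-style exploration bonus weighted by ec."""
    if n_action == 0:
        return math.inf  # untried actions are always explored first
    return q + ec * math.sqrt(math.log(n_state) / n_action)


def backup(stats, ret):
    """Fold one sampled return into the running mean stored in stats."""
    stats["n"] += 1
    stats["q"] += (ret - stats["q"]) / stats["n"]
    return stats


if __name__ == "__main__":
    stats = {"n": 0, "q": 0.0}
    backup(stats, 4.0)
    backup(stats, 2.0)
    print(stats, ucb_score(stats["q"], stats["n"], n_state=3, ec=2.0))
```

Setting ec to 0 makes the choice purely greedy, while a larger ec favors rarely tried actions.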