ast_toolbox.mcts.AdaptiveStressTestingBlindValue module

class ast_toolbox.mcts.AdaptiveStressTestingBlindValue.AdaptiveStressTestBV(**kwargs)[source]

Bases: ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest

The AST wrapper for MCTS that uses Blind Value exploration [1].

Parameters:kwargs – Keyword arguments passed to ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest

References

[1] Couetoux, Adrien, Hassen Doghmen, and Olivier Teytaud. “Improving the exploration in upper confidence trees.” International Conference on Learning and Intelligent Optimization. Springer, Berlin, Heidelberg, 2012.
explore_action(s, tree)[source]

Sample an action for exploration using Blind Value.

Parameters:
Returns:

action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The sampled action.

getBV(y, rho, A, UCB)[source]

Calculate the Blind Value for the candidate action y.

Parameters:
  • y (numpy.ndarray) – The candidate action.
  • rho (float) – The standard deviation ratio.
  • A (list[ast_toolbox.mcts.AdaptiveStressTesting.ASTAction]) – The list of the explored AST actions.
  • UCB (dict) – The dictionary containing the upper confidence bound for each explored action in the state node.
Returns:

BV (float) – The blind value.
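
The calculation can be sketched as follows, assuming the definition from [1]: the blind value of a candidate y is the minimum, over the explored actions x, of UCB(x) + rho * d(x, y). This is an illustrative sketch, not the toolbox's code. The function name blind_value and the use of plain dictionaries keyed by strings (instead of ASTAction objects) are assumptions made for the example.

```python
import numpy as np


def blind_value(y, rho, explored, ucb):
    """Return min over explored actions x of ucb[x] + rho * ||x - y||_2.

    explored: hypothetical dict mapping an action key to its action vector.
    ucb:      dict mapping the same keys to upper confidence bounds.
    """
    y = np.asarray(y, dtype=float)
    return min(
        ucb[key] + rho * float(np.linalg.norm(np.asarray(x, dtype=float) - y))
        for key, x in explored.items()
    )


# Two explored actions with made-up UCB values.
explored = {"a0": [0.0, 0.0], "a1": [3.0, 4.0]}
ucb = {"a0": 1.0, "a1": 0.5}

# Candidate at distance 3 from a0 and distance 4 from a1.
bv = blind_value([3.0, 0.0], rho=0.25, explored=explored, ucb=ucb)
print(bv)
```

A candidate far from every explored action, or close only to actions with a high bound, receives a large blind value, which is why [1] selects the candidate that maximizes it.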

getDistance(a, b)[source]

Get the (L2) distance between two actions.

Parameters:
  • a (numpy.ndarray) – The first action.
  • b (numpy.ndarray) – The second action.
Returns:

distance (float) – The L2 distance between a and b.
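
For reference, the L2 (Euclidean) distance between two action vectors can be computed with NumPy as in the sketch below. The helper name l2_distance is hypothetical and stands in for the method documented above.

```python
import numpy as np


def l2_distance(a, b):
    """Euclidean distance between two equally shaped action vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


# Differences are (3, 4), so the distance is sqrt(9 + 16).
d = l2_distance([1.0, 2.0], [4.0, 6.0])
print(d)
```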

getUCB(s)[source]

Get the upper confidence bound on the expected return for every action that has been explored at the state.

Parameters:s (ast_toolbox.MCTSdpw.StateNode) – The state node in the search tree.
Returns:UCB (dict) – The dictionary containing the upper confidence bound for each explored action in the state node.
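
As a rough illustration of the returned dictionary, the sketch below assumes a standard UCB1-style bound, q + ec * sqrt(log(N) / n), where q is the mean return of an action, n its visit count, N the total visit count of the state, and ec an exploration constant. The exact formula used by the toolbox is not stated on this page, and the names upper_confidence_bounds, stats, and ec are hypothetical.

```python
import math


def upper_confidence_bounds(stats, total_visits, ec):
    """Map each explored action to q + ec * sqrt(log(N) / n).

    stats: hypothetical dict mapping an action key to (mean return q, visits n).
    """
    return {
        key: q + ec * math.sqrt(math.log(total_visits) / n)
        for key, (q, n) in stats.items()
    }


# a0 has the higher mean return but has been visited more often than a1.
stats = {"a0": (1.0, 4), "a1": (0.5, 1)}
ucb = upper_confidence_bounds(stats, total_visits=5, ec=1.0)
print(ucb)
```

In this example the rarely visited action a1 ends up with the larger bound despite its lower mean return, which is the exploration bonus at work.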