ast_toolbox.mcts.AdaptiveStressTestingBlindValue module

class ast_toolbox.mcts.AdaptiveStressTestingBlindValue.AdaptiveStressTestBV(**kwargs)

The AST wrapper for MCTS that uses Blind Value exploration [1].

Parameters:	kwargs – Keyword arguments passed to ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest

References

[1]	Couetoux, Adrien, Hassen Doghmen, and Olivier Teytaud. “Improving the exploration in upper confidence trees.” International Conference on Learning and Intelligent Optimization. Springer, Berlin, Heidelberg, 2012.

explore_action(s, tree)

Sample an action for exploration using Blind Value.

Parameters:	s (`ast_toolbox.mcts.AdaptiveStressTesting.ASTState`) – The current state.
	tree (dict) – The search tree.
Returns:	action (`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`) – The sampled action.

getBV(y, rho, A, UCB)

Calculate the Blind Value for the candidate action y.

Parameters:	y (`numpy.ndarray`) – The candidate action.
	rho (float) – The standard deviation ratio.
	A (list[`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`]) – The list of explored AST actions.
	UCB (dict) – The dictionary containing the upper confidence bound for each explored action in the state node.
Returns:	BV (float) – The blind value.
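
As a rough illustration of the quantity computed here, reference [1] defines the blind value of a candidate as the minimum, over the explored actions, of each action's upper confidence bound plus rho times its distance to the candidate. The sketch below follows that definition under simplifying assumptions: explored actions are plain tuples of floats rather than `ASTAction` objects, and `blind_value` is a hypothetical standalone function, not the method of this class, whose implementation may differ.

```python
import numpy as np


def blind_value(y, rho, explored_actions, ucb):
    """Hypothetical sketch of the blind value from reference [1].

    y: candidate action (array-like).
    rho: standard deviation ratio.
    explored_actions: list of tuples (assumed stand-in for ASTAction).
    ucb: dict mapping each explored action tuple to its upper confidence bound.
    """
    y = np.asarray(y, dtype=float)
    # For every explored action, add its UCB to the scaled L2 distance
    # from the candidate, then keep the smallest of these sums.
    return min(
        ucb[a] + rho * float(np.linalg.norm(np.asarray(a, dtype=float) - y))
        for a in explored_actions
    )


explored = [(0.0, 0.0), (3.0, 4.0)]
ucb = {(0.0, 0.0): 1.0, (3.0, 4.0): 0.5}
bv = blind_value((0.0, 0.0), 0.2, explored, ucb)
```

A candidate far from every explored action receives a large blind value, which is what drives exploration toward unvisited regions of the action space.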

getDistance(a, b)

Get the (L2) distance between two actions.

Parameters:	a (`numpy.ndarray`) – The first action.
	b (`numpy.ndarray`) – The second action.
Returns:	distance (float) – The L2 distance between a and b.
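
For reference, the L2 (Euclidean) distance between two action vectors can be computed with NumPy as in the minimal sketch below. The function name `l2_distance` is hypothetical and is not part of this module.

```python
import numpy as np


def l2_distance(a, b):
    """Return the Euclidean distance between two array-like actions."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # The norm of the difference vector is the L2 distance.
    return float(np.linalg.norm(a - b))


d = l2_distance([0.0, 0.0], [3.0, 4.0])
```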

getUCB(s)

Get the upper confidence bound on the expected return for every action that has been explored at the state.

Parameters:	s (`ast_toolbox.MCTSdpw.StateNode`) – The state node in the search tree.
Returns:	UCB (dict) – The dictionary containing the upper confidence bound for each explored action in the state node.
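
To make the idea concrete, the sketch below computes a standard UCB1-style bound: the mean return estimate plus an exploration bonus that shrinks as an action is visited more often. It assumes a plain dictionary of per-action `(q, n)` statistics and a hypothetical exploration constant `c`; the actual `StateNode` fields and the exact formula used by this method are not documented here and may differ.

```python
import math


def ucb_scores(stats, total_visits, c=1.0):
    """Hypothetical UCB1-style bound for each explored action.

    stats: dict mapping action -> (q, n), where q is the mean return
           estimate and n >= 1 is the action's visit count (assumed layout).
    total_visits: visit count of the state node (>= 1).
    c: exploration constant (assumed parameter).
    """
    return {
        action: q + c * math.sqrt(math.log(total_visits) / n)
        for action, (q, n) in stats.items()
    }


scores = ucb_scores({"a": (0.0, 1), "b": (0.0, 4)}, total_visits=5)
```

With equal mean returns, the less-visited action `"a"` gets the larger bound, so it is favoured for further exploration.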