ast_toolbox.mcts.AdaptiveStressTesting module

class ast_toolbox.mcts.AdaptiveStressTesting.ASTAction(action)

Bases: object

Get the true action.

Returns:	action – The true action used in the env.
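One plausible reason to wrap a raw action in an object is so that it can be used as a dictionary key in a search tree. The following is a minimal sketch of that idea, not the library's implementation: `HashableAction` and its `get` method are hypothetical names, and the sketch assumes the raw action is a flat sequence of numbers.

```python
class HashableAction:
    """Hypothetical stand-in for an action wrapper (not the library class)."""

    def __init__(self, action):
        # The raw ("true") action that would be passed to the env.
        self.action = action

    def __hash__(self):
        # Lists are unhashable, so hash an immutable tuple of the values.
        return hash(tuple(self.action))

    def __eq__(self, other):
        return (isinstance(other, HashableAction)
                and tuple(self.action) == tuple(other.action))

    def get(self):
        # Return the raw action unchanged.
        return self.action


# Count visits per action, as a tree node might do.
visits = {}
for raw in ([0.1, -0.3], [0.1, -0.3], [0.5, 0.0]):
    a = HashableAction(raw)
    visits[a] = visits.get(a, 0) + 1
```

Two wrappers built from equal raw actions land on the same dictionary entry, so `visits` ends up with two keys rather than three.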

class ast_toolbox.mcts.AdaptiveStressTesting.ASTParams(max_steps, log_interval, log_tabular, log_dir=None, n_itr=100)

Bases: object

Structure that stores internal parameters for AST.

Parameters:	max_steps (int) – The maximum search depth.

class ast_toolbox.mcts.AdaptiveStressTesting.ASTState(t_index, parent, action)

Bases: object

The AST state.

Parameters:
	t_index (int) – The index of the timestep.
	parent (`ast_toolbox.mcts.AdaptiveStressTesting.ASTState`) – The parent state.
	action (`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`) – The action leading to this state.

class ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest(p, env, top_paths)

Bases: object

The AST wrapper for MCTS using the actions in env.action_space.

Parameters:
	p (`ast_toolbox.mcts.AdaptiveStressTesting.ASTParams`) – The AST parameters.
	env (`ast_toolbox.envs.go_explore_ast_env.GoExploreASTEnv`) – The environment.
	top_paths (`ast_toolbox.mcts.BoundedPriorityQueues`, optional) – The bounded priority queue used to store the top-rewarded trajectories.

explore_action(s, tree)

Randomly sample an action for the exploration.

Parameters:
	s (`ast_toolbox.mcts.AdaptiveStressTesting.ASTState`) – The current state.
	tree (dict) – The search tree.
Returns:	action (`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`) – The sampled action.

get_reward()

Get the current AST reward.

Returns:	reward (float) – The AST reward.

initialize()

Initialize training variables.

Returns:	env_reset – The reset result from the env.

isterminal()

Check whether the current path is finished.

Returns:	isterminal (bool) – Whether the current path is finished.

random_action()

Randomly sample an action for the rollout.

Returns:	action (`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`) – The sampled action.

transition_model()

Generate the transition model used in MCTS.

Returns:	transition_model (`ast_toolbox.mcts.MDP.TransitionModel`) – The transition model.

update(action)

Update the environment as well as the associated parameters.

Parameters:	action (`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`) – The AST action.
Returns:
	obs (`numpy.ndarray`) – The observation from the env step.
	reward (float) – The reward from the env step.
	done (bool) – The terminal indicator from the env step.
	info (dict) – The env info from the env step.
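To illustrate how `initialize`, `update`, `isterminal`, `get_reward`, and `random_action` might fit together in a single rollout, here is a rough sketch built on toy stand-ins. `ToyEnv` and `ToyStressTest` are hypothetical classes written for this example; they only mirror the method names documented above and make no claim about how the library implements them. The sketch assumes a gym-style env whose `step` returns `(obs, reward, done, info)`.

```python
import random


class ToyEnv:
    """Hypothetical gym-style env: the episode ends after 3 steps."""

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = -abs(action)      # toy reward, never positive
        done = self.t >= 3
        return obs, reward, done, {}


class ToyStressTest:
    """Hypothetical wrapper mirroring the documented method names."""

    def __init__(self, env, max_steps):
        self.env = env
        self.max_steps = max_steps

    def initialize(self):
        # Reset the bookkeeping and the env; return the env's reset result.
        self.step_count = 0
        self._done = False
        self._reward = 0.0
        return self.env.reset()

    def update(self, action):
        # Step the env and cache the values the other methods report.
        self.step_count += 1
        obs, reward, done, info = self.env.step(action)
        self._reward = reward
        self._done = done
        return obs, reward, done, info

    def isterminal(self):
        # Finished when the env says so or the depth limit is reached.
        return self._done or self.step_count >= self.max_steps

    def get_reward(self):
        return self._reward

    def random_action(self):
        return random.uniform(-1.0, 1.0)


# One random rollout from a fresh start.
stress_test = ToyStressTest(ToyEnv(), max_steps=10)
stress_test.initialize()
total = 0.0
while not stress_test.isterminal():
    stress_test.update(stress_test.random_action())
    total += stress_test.get_reward()
```

In this toy setup the env terminates before the depth limit, so the loop runs three times; with a smaller `max_steps` the depth limit would end the rollout first.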

ast_toolbox.mcts.AdaptiveStressTesting.get_action_sequence(s)

Get the action sequence that leads to the state.

Parameters:	s (`ast_toolbox.mcts.AdaptiveStressTesting.ASTState`) – The target state.
Returns:	actions (list[`ast_toolbox.mcts.AdaptiveStressTesting.ASTAction`]) – The action sequence leading to the target state.
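Because each state records its parent and the action that led to it, the action sequence can in principle be recovered by walking the parent links back to the root. The sketch below shows that idea with hypothetical stand-ins (`State`, `action_sequence`). It assumes the root state has `parent=None`, that the root's own action is not part of the path, and that the result is ordered from root to target; the library function may differ on any of these points.

```python
class State:
    """Hypothetical stand-in with the same fields as the documented state."""

    def __init__(self, t_index, parent, action):
        self.t_index = t_index
        self.parent = parent
        self.action = action


def action_sequence(s):
    """Collect the actions on the path from the root to state `s`."""
    actions = []
    # Walk up the parent links; the root (parent is None) contributes nothing.
    while s.parent is not None:
        actions.append(s.action)
        s = s.parent
    # Actions were gathered target-to-root, so reverse them.
    actions.reverse()
    return actions


# A three-state chain: root -> s1 -> s2.
root = State(0, None, None)
s1 = State(1, root, "a0")
s2 = State(2, s1, "a1")
path = action_sequence(s2)
```

Here `path` holds the two actions in the order they were taken, and calling `action_sequence(root)` yields an empty list.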