ast_toolbox.mcts.AdaptiveStressTesting module

class ast_toolbox.mcts.AdaptiveStressTesting.ASTParams(max_steps, log_interval, log_tabular, log_dir=None, n_itr=100)
Bases: object
Structure that stores internal parameters for AST.
Parameters: max_steps (int, optional) – The maximum search depth.

class ast_toolbox.mcts.AdaptiveStressTesting.ASTState(t_index, parent, action)
Bases: object
The AST state.
Parameters:
- t_index (int) – The index of the timestep.
- parent (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The parent state.
- action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The action leading to this state.

class ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest(p, env, top_paths)
Bases: object
The AST wrapper for MCTS using the actions in env.action_space.
Parameters:
- p (ast_toolbox.mcts.AdaptiveStressTesting.ASTParams) – The AST parameters.
- env (ast_toolbox.envs.go_explore_ast_env.GoExploreASTEnv) – The environment.
- top_paths (ast_toolbox.mcts.BoundedPriorityQueues, optional) – The bounded priority queue to store top-rewarded trajectories.
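To illustrate what a bounded priority queue of top-rewarded trajectories does, here is a minimal sketch built on the standard-library heapq module. The class TopKPaths and its methods push and best_first are hypothetical stand-ins invented for this example; they are not the interface of ast_toolbox.mcts.BoundedPriorityQueues, which is not documented in this section.

```python
import heapq
import itertools


class TopKPaths:
    """Hypothetical stand-in: keeps only the k highest-reward action sequences."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap, so the worst kept entry sits at index 0
        self._tie = itertools.count()  # tie-breaker so paths are never compared

    def push(self, reward, path):
        entry = (reward, next(self._tie), list(path))
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        elif reward > self._heap[0][0]:
            # The queue is full: evict the current worst entry.
            heapq.heapreplace(self._heap, entry)

    def best_first(self):
        # Highest reward first; earlier insertions win ties.
        ordered = sorted(self._heap, key=lambda e: (-e[0], e[1]))
        return [(reward, path) for reward, _, path in ordered]


queue = TopKPaths(k=2)
queue.push(-5.0, ["a", "b"])
queue.push(-1.0, ["a", "c"])
queue.push(-3.0, ["b", "c"])
top = queue.best_first()
```

With a bound of two, the lowest-reward path is dropped once a third, better path arrives, so memory stays fixed no matter how many trajectories the search visits.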

explore_action(s, tree)
Randomly sample an action for exploration.
Parameters:
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
- tree (dict) – The search tree.
Returns: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The sampled action.

initialize()
Initialize training variables.
Returns: env_reset – The reset result from the env.

isterminal()
Check whether the current path is finished.
Returns: isterminal (bool) – Whether the current path is finished.

random_action()
Randomly sample an action for the rollout.
Returns: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The sampled action.

transition_model()
Generate the transition model used in MCTS.
Returns: transition_model (ast_toolbox.mcts.MDP.TransitionModel) – The transition model.

update(action)
Update the environment and its associated parameters.
Parameters: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The AST action.
Returns:
- obs (numpy.ndarray) – The observation from the env step.
- reward (float) – The reward from the env step.
- done (bool) – The terminal indicator from the env step.
- info (dict) – The env info from the env step.
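The methods above suggest a simple rollout loop: reset with initialize, then alternate random_action and update until isterminal reports that the path is finished. The sketch below shows that control flow only. ToyAST is a hypothetical stand-in written for this example, with a made-up reward and scalar actions; it is not the real AdaptiveStressTest class, whose internals may differ.

```python
import random


class ToyAST:
    """Hypothetical stand-in exposing the method names documented above."""

    def __init__(self, max_steps, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.t = 0

    def initialize(self):
        self.t = 0
        return 0.0  # stand-in for the env reset result

    def random_action(self):
        return self.rng.uniform(-1.0, 1.0)

    def update(self, action):
        # Mirrors the documented return shape: (obs, reward, done, info).
        self.t += 1
        obs = action
        reward = -abs(action)  # made-up reward, for illustration only
        done = self.t >= self.max_steps
        return obs, reward, done, {"t": self.t}

    def isterminal(self):
        return self.t >= self.max_steps


def rollout(ast):
    """Run one random rollout and return its actions and total reward."""
    ast.initialize()
    actions, total_reward = [], 0.0
    while not ast.isterminal():
        action = ast.random_action()
        obs, reward, done, info = ast.update(action)
        actions.append(action)
        total_reward += reward
    return actions, total_reward


actions, total_reward = rollout(ToyAST(max_steps=5))
```

In this toy version the path ends only when the step budget is exhausted; a real environment may also end a path early through the done flag.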

ast_toolbox.mcts.AdaptiveStressTesting.get_action_sequence(s)
Get the action sequence that leads to the state.
Parameters: s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The target state.
Returns: actions (list[ast_toolbox.mcts.AdaptiveStressTesting.ASTAction]) – The sequence of actions leading to the target state.
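Because each state records its parent and the action that led to it, the action sequence can be recovered by walking the parent links back to the root. The following is a minimal sketch of that idea, not the library's implementation. State and action_sequence are hypothetical names, and the sketch assumes that the root state has parent None and that actions are returned earliest first.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class State:
    """Hypothetical stand-in mirroring ASTState(t_index, parent, action)."""
    t_index: int
    parent: Optional["State"]
    action: Any


def action_sequence(s: State) -> List[Any]:
    """Collect actions from s back to the root, then put them in forward order."""
    actions = []
    while s.parent is not None:  # assumption: the root has no parent
        actions.append(s.action)
        s = s.parent
    actions.reverse()
    return actions


root = State(t_index=0, parent=None, action=None)
s1 = State(t_index=1, parent=root, action="a0")
s2 = State(t_index=2, parent=s1, action="a1")
sequence = action_sequence(s2)
```

Here sequence holds the two actions in the order they were taken, which is the order needed to replay the path from the initial state.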