ast_toolbox.mcts.AdaptiveStressTesting module
-
class ast_toolbox.mcts.AdaptiveStressTesting.ASTParams(max_steps, log_interval, log_tabular, log_dir=None, n_itr=100)
Bases: object
Structure that stores internal parameters for AST.
Parameters: max_steps (int, optional) – The maximum search depth.
-
class ast_toolbox.mcts.AdaptiveStressTesting.ASTState(t_index, parent, action)
Bases: object
The AST state.
Parameters:
- t_index (int) – The index of the timestep.
- parent (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The parent state.
- action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The action leading to this state.
-
class ast_toolbox.mcts.AdaptiveStressTesting.AdaptiveStressTest(p, env, top_paths)
Bases: object
The AST wrapper for MCTS using the actions in env.action_space.
Parameters:
- p (ast_toolbox.mcts.AdaptiveStressTesting.ASTParams) – The AST parameters.
- env (ast_toolbox.envs.go_explore_ast_env.GoExploreASTEnv) – The environment.
- top_paths (ast_toolbox.mcts.BoundedPriorityQueues, optional) – The bounded priority queue to store top-rewarded trajectories.
-
explore_action(s, tree)
Randomly sample an action for exploration.
Parameters:
- s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The current state.
- tree (dict) – The search tree.
Returns: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The sampled action.
-
initialize()
Initialize training variables.
Returns: env_reset – The reset result from the env.
-
isterminal()
Check whether the current path is finished.
Returns: isterminal (bool) – Whether the current path is finished.
-
random_action()
Randomly sample an action for the rollout.
Returns: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The sampled action.
-
transition_model()
Generate the transition model used in MCTS.
Returns: transition_model (ast_toolbox.mcts.MDP.TransitionModel) – The transition model.
-
update(action)
Update the environment as well as the associated parameters.
Parameters: action (ast_toolbox.mcts.AdaptiveStressTesting.ASTAction) – The AST action.
Returns:
- obs (numpy.ndarray) – The observation from the env step.
- reward (float) – The reward from the env step.
- done (bool) – The terminal indicator from the env step.
- info (dict) – The env info from the env step.
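As an illustration of how random_action(), update() and isterminal() might be combined, the following minimal sketch runs one random rollout. ToyAST is a hypothetical stand-in written only for this example: it mirrors the documented method names and the four documented return values of update(), but it is not the ast_toolbox implementation. Integer actions and a plain list observation are assumptions that replace ASTAction and numpy.ndarray to keep the sketch free of dependencies.

```python
import random


class ToyAST:
    """Hypothetical stand-in that mirrors the documented method names only."""

    def __init__(self, max_steps, seed=0):
        self.max_steps = max_steps      # plays the role of ASTParams.max_steps
        self.rng = random.Random(seed)
        self.t_index = 0
        self.done = False

    def initialize(self):
        # Reset the per-path bookkeeping and return a dummy reset result.
        self.t_index = 0
        self.done = False
        return [0.0]

    def random_action(self):
        # The documented method returns an ASTAction; an int stands in here.
        return self.rng.randint(0, 3)

    def update(self, action):
        # Advance one step and return the documented (obs, reward, done, info).
        self.t_index += 1
        obs = [float(action)]
        reward = -float(action)         # made-up reward, for illustration only
        self.done = self.t_index >= self.max_steps
        info = {"t_index": self.t_index}
        return obs, reward, self.done, info

    def isterminal(self):
        return self.done


def rollout(ast):
    """Sample random actions until the current path is finished."""
    ast.initialize()
    total_reward, actions = 0.0, []
    while not ast.isterminal():
        action = ast.random_action()
        obs, reward, done, info = ast.update(action)
        total_reward += reward
        actions.append(action)
    return actions, total_reward


actions, total_reward = rollout(ToyAST(max_steps=5))
```

In this toy version the path ends once max_steps updates have been applied; the termination rule used by the real class may differ.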
-
ast_toolbox.mcts.AdaptiveStressTesting.get_action_sequence(s)
Get the action sequence that leads to the state.
Parameters: s (ast_toolbox.mcts.AdaptiveStressTesting.ASTState) – The target state.
Returns: actions (list[ast_toolbox.mcts.AdaptiveStressTesting.ASTAction]) – The action sequence leading to the target state.
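Because each ASTState records its parent and the action that led to it, an action sequence can in principle be recovered by walking the parent links back to the root. The sketch below shows that idea with a hypothetical SketchState dataclass and a hypothetical sketch_action_sequence helper. It assumes the root state has parent None and contributes no action, and that the result is ordered from root to target; the library's own function may handle either detail differently.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SketchState:
    """Hypothetical stand-in for ASTState with the three documented fields."""
    t_index: int
    parent: Optional["SketchState"]
    action: Any


def sketch_action_sequence(s: SketchState) -> List[Any]:
    """Collect the actions from the root to s by following parent links."""
    actions = []
    while s.parent is not None:   # assumption: the root state has no parent
        actions.append(s.action)
        s = s.parent
    actions.reverse()             # collected target-to-root, so flip the order
    return actions


root = SketchState(t_index=0, parent=None, action=None)
s1 = SketchState(t_index=1, parent=root, action="a0")
s2 = SketchState(t_index=2, parent=s1, action="a1")
print(sketch_action_sequence(s2))  # prints ['a0', 'a1']
```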