ast_toolbox.mcts.MDP module
class ast_toolbox.mcts.MDP.TransitionModel(getInitialState, getNextState, isEndState, maxSteps, goToState)
Bases: object
The wrapper for the transition model used in the tree search.
Parameters:
- getInitialState (function) – getInitialState() returns the initial AST state.
- getNextState (function) – getNextState(s, a) returns the next state and the reward.
- isEndState (function) – isEndState(s) returns whether s is a terminal state.
- maxSteps (int) – The maximum path length.
- goToState (function) – goToState(s) sets the simulator to the target state s.
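As a rough illustration of how the five arguments fit together, the sketch below wires a toy simulator into a stand-in class. `TransitionModelSketch` and `CounterSim` are hypothetical names introduced here, not part of ast_toolbox, and storing each callable as a same-named attribute is an assumption about the wrapper, not something this page confirms.

```python
class TransitionModelSketch:
    """Stand-in taking the same five constructor arguments as documented above.

    Assumption: each argument is stored as an attribute of the same name.
    """

    def __init__(self, getInitialState, getNextState, isEndState, maxSteps, goToState):
        self.getInitialState = getInitialState
        self.getNextState = getNextState
        self.isEndState = isEndState
        self.maxSteps = maxSteps
        self.goToState = goToState


class CounterSim:
    """Hypothetical toy simulator whose state is an integer counter."""

    def __init__(self):
        self.state = 0

    def get_initial_state(self):
        # Reset the simulator and return the initial state.
        self.state = 0
        return self.state

    def get_next_state(self, s, a):
        # Return (next state, reward), as getNextState(s, a) is documented to do.
        self.state = s + a
        return self.state, -1.0

    def is_end_state(self, s):
        # The toy episode terminates once the counter reaches 3.
        return s >= 3

    def go_to_state(self, s):
        # Set the simulator to the target state s.
        self.state = s
        return s


sim = CounterSim()
model = TransitionModelSketch(
    getInitialState=sim.get_initial_state,
    getNextState=sim.get_next_state,
    isEndState=sim.is_end_state,
    maxSteps=10,
    goToState=sim.go_to_state,
)
```

With the real class, the same keyword arguments would presumably be passed to `ast_toolbox.mcts.MDP.TransitionModel` instead of the stand-in.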
ast_toolbox.mcts.MDP.simulate(model, p, policy, verbose=False, sleeptime=0.0)
Simulate the environment model using the policy and the parameter p.
Parameters:
- model (ast_toolbox.mcts.MDP.TransitionModel) – The environment model.
- p – The extra parameters needed by the policy.
- policy (function) – policy(p, s) returns the next action.
- verbose (bool, optional) – Whether to log simulation information.
- sleeptime (float, optional) – The pause time between each step.
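The rollout these parameters describe can be sketched as a simple loop. This is a minimal illustration, not the library's implementation: `simulate_sketch` and `toy_model` are hypothetical names, the model is assumed to expose its callables as attributes, and the sketch returns only the cumulative reward and the action sequence.

```python
import time
from types import SimpleNamespace


def simulate_sketch(model, p, policy, verbose=False, sleeptime=0.0):
    """Roll out `policy` on `model` for at most model.maxSteps steps."""
    cum_reward = 0.0
    actions = []
    s = model.getInitialState()
    for step in range(model.maxSteps):
        if model.isEndState(s):
            break
        a = policy(p, s)                 # policy(p, s) returns the next action
        actions.append(a)
        s, r = model.getNextState(s, a)  # next state and reward
        cum_reward += r
        if verbose:
            print(f"step {step}: action={a}, reward={r}, state={s}")
        if sleeptime > 0:
            time.sleep(sleeptime)        # pause between steps
    return cum_reward, actions


# Hypothetical toy model: an integer counter that terminates at 3.
toy_model = SimpleNamespace(
    getInitialState=lambda: 0,
    getNextState=lambda s, a: (s + a, -1.0),
    isEndState=lambda s: s >= 3,
    maxSteps=10,
)

# The policy ignores the state and always returns the step size stored in p.
cum_reward, actions = simulate_sketch(
    toy_model, p={"step": 1}, policy=lambda p, s: p["step"]
)
```

On this toy model the rollout takes three steps, so `cum_reward` is `-3.0` and `actions` is `[1, 1, 1]`.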
Returns:
- cum_reward (float) – The cumulative reward.
- actions (list) – The action sequence of the path.
- model (