ast_toolbox.rewards.ast_reward module

class ast_toolbox.rewards.ast_reward.ASTReward

Bases: object

Class that calculates the reward for each timestep when optimizing AST solver policies.

give_reward(action, **kwargs)

Returns the reward for a given time step.

Parameters:
  • action (array_like) – Action taken by the AST solver.
  • kwargs – Accepts relevant info for computing the reward.
Returns:
  • reward (float) – Reward based on the previous action.
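The interface above can be illustrated with a minimal sketch of a custom reward. Everything here beyond the documented give_reward(action, **kwargs) signature is an assumption made for illustration: the ASTReward class below is a local stand-in rather than the library's implementation, ActionMagnitudeReward is a hypothetical subclass, and the info keyword with its is_goal and is_terminal keys is an invented convention for passing extra data through kwargs.

```python
class ASTReward:
    """Local stand-in that mirrors the documented interface (not the library code)."""

    def give_reward(self, action, **kwargs):
        raise NotImplementedError


class ActionMagnitudeReward(ASTReward):
    """Hypothetical reward: penalize large actions, with fixed terminal values."""

    def __init__(self, goal_reward=0.0, terminal_penalty=-1000.0):
        self.goal_reward = goal_reward
        self.terminal_penalty = terminal_penalty

    def give_reward(self, action, **kwargs):
        # 'info' and its keys are assumptions; callers may pass anything via kwargs.
        info = kwargs.get("info", {})
        if info.get("is_goal", False):
            return float(self.goal_reward)
        if info.get("is_terminal", False):
            return float(self.terminal_penalty)
        # Otherwise return the negative squared magnitude of a flat action sequence.
        return float(-sum(float(a) ** 2 for a in action))


reward_fn = ActionMagnitudeReward()
step_reward = reward_fn.give_reward([1.0, 2.0])
goal_reward = reward_fn.give_reward([1.0], info={"is_goal": True})
```

Routing the extra data through kwargs keeps the method signature identical to the documented one, so a subclass like this could be swapped in wherever a reward object with give_reward is expected.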