ast_toolbox.rewards.ast_reward module

class ast_toolbox.rewards.ast_reward.ASTReward

Bases: object

Class that calculates the reward at each timestep when optimizing AST solver policies.

give_reward(action, **kwargs)

Returns the reward for a given time step.

Parameters:
	action (array_like) – Action taken by the AST solver.
	kwargs – Accepts relevant info for computing the reward.
Returns:	reward (float) – Reward based on the previous action.
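
The give_reward(action, **kwargs) signature above can be illustrated with a minimal sketch of a subclass. Everything below is an assumption made for illustration, not the toolbox's own implementation: the ASTReward class is a local stand-in so the snippet runs without the package installed, ToyGoalReward and miss_penalty are hypothetical names, and the is_goal and is_terminal keywords are hypothetical examples of the "relevant info" a caller might pass through kwargs.

```python
class ASTReward:
    """Local stand-in for ast_toolbox.rewards.ast_reward.ASTReward (assumed interface)."""

    def give_reward(self, action, **kwargs):
        raise NotImplementedError


class ToyGoalReward(ASTReward):
    """Hypothetical reward: zero on reaching the goal, a fixed penalty when the
    episode ends without reaching it, otherwise a penalty on the action size."""

    def __init__(self, miss_penalty=-100.0):
        self.miss_penalty = miss_penalty

    def give_reward(self, action, **kwargs):
        # `is_goal` and `is_terminal` are hypothetical keyword names chosen for
        # this sketch; the documented interface only promises **kwargs.
        is_goal = kwargs.get("is_goal", False)
        is_terminal = kwargs.get("is_terminal", False)

        if is_goal:
            return 0.0
        if is_terminal:
            return float(self.miss_penalty)

        # Treat `action` as a flat sequence of numbers and penalize its
        # squared magnitude.
        return float(-sum(a * a for a in action))


reward_fn = ToyGoalReward()
step_reward = reward_fn.give_reward([0.5, -1.0])
goal_reward = reward_fn.give_reward([0.5, -1.0], is_goal=True)
miss_reward = reward_fn.give_reward([0.5, -1.0], is_terminal=True)
```

Passing the extra information as keyword arguments keeps the method signature identical across reward definitions, so each subclass can read only the keys it needs and ignore the rest.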