ast_toolbox.rewards.example_av_reward module

An example implementation of an ASTReward for an AV validation scenario.

class ast_toolbox.rewards.example_av_reward.ExampleAVReward(num_peds=1, cov_x=0.1, cov_y=0.01, cov_sensor_noise=0.1, use_heuristic=True)

Bases: ast_toolbox.rewards.ast_reward.ASTReward

An example implementation of an ASTReward for an AV validation scenario.

Parameters:
  • num_peds (int) – The number of pedestrians in the scenario.
  • cov_x (float) – Covariance of the x-acceleration.
  • cov_y (float) – Covariance of the y-acceleration.
  • cov_sensor_noise (float) – Covariance of the sensor noise.
  • use_heuristic (bool) – Whether to include a heuristic in the reward based on how close the pedestrian is to the vehicle at the end of the trajectory.

give_reward(action, **kwargs)

Returns the reward for a given time step.

Parameters:
  • action (array_like) – Action taken by the AST solver.
  • kwargs – Accepts relevant info for computing the reward.
Returns:

reward (float) – Reward based on the previous action.
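
As a rough illustration of how a per-step reward of this kind might be structured, the sketch below is a standalone function and not the toolbox's implementation. It assumes the three-way split commonly described for adaptive stress testing: zero reward when a failure is found, a large penalty when the horizon is reached without one, and otherwise a penalty proportional to how unlikely the action is. The function name, its arguments, and both penalty constants are hypothetical and were chosen only for the example.

```python
def sketch_step_reward(action_distance, is_goal, is_terminal,
                       closest_ped_distance=0.0, use_heuristic=True,
                       horizon_penalty=1.0e5, heuristic_weight=1.0e4):
    """Hypothetical AST-style step reward (illustrative only).

    action_distance      -- distance of the action from the mean action,
                            e.g. a Mahalanobis distance (assumed input)
    is_goal              -- True if the step ended in a failure (collision)
    is_terminal          -- True if the horizon was reached
    closest_ped_distance -- assumed pedestrian-to-vehicle distance at the end
    """
    if is_goal:
        # A failure was found: no penalty.
        return 0.0
    if is_terminal:
        # Horizon reached without a failure: large fixed penalty, plus an
        # optional term that grows with the final pedestrian distance.
        heuristic = closest_ped_distance if use_heuristic else 0.0
        return -horizon_penalty - heuristic_weight * heuristic
    # Ordinary step: penalize unlikely actions.
    return -action_distance
```

With the heuristic enabled, trajectories that end with the pedestrian closer to the vehicle receive a smaller penalty, which gives the solver a gradient toward failures even when none is found.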

mahalanobis_d(action)

Calculate the Mahalanobis distance [1] between the action and the mean action.

Parameters:
  • action (array_like) – Action taken by the AST solver.
Returns:

float – The Mahalanobis distance between the action and the mean action.
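
For a diagonal covariance matrix, the Mahalanobis distance reduces to the square root of the sum of squared deviations, each divided by its variance. The following is a minimal NumPy sketch of that calculation, not the method's actual source. The helper name `mahalanobis_distance`, the zero mean action, and the layout of six action entries per pedestrian (x-acceleration, y-acceleration, then four sensor-noise terms) are assumptions made for illustration.

```python
import numpy as np


def mahalanobis_distance(action, mean, cov_diag):
    """Mahalanobis distance for a diagonal covariance matrix.

    cov_diag holds the diagonal entries (variances) of the covariance.
    """
    action = np.asarray(action, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov_diag = np.asarray(cov_diag, dtype=float)
    diff = action - mean
    # Equivalent to sqrt(diff^T * inv(diag(cov_diag)) * diff).
    return float(np.sqrt(np.sum(diff ** 2 / cov_diag)))


# Hypothetical layout: 6 action entries per pedestrian, using the
# default parameter values from the class signature.
num_peds = 1
cov_x, cov_y, cov_sensor_noise = 0.1, 0.01, 0.1
cov_diag = np.tile([cov_x, cov_y] + [cov_sensor_noise] * 4, num_peds)
mean = np.zeros_like(cov_diag)  # assumed zero mean action

d_zero = mahalanobis_distance(np.zeros(6 * num_peds), mean, cov_diag)
d_x = mahalanobis_distance([0.1, 0.0, 0.0, 0.0, 0.0, 0.0], mean, cov_diag)
```

An action equal to the mean has distance zero, and a deviation in a low-variance component (here `cov_y`) contributes more to the distance than the same deviation in a high-variance one.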

References

[1] Mahalanobis, Prasanta Chandra. “On the generalized distance in statistics.” National Institute of Science of India, 1936. http://library.isical.ac.in:8080/jspui/bitstream/123456789/6765/1/Vol02_1936_1_Art05-pcm.pdf