zoo.examples.ray.rl_pong package
Submodules
zoo.examples.ray.rl_pong.rl_pong module
zoo.examples.ray.rl_pong.rl_pong.discount_rewards(r)
    Take a 1D float array of rewards and compute the discounted rewards.
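The docstring does not state the discount factor or how point boundaries are handled. The following is a minimal sketch, not the module's code. It assumes a discount factor `gamma` of 0.99 and, as is common in Pong policy-gradient examples, resets the running sum whenever a nonzero reward marks the end of a point. The name `discount_rewards_sketch` and the `gamma` parameter are hypothetical.

```python
import numpy as np


def discount_rewards_sketch(r, gamma=0.99):
    """Hypothetical illustration of discounting a 1D reward array.

    Assumptions (not confirmed by the documented module):
    gamma defaults to 0.99, and a nonzero reward ends a point,
    so the running sum is reset there.
    """
    r = np.asarray(r, dtype=float)
    discounted = np.zeros_like(r)
    running_sum = 0.0
    # Walk backwards so each step accumulates the rewards that follow it.
    for t in reversed(range(r.size)):
        if r[t] != 0:
            # Assumed Pong-specific rule: a nonzero reward is a point boundary.
            running_sum = 0.0
        running_sum = running_sum * gamma + r[t]
        discounted[t] = running_sum
    return discounted


example = discount_rewards_sketch([0.0, 0.0, 1.0], gamma=0.5)
print(example)
```

With `gamma=0.5`, the rewards `[0, 0, 1]` become `[0.25, 0.5, 1.0]`: the final reward is halved once for each step that precedes it.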
zoo.examples.ray.rl_pong.rl_pong.policy_backward(eph, epx, epdlogp, model)
    Backward pass (eph is the array of intermediate hidden states).
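The signature alone does not describe the network, so the following is only a sketch under stated assumptions: a two-layer policy with a ReLU hidden layer, `model` as a dict holding `"W1"` of shape (H, D) and `"W2"` of shape (H,), `epx` as the stacked inputs of shape (T, D), `eph` as the hidden activations of shape (T, H), and `epdlogp` as the per-step gradient signal of shape (T, 1). The name `policy_backward_sketch` is hypothetical, and the documented function may differ.

```python
import numpy as np


def policy_backward_sketch(eph, epx, epdlogp, model):
    """Hypothetical backward pass for an assumed two-layer ReLU policy."""
    # Gradient for the output weights: sum over time of hidden state * signal.
    dW2 = np.dot(eph.T, epdlogp).ravel()
    # Propagate the signal back to the hidden layer.
    dh = np.outer(epdlogp, model["W2"])
    # Backprop through ReLU: units that were inactive pass no gradient.
    dh[eph <= 0] = 0
    # Gradient for the input weights.
    dW1 = np.dot(dh.T, epx)
    return {"W1": dW1, "W2": dW2}


# Tiny example with T=2 steps, H=2 hidden units, D=3 inputs.
eph = np.array([[1.0, 0.0], [0.0, 2.0]])
epx = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
epdlogp = np.array([[0.5], [-1.0]])
model = {"W1": np.zeros((2, 3)), "W2": np.array([1.0, 2.0])}
grads = policy_backward_sketch(eph, epx, epdlogp, model)
print(grads["W1"].shape, grads["W2"].shape)
```

Under these assumptions, the returned gradients have the same shapes as the corresponding weights, so they can be accumulated across episodes and applied by any gradient-based optimizer.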