Fork of https://github.com/DartML/PPO-Stein-Control-Variate with modifications for the paper "The Mirage of Action-Dependent Baselines in Reinforcement Learning".