A method of collaborative decision making using action suggestions by using the agent's policy to estimate the distribution over suggestions and treating a suggested action as an observation of the environment to update the agent's belief. - View it on GitHub
Star
10
Rank
1287664