A method of collaborative decision making using action suggestions by using the agent's policy to estimate the distribution over suggestions and treating a suggested action as an observation of the environment to update the agent's belief. -
View it on GitHub