Testing our architectures
We tested our proposed architectures on multiple tasks, including the puzzle game Sokoban and a spaceship navigation game. Both games require forward planning and reasoning, making them the perfect environment to test our agents’ abilities.
-
In the spaceship task, the agent must stabilise a craft by activating its thrusters a fixed number of times. It must contend with the gravitational pull of several planets, making it a highly nonlinear complex continuous control task.
To limit trial-and-error for both tasks, each level is procedurally generated and the agent can only try it once; this encourages the agent to try different strategies ‘in its head’ before testing them in the real environment.
Source: https://deepmind.com/blog/article/agents-imagine-and-plan