Replaced policy gradient algorithm with SCAN 2D