Visual reasoning for robotic manipulation