Train and Test Policies in OpenAI Gym Environments

This week I played around with OpenAI Gym. Specifically, I explored most of the environments in Gym, tested a random policy and a deterministic heuristic policy, trained a policy with PPO from Spinning Up (spinup), and rendered the resulting policy as videos. I can now use existing RL algorithm implementations to solve tasks and present the results. Next week, I will delve deeper into those algorithms and implement some of my own. See my Colab Python notebook for the results.
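To give a feel for the random-versus-heuristic comparison, here is a minimal sketch of the evaluation loop. It does not import Gym: `ToyBalanceEnv` is a hypothetical stand-in I made up for illustration, and it only mimics the classic Gym interface, where `reset()` returns an observation and `step()` returns `(obs, reward, done, info)`. The policy names and the `run_episode` helper are also assumptions, not code from the notebook.

```python
import random


class ToyBalanceEnv:
    """Hypothetical toy task (not part of Gym) with a classic Gym-style API.

    The state is a single number x. Action 1 pushes it right, action 0 pushes
    it left. The episode ends when |x| > 1 or after max_steps steps.
    """

    def __init__(self, seed=0, max_steps=50):
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        self.x = self.rng.uniform(-0.5, 0.5)
        self.t = 0
        return self.x

    def step(self, action):
        push = 0.2 if action == 1 else -0.2
        self.x += push + self.rng.uniform(-0.05, 0.05)  # small random disturbance
        self.t += 1
        done = abs(self.x) > 1.0 or self.t >= self.max_steps
        return self.x, 1.0, done, {}  # reward of 1 for every step survived


def make_random_policy(seed=0):
    """Pick an action uniformly at random (with real Gym: env.action_space.sample())."""
    rng = random.Random(seed)
    return lambda obs: rng.choice([0, 1])


def heuristic_policy(obs):
    """Deterministic rule: always push the state back toward zero."""
    return 1 if obs < 0 else 0


def run_episode(env, policy):
    """Roll out one episode and return the undiscounted sum of rewards."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


def average_return(policy, episodes=20):
    env = ToyBalanceEnv(seed=1)
    return sum(run_episode(env, policy) for _ in range(episodes)) / episodes


random_return = average_return(make_random_policy(seed=2))
heuristic_return = average_return(heuristic_policy)
print("random policy   :", random_return)
print("heuristic policy:", heuristic_return)
```

With a real Gym environment, the same `run_episode` loop should work by passing in an environment created with `gym.make(...)`, provided the installed version uses the classic four-value `step()` return. Newer releases return five values from `step()` and a tuple from `reset()`, so the loop would need a small adjustment there.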